CAPTURING & PROCESSING REAL-TIME DATA ON AWS
- Beryl Houston
Transcription
1 CAPTURING & PROCESSING REAL-TIME DATA ON AWS © 2015 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
2 Agenda: Real-Time Analytics; Data Ingestion; Data Processing (Architecture, AWS Lambda); Customer Implementations
3 Real-Time Analytics. Real-time Ingest: Highly Scalable; Durable; Elastic; Replay-able Reads. Continuous Processing FX: Load-balancing incoming streams; Fault-tolerance, Checkpoint / Replay; Elastic; Enable multiple apps to process in parallel. Continuous, real-time workloads; Low end-to-end latency; Continuous data flow.
4 Data Ingestion
5 Starting simple... foo-analysis.com Global top-10
6 Distributing the workload Elastic Beanstalk foo-analysis.com Global top-10
7 Or using an Elastic Data Broker Local top-10 Local top-10 Local top-10 Elastic Beanstalk foo-analysis.com Global top-10
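The local top-10 / global top-10 split on slides 5 to 7 can be sketched as follows. This is a minimal illustration, not the presenter's code: the function names and the CRC32-based routing are assumptions. The merge is exact only because every key is routed to exactly one worker, so a globally popular key is never split across local counts.

```python
from collections import Counter
from zlib import crc32


def route(key: str, num_workers: int) -> int:
    # Stable hash so the same key always reaches the same worker (assumed routing rule).
    return crc32(key.encode("utf-8")) % num_workers


def local_top_n(events, n=10):
    # Each worker counts only the keys it received.
    return Counter(events).most_common(n)


def global_top_n(local_results, n=10):
    # Merge the per-worker lists; keys are disjoint across workers by construction.
    merged = Counter()
    for result in local_results:
        for key, count in result:
            merged[key] += count
    return merged.most_common(n)


def top_n_distributed(events, num_workers=3, n=10):
    partitions = [[] for _ in range(num_workers)]
    for key in events:
        partitions[route(key, num_workers)].append(key)
    return global_top_n([local_top_n(p, n) for p in partitions], n)
```

If events for one key could land on several workers, the local lists would have to carry more than n entries (or full counts) for the global result to stay exact.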
8 Amazon Kinesis Managed Stream Elastic Beanstalk foo-analysis.com KINESIS Partition Key Worker My top-10 Sequence Number Data Record Global top-10 Data Record Stream Shard
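To make the partition key and shard labels on slide 8 concrete: Kinesis hashes each partition key with MD5 into a 128-bit integer and delivers the record to the shard whose hash key range contains that value. The sketch below simulates this locally. The even split of the hash space and the helper names are assumptions for illustration; real shard ranges come from the stream description.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def hash_key(partition_key: str) -> int:
    # Interpret the MD5 digest of the partition key as one big integer.
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def even_shard_ranges(num_shards: int):
    # Assumption: shards split the hash space evenly; the last shard absorbs the remainder.
    size = HASH_SPACE // num_shards
    ranges = []
    for i in range(num_shards):
        start = i * size
        end = HASH_SPACE - 1 if i == num_shards - 1 else (i + 1) * size - 1
        ranges.append((start, end))
    return ranges


def shard_for(partition_key: str, ranges) -> int:
    # Return the index of the shard whose inclusive range contains the key's hash.
    h = hash_key(partition_key)
    for shard_id, (start, end) in enumerate(ranges):
        if start <= h <= end:
            return shard_id
    raise ValueError("hash key outside all shard ranges")
```

Because the mapping depends only on the key, all records with the same partition key land on the same shard and keep their relative order there.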
9 Amazon Kinesis Common Data Broker Data Sources Data Sources Availability Zone Availability Zone Availability Zone [Data Archive] App. 1 App. 2 S3 Data Sources Data Sources AWS Endpoint Shard 1 Shard 2 Shard N [Metric Extraction] App. 3 [Sliding Window Analysis] DynamoDB Redshift App. 4 Data Sources [Machine Learning] EMR
10 Amazon Kinesis Distributed Streams: From batch to continuous processing. Scale shards elastically UP or DOWN without losing sequencing. Workers can replay records for up to 24 hours. Scale up to GB/sec without losing durability. Records stored across multiple availability zones. Multiple parallel Kinesis Apps output to anything: RDBMS, S3, In-house Data Warehouse, Messaging, another stream, Java SDK, Python SDK, etc.
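The replay and parallel-application points on slide 10 can be illustrated with a toy in-memory model. Everything here is hypothetical: the classes, the integer sequence numbers (real Kinesis sequence numbers are large and not contiguous), and the absence of any retention limit. The idea it shows is that each application keeps its own checkpoint, so a failed worker resumes after its last successfully processed record and a second application can read the same records independently.

```python
class InMemoryShard:
    """Toy stand-in for one shard: an append-only list of (sequence_number, data)."""

    def __init__(self):
        self._records = []
        self._next_seq = 0

    def put(self, data):
        seq = self._next_seq
        self._records.append((seq, data))
        self._next_seq += 1
        return seq

    def read_after(self, checkpoint):
        # checkpoint=None means "start from the oldest retained record".
        start = -1 if checkpoint is None else checkpoint
        return [(s, d) for s, d in self._records if s > start]


class Worker:
    """One consuming application with its own checkpoint."""

    def __init__(self, shard):
        self.shard = shard
        self.checkpoint = None
        self.processed = []

    def run(self, fail_on=None):
        for seq, data in self.shard.read_after(self.checkpoint):
            if data == fail_on:
                raise RuntimeError(f"failed while processing {data!r}")
            self.processed.append(data)
            self.checkpoint = seq  # advance only after successful processing
```

Calling `run()` again after a failure picks up at the record that failed, which is the replay behaviour the slide describes.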
11 Data Processing
12 Emerging Architecture Data Streams Spark Storm KCL Streaming Analytics Notifications & Alerts APIs Dashboards/ visualizations Real Time Micro Batch Data Archive DW Hadoop Batch Analysis Dashboards/ visualizations Deep Learning Batch
13 Real-time: Event-based processing Producer Amazon Kinesis Kinesis Storm Spout Apache Storm ElastiCache (Redis) Node.js Client (D3) http://blogs.aws.amazon.com/bigdata/post/tx36lyscy2r0a9b/implement-a-Real-time-Sliding-Window-Application-Using-Amazon-Kinesis-and-Apache
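The linked post builds a sliding-window dashboard with Storm and Redis. As a rough, single-process sketch of the windowing idea only (not the Storm topology from the post), the class below keeps counts for events seen in the last `window_seconds`. The class name is hypothetical and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._events = deque()   # (timestamp, key), oldest first
        self._counts = Counter()

    def add(self, timestamp, key):
        self._events.append((timestamp, key))
        self._counts[key] += 1
        self._evict(timestamp)

    def _evict(self, now):
        # Drop events that fell out of the window (now - window_seconds, now].
        while self._events and self._events[0][0] <= now - self.window_seconds:
            _, old_key = self._events.popleft()
            self._counts[old_key] -= 1
            if self._counts[old_key] == 0:
                del self._counts[old_key]

    def top(self, n=10):
        return self._counts.most_common(n)
```

In a distributed setup the counts would live in a shared store such as Redis so that a dashboard client can poll them.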
14 Micro-Batches: Drip feeding the data http://blogs.aws.amazon.com/bigdata/post/tx2anln1pgeldju/best-Practices-for-Micro-Batch-Loading-on-Amazon-Redshift
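Micro-batching, as named on slide 14, usually means buffering stream records and flushing them in small groups once a size or age threshold is reached. A minimal sketch, assuming a caller-supplied `flush` callback (for example, one that stages a file for a bulk load into the warehouse); the class name and default thresholds are illustrative and not taken from the linked post.

```python
class MicroBatcher:
    def __init__(self, flush, max_records=500, max_age_seconds=60.0):
        self.flush = flush                  # callback that receives one list of records
        self.max_records = max_records
        self.max_age_seconds = max_age_seconds
        self._buffer = []
        self._oldest = None                 # arrival time of the oldest buffered record

    def add(self, record, now):
        if not self._buffer:
            self._oldest = now
        self._buffer.append(record)
        too_many = len(self._buffer) >= self.max_records
        too_old = now - self._oldest >= self.max_age_seconds
        if too_many or too_old:
            self.flush_now()

    def flush_now(self):
        # Hand the current batch to the callback and start a fresh buffer.
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._oldest = None
            self.flush(batch)
```

The age check only runs when a record arrives, so a production version would also flush on a timer to bound latency during quiet periods.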
15 Offline Batch: Hadoop for discovery Offline Analysis Producer Amazon Kinesis Kinesis Application S3 EMR Ad-hoc Analysis Amazon Kinesis Hive Pig EMR Cascading MapReduce http://blogs.aws.amazon.com/bigdata/post/tx36lyscy2r0a9b/implement-a-Real-time-Sliding-Window-Application-Using-Amazon-Kinesis-and-Apache
16 Putting it together Producer Amazon Kinesis Apache Storm DynamoDB App Client Real Time KCL Redshift BI Tools Micro Batch Batch KCL S3 EMR
17 AWS Lambda: An event-driven computing service for dynamic applications. AWS Lambda functions can be triggered by data stream updates from Amazon Kinesis and Amazon DynamoDB. For instance, you can watch for a pattern, such as an address, and trigger an alert.
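The "watch for a pattern and trigger an alert" example could look roughly like the handler below. It relies on the documented shape of a Kinesis event delivered to Lambda (a `Records` list whose `kinesis.data` field is base64-encoded). The address regex and the `send_alert` placeholder are assumptions for illustration; a real function would publish to a notification service instead of printing.

```python
import base64
import re

# Hypothetical, deliberately loose pattern for a street address such as "410 Terry Avenue".
ADDRESS_PATTERN = re.compile(
    r"\b\d+\s+(?:\w+\s+)+(?:Street|St|Avenue|Ave|Road|Rd)\b", re.IGNORECASE
)


def send_alert(message):
    # Placeholder alert sink (assumption); swap in a real notification call.
    print("ALERT:", message)


def handler(event, context):
    alerts = []
    for record in event.get("Records", []):
        # Kinesis payloads arrive base64-encoded inside the Lambda event.
        payload = base64.b64decode(record["kinesis"]["data"]).decode("utf-8")
        if ADDRESS_PATTERN.search(payload):
            send_alert(payload)
            alerts.append(record["kinesis"]["sequenceNumber"])
    return {"alerts": alerts}
```

Returning the matching sequence numbers is only a convenience for testing; Lambda ignores the return value for stream-triggered invocations.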
18 A focus on functions, data and events S3 event notifications DynamoDB Streams Kinesis events Custom events Cloud functions
19 Putting AWS Lambda to work Server-free back-end Data triggers IoT Stream processing Indexing & synchronization
20 AWS Lambda for reactive computing Photo bucket S3 Extract Metadata Cloud Function Metadata DynamoDB Trending Cloud Function Trending DynamoDB Notify Cloud Function SNS Push notification
21 Processing Events from Kinesis: Write millions of events from Kinesis into Elasticsearch with only 60 lines of code! https://gist.github.com/tylr/e8baf45c07ced23ef013 http://docs.aws.amazon.com/lambda/latest/dg/walkthrough-kinesis-events-adminuser.html
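The linked gist is not reproduced here. As an independent sketch of the core step, the function below turns the records of one Kinesis-triggered Lambda event into a newline-delimited body for the Elasticsearch bulk API. It assumes each record's payload is a JSON document; the index name is hypothetical, and older Elasticsearch versions also expect a `_type` in the action line.

```python
import base64
import json


def to_bulk_body(event, index="events"):
    lines = []
    for record in event.get("Records", []):
        # Assumption: the producer wrote one JSON document per record.
        doc = json.loads(base64.b64decode(record["kinesis"]["data"]))
        # Reusing the Lambda event ID as the document ID makes retries overwrite, not duplicate.
        action = {"index": {"_index": index, "_id": record["eventID"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk API requires a trailing newline after the last line.
    return "\n".join(lines) + "\n" if lines else ""
```

The resulting string would be sent as the body of a single bulk request, so one Lambda invocation indexes a whole batch of stream records at once.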
22 Customer deployments on AWS
23 GREE International re:invent 2014 GAM301 - Real-Time Game Analytics with Amazon Kinesis, Redshift, and DynamoDB Session - Slide: gam301-realtime-game-analytics-with-amazon-kinesisamazon-redshift-and-amazon-dynamodb-awsreinvent-2014
24 Key Requirements for Analytics. Initial Requirements: Data collection & streaming to database; Zero data loss; Zero data corruption; Guaranteed data delivery. New Requirements: Near real-time data latency; Real-time ad-hoc analysis; Ease of adding consumers; Managed Service.
25 Data Collection. Source of Data: Mobile Devices; Game Servers; Ad Networks. Data Sizes: Size of event ~ 1 KB; 500M+ events/day; 500 GB+/day & growing; JSON format.
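The figures on slide 25 translate into a back-of-the-envelope shard count. The sketch below assumes the commonly documented per-shard write limits of about 1 MB/s and 1,000 records/s (check current quotas before relying on them) and a hypothetical `peak_factor` to account for traffic above the daily average.

```python
import math


def shards_needed(events_per_day, event_bytes, peak_factor=1.0,
                  shard_bytes_per_sec=1_000_000, shard_records_per_sec=1000):
    # Average rate over a day, scaled by an assumed peak-to-average ratio.
    events_per_sec = events_per_day / 86_400 * peak_factor
    bytes_per_sec = events_per_sec * event_bytes
    # A shard is limited by both throughput and record count; take the stricter bound.
    by_bytes = math.ceil(bytes_per_sec / shard_bytes_per_sec)
    by_records = math.ceil(events_per_sec / shard_records_per_sec)
    return max(by_bytes, by_records, 1)


print(shards_needed(500_000_000, 1000))                  # average load
print(shards_needed(500_000_000, 1000, peak_factor=3))   # assumed 3x peak
```

At 500M events/day of roughly 1 KB each, the average is about 5,800 events/s, so the average load alone needs 6 shards under these assumptions, and 18 with a 3x peak.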
26 Architecture
27 SocialMetrix re:invent 2014 ARC202: Real-World Real-Time Analytics Session: Slides: real-world-real-time-analytics mhfinaledit
28 Drivers for architecture evolution: More customers, bigger customers; Add new features; Keep costs under control.
29 Requirements at 4th iteration: Monitor millions of social media profiles; Make data accessible (exploration, PoC); Improve UI response times; Testing our data pipelines; Reprocessing (faster).
30 Architecture
31 Cost over Architecture [Chart: Costs, Customers, and Active Customers across architecture iterations #1 to #4]
32 THANK YOU!!!