HDMQ :Towards In-Order and Exactly-Once Delivery using Hierarchical Distributed Message Queues. Dharmit Patel Faraj Khasib Shiva Srivastava
|
|
|
- Harry Stokes
- 10 years ago
- Views:
Transcription
1 HDMQ :Towards In-Order and Exactly-Once Delivery using Hierarchical Distributed Message Queues Dharmit Patel Faraj Khasib Shiva Srivastava
2 Outline What is Distributed Queue Service? Major Queue Service Amazon SQS, Couch RQS, Apache Hedwig, Apache Kafka, RabbitMQ, ActiveMQ. Design and Implementation of HDMQ Operation Overview Refinements Performance Evaluation Conclusion Future Work References
3 How Distributed Queue Service look like? In today s world distributed message queues is used in many systems and play different roles such as content delivery, notification system and message delivery tools
4 Distributed Queue Service Multiple Client are connected to this Service. Each client is flooding the queue with the messages. Each messages is stored on multiple nodes for reliability. Each message is at least delivered once. It is important for the queue services to be able to deliver messages in larger scales, at the same time it must be highly scalable and provide parallel access.
5 Major Distributed Queue Services 1. Amazon SQS (Commercial State of the art) 2. Apache Kafka 3. Apache Hedwig 4. Couch RQS 5. ActiveMQ 6. Rabbit MQ
6 Amazon SQS(Simple Queue Service) Amazon SQS is a distributed, message delivery service, which is highly reliable, scalable, simple and secure. SQS is Distributed over multiple data center. No Single point of Failure Guarantees extremely high availability. Deliver Unlimited number of messages at any time. Size of the message <= 256 KB. Ensures at least 1 delivery of message. Retains messages up to 14 days. Allows batching of messages up to 10 messages or 256 KB message size in total. Comes with a price tag of $0.50 every 1 Million Request. It doesn't guarantee message order. Price tag increases for message size > 64 KB.
7 Apache Kafka Highly Scalable as it is designed to allow single cluster to serve as the central backbone. Each Kafka fiber maintains a partitioned log. Relies heavily on the file system for storing cache messages. Kafka nodes perform load balancing. Uses Asynchronous Message Sending. It replicates its log information for each topic. Supports Batch compression of messages. Bottleneck is Disk speed and File locking mechanism used.
8 Couch-RQS Based on database system, which is called Couch DB. Couch DB is a fast light weigh NOSQL DB. Uses database File to store its information which gives bad performance. It might be faster than any SQL or No-SQL database. Couch RQS cannot run safely in a distributed/replicated environment and cannot scale high, cannot provide high availability.
9 Apache Hedwig Publish-subscribe system designed to carry large amount of data. It is designed with the goal to give guaranteed delivery. Topic based publisher and subscriber. Incremental Scalability and high availability. Client publish message associated with a topic, and they subscribe to a topic to receive all messages published with that topic. Hedwig Instance(region) consist of number of servers called hubs. Hubs partition up topic ownership among themselves.
10 Active MQ Message Oriented Library, which ensures reliability between distributed process. Optimized to avoid overhead with a P2P or Server Client Model for pushing message to the receiver. Uses its own communication Protocol to ensure speed and reliability. Communication between server is using simple message. With each node launch, node launches the server to listen to any incoming messages and handle them. Highly Configurable but its slow and has issue lost/duplicate message. Three kind of scaling available in Active MQ like Default Transport, Horizontal scaling and Partitioning.
11 Rabbit MQ Robust messaging system, open platform and supported a large number of client developer platform. It uses asynchronous messaging for application to connect and scale. Options to tradeoff between performance, reliability, including persistence, delivery acknowledgements, publisher confirms and high availability. Uses mirroring for handling hardware failure. It offers management UI to monitor and control every aspect of message broker. Can report memory usage information for connections, queues, plugins and other processes in memory. Ships in the ready to use state, and can be customized in environment Variables, configuration file, runtime parameters and policies.
12 Design of HDMQ
13 Details for Design Storage Nodes: All the storage in two hierarchical regions, where a sub region consists of ~10 nodes and a router node, the main region consists of multiple sub regions. All the main regions together make up the storage node system. Front End Nodes: These are the nodes that clients interact with and make request to. Each front-end node maintains a local hash-table for that contains updates for Area for each queue ID. Currently we are using 10:1 ratio for number of storage nodes vs. front-end nodes. Queue ID Manager Node: We use one queue ID node in the system that determines the storage region for new queues and generate area (queue ID) for the new nodes
14 Example for Setup For example assume we have 10,000 total storage nodes and x number of front-end nodes. This system will break down the nodes in regions and sub regions down to where each of lowest hierarchy region contain ~ 10 nodes. In this case we can divide 10,000 nodes in 10 regions of 1000 nodes (1 to 10), then each 1000 node in region of 100 nodes and this 100 node regions in set of 10 nodes. So for example node 2287 will have area 2, 2, 8
15 Operation Overview Write Operation: This are the step for it. 1.Front end node will route the message to the given area. 2.The router in that area will determine which node will be next for insert. 3.This router will follow round robin strategy until all the 10 nodes in the region are full. 4.If region is full, the message will be routers to next available region. Front end node will maintain hash table and when write operation overflows the current region, it will get updated.
16 Read Operation For Read Operation Following are the steps:- 1. Front end nodes use the area to determine the region where message are stored for that queue. 2. They initiate read request to the router for that region to read messages. 3. The reading of message is again done in round robin strategy. 4. This will maintain message order among different storage nodes. 5. If there is overflow of messages to another region, then the queue id is updated in the front end nodes are able to forward the read request to the overflow region.
17 Queue ID Manager Node We also have a queue id manager node that will maintain the list of queue ID. Its job is to generate new ID based on system load. It will also assign the initial area to it. This node is low stress node so we need only 1~3 nodes to manage the system.
18 Refinements Exactly One Delivery Ordering of Message Large Message Size Mirrored Section Behavior
19 Performance Evaluation 20 client M1.xlarge 1 Million Message We found that on an average % of total messages are found in SQS as repeated messages
20 SQS VS HDMQ 20 client M1.xlarge 1 Million Message 20 client m2.4xlarge 1 Million Message 10 Front-end nodes(m1.xlarge) Elastic Load Balancer 20 M3 Double Extra Large Instance m3.xlarge Load Balancer
21 Latency 20 client M1.xlarge 1 Million Message 20 client m2.4xlarge 1 Million Message 10 Front-end nodes 20 M3 Double Extra Large Instance m3.xlarge Load Balancer
22 Throughput 20 client M1.xlarge 1 Million Message 20 client m2.4xlarge 1 Million Message 10 Front-end nodes 20 M3 Double Extra Large Instance m3.xlarge Load Balancer
23 Increasing number of Nodes 32 KB message 20 client M1.xlarge 1 Million Message 20 client m2.4xlarge 1 Million Message 10 Front-end nodes 20 M3 Double Extra Large Instance m3.xlarge Load Balancer
24 Cost Per Request
25 Conclusion The HDMQ adding and retrieving latency is lower than the SQS latency. We also observed that Throughput for adding in HDMQ is little lower than the SQS system but if we implement the router level load balancer then the throughput would be much higher than SQS. We also observed that the average receiving throughput of HDMQ is much more higher than the average throughput of Amazon SQS. If we combine the average throughput of adding and receiving, HDMQ would be much more faster than Amazon SQS. We also observed that the throughput of HDMQ with increasing number of nodes is also higher than the Amazon SQS. We also conclude that the cost for implementing the system right now is little higher as we are implementing the system on top of Amazon Web Services using EC2 instance, but if we have message aware queue and our own private cloud, we can reduce that price by a great amount. We found HDMQ to outperform SQS by up to 10-20% in throughput, 100% in latency, and 50% less in costs.
26 Future Work We will be implementing our own load balancer in future so that our framework is completely independent from Amazon Web Services. We will also implement queue-monitoring service. We will also try to increase the Adding message throughput by implementing local router level load balancer. We will also provide more throughputs still maintaining the reliability by providing asynchronous replication. We also want to design the framework message aware so that it can scale according to the incoming message size. This will not only reduce the cost of operating per request but will also help us to be aware about the right storage node for the incoming messages size so that the system can scale itself. We will also try to configure the number of replicas for the message nodes. As of right now its by default 1 if you start the system with replication.
27 References [1] Dongfang Zhao and Ioan Raicu, Supporting Large Scale Data-Intensive Computing with the FusionFS Distributed File System, 2013 [2] Amazon SQS, [online] 2013, [3] Hedwig, [online] 2013, [4] RabbitMQ in action: distributed messaging for everyone, Videla, Alvaro; Williams, Jason J W, Shelter Island NY : Manning, p. [5] Jay Kreps, Neha Narkhede and Jun Rao, Kafka: a Distributed Messaging System for Log Processing, [6] Snyder, Bruce, Dejan Bosanac, and Rob Davies. "Introduction to Apache ActiveMQ." Active MQ in Action: [7] Couch-RQS, [online] 2013, [8] Iman Sadooghi and Ioan Raicu, CloudKon: a Cloud enabled Distributed task execution framework, 2013 [9] Apache Hedwig [online] 2013,
28 Thank You
29
WSO2 Message Broker. Scalable persistent Messaging System
WSO2 Message Broker Scalable persistent Messaging System Outline Messaging Scalable Messaging Distributed Message Brokers WSO2 MB Architecture o Distributed Pub/sub architecture o Distributed Queues architecture
Big Data Analytics - Accelerated. stream-horizon.com
Big Data Analytics - Accelerated stream-horizon.com StreamHorizon & Big Data Integrates into your Data Processing Pipeline Seamlessly integrates at any point of your your data processing pipeline Implements
Tushar Joshi Turtle Networks Ltd
MySQL Database for High Availability Web Applications Tushar Joshi Turtle Networks Ltd www.turtle.net Overview What is High Availability? Web/Network Architecture Applications MySQL Replication MySQL Clustering
Putting Apache Kafka to Use!
Putting Apache Kafka to Use! Building a Real-time Data Platform for Event Streams! JAY KREPS, CONFLUENT! A Couple of Themes! Theme 1: Rise of Events! Theme 2: Immutability Everywhere! Level! Example! Immutable
BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB
BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB Planet Size Data!? Gartner s 10 key IT trends for 2012 unstructured data will grow some 80% over the course of the next
STeP-IN SUMMIT 2014. June 2014 at Bangalore, Hyderabad, Pune - INDIA. Performance testing Hadoop based big data analytics solutions
11 th International Conference on Software Testing June 2014 at Bangalore, Hyderabad, Pune - INDIA Performance testing Hadoop based big data analytics solutions by Mustufa Batterywala, Performance Architect,
JoramMQ, a distributed MQTT broker for the Internet of Things
JoramMQ, a distributed broker for the Internet of Things White paper and performance evaluation v1.2 September 214 mqtt.jorammq.com www.scalagent.com 1 1 Overview Message Queue Telemetry Transport () is
Amazon Cloud Storage Options
Amazon Cloud Storage Options Table of Contents 1. Overview of AWS Storage Options 02 2. Why you should use the AWS Storage 02 3. How to get Data into the AWS.03 4. Types of AWS Storage Options.03 5. Object
Introducing Storm 1 Core Storm concepts Topology design
Storm Applied brief contents 1 Introducing Storm 1 2 Core Storm concepts 12 3 Topology design 33 4 Creating robust topologies 76 5 Moving from local to remote topologies 102 6 Tuning in Storm 130 7 Resource
Cluster Computing. ! Fault tolerance. ! Stateless. ! Throughput. ! Stateful. ! Response time. Architectures. Stateless vs. Stateful.
Architectures Cluster Computing Job Parallelism Request Parallelism 2 2010 VMware Inc. All rights reserved Replication Stateless vs. Stateful! Fault tolerance High availability despite failures If one
Amazon Web Services Primer. William Strickland COP 6938 Fall 2012 University of Central Florida
Amazon Web Services Primer William Strickland COP 6938 Fall 2012 University of Central Florida AWS Overview Amazon Web Services (AWS) is a collection of varying remote computing provided by Amazon.com.
STREAM PROCESSING AT LINKEDIN: APACHE KAFKA & APACHE SAMZA. Processing billions of events every day
STREAM PROCESSING AT LINKEDIN: APACHE KAFKA & APACHE SAMZA Processing billions of events every day Neha Narkhede Co-founder and Head of Engineering @ Stealth Startup Prior to this Lead, Streams Infrastructure
Benchmarking Couchbase Server for Interactive Applications. By Alexey Diomin and Kirill Grigorchuk
Benchmarking Couchbase Server for Interactive Applications By Alexey Diomin and Kirill Grigorchuk Contents 1. Introduction... 3 2. A brief overview of Cassandra, MongoDB, and Couchbase... 3 3. Key criteria
The CF Brooklyn Service Broker and Plugin
Simplifying Services with the Apache Brooklyn Catalog The CF Brooklyn Service Broker and Plugin 1 What is Apache Brooklyn? Brooklyn is a framework for modelling, monitoring, and managing applications through
Kafka: a Distributed Messaging System for Log Processing
Kafka: a Distributed Messaging System for Log Processing Jay Kreps LinkedIn Corp jkreps@linkedincom Neha Narkhede LinkedIn Corp nnarkhede@linkedincom Jun Rao LinkedIn Corp jrao@linkedincom ABSTRACT Log
Big Data Analytics - Accelerated. stream-horizon.com
Big Data Analytics - Accelerated stream-horizon.com Legacy ETL platforms & conventional Data Integration approach Unable to meet latency & data throughput demands of Big Data integration challenges Based
BASICS OF SCALING: LOAD BALANCERS
BASICS OF SCALING: LOAD BALANCERS Lately, I ve been doing a lot of work on systems that require a high degree of scalability to handle large traffic spikes. This has led to a lot of questions from friends
Appendix A Core Concepts in SQL Server High Availability and Replication
Appendix A Core Concepts in SQL Server High Availability and Replication Appendix Overview Core Concepts in High Availability Core Concepts in Replication 1 Lesson 1: Core Concepts in High Availability
F1: A Distributed SQL Database That Scales. Presentation by: Alex Degtiar ([email protected]) 15-799 10/21/2013
F1: A Distributed SQL Database That Scales Presentation by: Alex Degtiar ([email protected]) 15-799 10/21/2013 What is F1? Distributed relational database Built to replace sharded MySQL back-end of AdWords
Exploring Oracle E-Business Suite Load Balancing Options. Venkat Perumal IT Convergence
Exploring Oracle E-Business Suite Load Balancing Options Venkat Perumal IT Convergence Objectives Overview of 11i load balancing techniques Load balancing architecture Scenarios to implement Load Balancing
Apache Hadoop. Alexandru Costan
1 Apache Hadoop Alexandru Costan Big Data Landscape No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard, except Hadoop 2 Outline What is Hadoop? Who uses it? Architecture HDFS MapReduce Open
Real World Enterprise SQL Server Replication Implementations. Presented by Kun Lee [email protected]
Real World Enterprise SQL Server Replication Implementations Presented by Kun Lee [email protected] About Me DBA Manager @ CoStar Group, Inc. MSSQLTip.com Author (http://www.mssqltips.com/sqlserverauthor/15/kunlee/)
SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications. Jürgen Primsch, SAP AG July 2011
SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications Jürgen Primsch, SAP AG July 2011 Why In-Memory? Information at the Speed of Thought Imagine access to business data,
CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level. -ORACLE TIMESTEN 11gR1
CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level -ORACLE TIMESTEN 11gR1 CASE STUDY Oracle TimesTen In-Memory Database and Shared Disk HA Implementation
Scalable Architecture on Amazon AWS Cloud
Scalable Architecture on Amazon AWS Cloud Kalpak Shah Founder & CEO, Clogeny Technologies [email protected] 1 * http://www.rightscale.com/products/cloud-computing-uses/scalable-website.php 2 Architect
DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing WHAT IS CLOUD COMPUTING? 2
DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing Slide 1 Slide 3 A style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet.
Scaling out a SharePoint Farm and Configuring Network Load Balancing on the Web Servers. Steve Smith Combined Knowledge MVP SharePoint Server
Scaling out a SharePoint Farm and Configuring Network Load Balancing on the Web Servers Steve Smith Combined Knowledge MVP SharePoint Server Scaling out a SharePoint Farm and Configuring Network Load Balancing
Building Scalable Big Data Infrastructure Using Open Source Software. Sam William sampd@stumbleupon.
Building Scalable Big Data Infrastructure Using Open Source Software Sam William sampd@stumbleupon. What is StumbleUpon? Help users find content they did not expect to find The best way to discover new
Cloud DBMS: An Overview. Shan-Hung Wu, NetDB CS, NTHU Spring, 2015
Cloud DBMS: An Overview Shan-Hung Wu, NetDB CS, NTHU Spring, 2015 Outline Definition and requirements S through partitioning A through replication Problems of traditional DDBMS Usage analysis: operational
Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms
Distributed File System 1 How do we get data to the workers? NAS Compute Nodes SAN 2 Distributed File System Don t move data to workers move workers to the data! Store data on the local disks of nodes
Module 14: Scalability and High Availability
Module 14: Scalability and High Availability Overview Key high availability features available in Oracle and SQL Server Key scalability features available in Oracle and SQL Server High Availability High
SWIFT. Page:1. Openstack Swift. Object Store Cloud built from the grounds up. David Hadas Swift ATC. HRL [email protected] 2012 IBM Corporation
Page:1 Openstack Swift Object Store Cloud built from the grounds up David Hadas Swift ATC HRL [email protected] Page:2 Object Store Cloud Services Expectations: PUT/GET/DELETE Huge Capacity (Scale) Always
Lecture 6 Cloud Application Development, using Google App Engine as an example
Lecture 6 Cloud Application Development, using Google App Engine as an example 922EU3870 Cloud Computing and Mobile Platforms, Autumn 2009 (2009/10/19) http://code.google.com/appengine/ Ping Yeh ( 葉 平
Future Internet Technologies
Future Internet Technologies Big (?) Processing Dr. Dennis Pfisterer Institut für Telematik, Universität zu Lübeck http://www.itm.uni-luebeck.de/people/pfisterer FIT Until Now Architectures -Server SPDY
EWeb: Highly Scalable Client Transparent Fault Tolerant System for Cloud based Web Applications
ECE6102 Dependable Distribute Systems, Fall2010 EWeb: Highly Scalable Client Transparent Fault Tolerant System for Cloud based Web Applications Deepal Jayasinghe, Hyojun Kim, Mohammad M. Hossain, Ali Payani
Andes: a highly scalable persistent messaging system
Andes: a highly scalable persistent messaging system Charith Wickramarachchi, Srinath Perera, Shammi Jayasinghe,Sanjiva Weerawarana WSO2 Inc. Mountain View, CA USA {charith, srinath, shammi, sanjiva}@wso2.com
Learning Management Redefined. Acadox Infrastructure & Architecture
Learning Management Redefined Acadox Infrastructure & Architecture w w w. a c a d o x. c o m Outline Overview Application Servers Databases Storage Network Content Delivery Network (CDN) & Caching Queuing
In Memory Accelerator for MongoDB
In Memory Accelerator for MongoDB Yakov Zhdanov, Director R&D GridGain Systems GridGain: In Memory Computing Leader 5 years in production 100s of customers & users Starts every 10 secs worldwide Over 15,000,000
Building a Reliable Messaging Infrastructure with Apache ActiveMQ
Building a Reliable Messaging Infrastructure with Apache ActiveMQ Bruce Snyder IONA Technologies Bruce Synder Building a Reliable Messaging Infrastructure with Apache ActiveMQ Slide 1 Do You JMS? Bruce
Performance Benchmark for Cloud Block Storage
Performance Benchmark for Cloud Block Storage J.R. Arredondo vjune2013 Contents Fundamentals of performance in block storage Description of the Performance Benchmark test Cost of performance comparison
Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase
Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform
TECHNOLOGY WHITE PAPER Jun 2012
TECHNOLOGY WHITE PAPER Jun 2012 Technology Stack C# Windows Server 2008 PHP Amazon Web Services (AWS) Route 53 Elastic Load Balancing (ELB) Elastic Compute Cloud (EC2) Amazon RDS Amazon S3 Elasticache
Hypertable Architecture Overview
WHITE PAPER - MARCH 2012 Hypertable Architecture Overview Hypertable is an open source, scalable NoSQL database modeled after Bigtable, Google s proprietary scalable database. It is written in C++ for
High Throughput Computing on P2P Networks. Carlos Pérez Miguel [email protected]
High Throughput Computing on P2P Networks Carlos Pérez Miguel [email protected] Overview High Throughput Computing Motivation All things distributed: Peer-to-peer Non structured overlays Structured
Tier Architectures. Kathleen Durant CS 3200
Tier Architectures Kathleen Durant CS 3200 1 Supporting Architectures for DBMS Over the years there have been many different hardware configurations to support database systems Some are outdated others
Evaluation of NoSQL databases for large-scale decentralized microblogging
Evaluation of NoSQL databases for large-scale decentralized microblogging Cassandra & Couchbase Alexandre Fonseca, Anh Thu Vu, Peter Grman Decentralized Systems - 2nd semester 2012/2013 Universitat Politècnica
Database Scalability and Oracle 12c
Database Scalability and Oracle 12c Marcelle Kratochvil CTO Piction ACE Director All Data/Any Data [email protected] Warning I will be covering topics and saying things that will cause a rethink in
Performance Testing of Big Data Applications
Paper submitted for STC 2013 Performance Testing of Big Data Applications Author: Mustafa Batterywala: Performance Architect Impetus Technologies [email protected] Shirish Bhale: Director of Engineering
Scalable Internet Services and Load Balancing
Scalable Services and Load Balancing Kai Shen Services brings ubiquitous connection based applications/services accessible to online users through Applications can be designed and launched quickly and
Scalability of web applications. CSCI 470: Web Science Keith Vertanen
Scalability of web applications CSCI 470: Web Science Keith Vertanen Scalability questions Overview What's important in order to build scalable web sites? High availability vs. load balancing Approaches
Amazon EC2 XenApp Scalability Analysis
WHITE PAPER Citrix XenApp Amazon EC2 XenApp Scalability Analysis www.citrix.com Table of Contents Introduction...3 Results Summary...3 Detailed Results...4 Methods of Determining Results...4 Amazon EC2
Intro to AWS: Storage Services
Intro to AWS: Storage Services Matt McClean, AWS Solutions Architect 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved AWS storage options Scalable object storage Inexpensive archive
Network Infrastructure Services CS848 Project
Quality of Service Guarantees for Cloud Services CS848 Project presentation by Alexey Karyakin David R. Cheriton School of Computer Science University of Waterloo March 2010 Outline 1. Performance of cloud
Case study: d60 Raptor smartadvisor. Jan Neerbek Alexandra Institute
Case study: d60 Raptor smartadvisor Jan Neerbek Alexandra Institute Agenda d60: A cloud/data mining case Cloud Data Mining Market Basket Analysis Large data sets Our solution 2 Alexandra Institute The
Design for Failure High Availability Architectures using AWS
Design for Failure High Availability Architectures using AWS Harish Ganesan Co founder & CTO 8KMiles www.twitter.com/harish11g http://www.linkedin.com/in/harishganesan Sample Use Case Multi tiered LAMP/LAMJ
Agenda. Some Examples from Yahoo! Hadoop. Some Examples from Yahoo! Crawling. Cloud (data) management Ahmed Ali-Eldin. First part: Second part:
Cloud (data) management Ahmed Ali-Eldin First part: ZooKeeper (Yahoo!) Agenda A highly available, scalable, distributed, configuration, consensus, group membership, leader election, naming, and coordination
Graph Database Proof of Concept Report
Objectivity, Inc. Graph Database Proof of Concept Report Managing The Internet of Things Table of Contents Executive Summary 3 Background 3 Proof of Concept 4 Dataset 4 Process 4 Query Catalog 4 Environment
Scaling Pinterest. Yash Nelapati Ascii Artist. Pinterest Engineering. Saturday, August 31, 13
Scaling Pinterest Yash Nelapati Ascii Artist Pinterest is... An online pinboard to organize and share what inspires you. Growth March 2010 Page views per day Mar 2010 Jan 2011 Jan 2012 May 2012 Growth
Can the Elephants Handle the NoSQL Onslaught?
Can the Elephants Handle the NoSQL Onslaught? Avrilia Floratou, Nikhil Teletia David J. DeWitt, Jignesh M. Patel, Donghui Zhang University of Wisconsin-Madison Microsoft Jim Gray Systems Lab Presented
Cloud Computing with Microsoft Azure
Cloud Computing with Microsoft Azure Michael Stiefel www.reliablesoftware.com [email protected] http://www.reliablesoftware.com/dasblog/default.aspx Azure's Three Flavors Azure Operating
Online Transaction Processing in SQL Server 2008
Online Transaction Processing in SQL Server 2008 White Paper Published: August 2007 Updated: July 2008 Summary: Microsoft SQL Server 2008 provides a database platform that is optimized for today s applications,
Cloud Computing: Meet the Players. Performance Analysis of Cloud Providers
BASEL UNIVERSITY COMPUTER SCIENCE DEPARTMENT Cloud Computing: Meet the Players. Performance Analysis of Cloud Providers Distributed Information Systems (CS341/HS2010) Report based on D.Kassman, T.Kraska,
Building Scalable Applications Using Microsoft Technologies
Building Scalable Applications Using Microsoft Technologies Padma Krishnan Senior Manager Introduction CIOs lay great emphasis on application scalability and performance and rightly so. As business grows,
Couchbase Server Under the Hood
Couchbase Server Under the Hood An Architectural Overview Couchbase Server is an open-source distributed NoSQL document-oriented database for interactive applications, uniquely suited for those needing
NoSQL Data Base Basics
NoSQL Data Base Basics Course Notes in Transparency Format Cloud Computing MIRI (CLC-MIRI) UPC Master in Innovation & Research in Informatics Spring- 2013 Jordi Torres, UPC - BSC www.jorditorres.eu HDFS
High Availability for Database Systems in Cloud Computing Environments. Ashraf Aboulnaga University of Waterloo
High Availability for Database Systems in Cloud Computing Environments Ashraf Aboulnaga University of Waterloo Acknowledgments University of Waterloo Prof. Kenneth Salem Umar Farooq Minhas Rui Liu (post-doctoral
White Paper. Optimizing the Performance Of MySQL Cluster
White Paper Optimizing the Performance Of MySQL Cluster Table of Contents Introduction and Background Information... 2 Optimal Applications for MySQL Cluster... 3 Identifying the Performance Issues.....
Diagram 1: Islands of storage across a digital broadcast workflow
XOR MEDIA CLOUD AQUA Big Data and Traditional Storage The era of big data imposes new challenges on the storage technology industry. As companies accumulate massive amounts of data from video, sound, database,
High Availability Essentials
High Availability Essentials Introduction Ascent Capture s High Availability Support feature consists of a number of independent components that, when deployed in a highly available computer system, result
Assignment # 1 (Cloud Computing Security)
Assignment # 1 (Cloud Computing Security) Group Members: Abdullah Abid Zeeshan Qaiser M. Umar Hayat Table of Contents Windows Azure Introduction... 4 Windows Azure Services... 4 1. Compute... 4 a) Virtual
Agenda. Enterprise Application Performance Factors. Current form of Enterprise Applications. Factors to Application Performance.
Agenda Enterprise Performance Factors Overall Enterprise Performance Factors Best Practice for generic Enterprise Best Practice for 3-tiers Enterprise Hardware Load Balancer Basic Unix Tuning Performance
FAWN - a Fast Array of Wimpy Nodes
University of Warsaw January 12, 2011 Outline Introduction 1 Introduction 2 3 4 5 Key issues Introduction Growing CPU vs. I/O gap Contemporary systems must serve millions of users Electricity consumed
Preview of Oracle Database 12c In-Memory Option. Copyright 2013, Oracle and/or its affiliates. All rights reserved.
Preview of Oracle Database 12c In-Memory Option 1 The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any
Scaling in the Cloud with AWS. By: Eli White (CTO & Co-Founder @ mojolive) eliw.com - @eliw - mojolive.com
Scaling in the Cloud with AWS By: Eli White (CTO & Co-Founder @ mojolive) eliw.com - @eliw - mojolive.com Welcome! Why is this guy talking to us? Please ask questions! 2 What is Scaling anyway? Enabling
Web Email DNS Peer-to-peer systems (file sharing, CDNs, cycle sharing)
1 1 Distributed Systems What are distributed systems? How would you characterize them? Components of the system are located at networked computers Cooperate to provide some service No shared memory Communication
CLOUD DEVELOPMENT BEST PRACTICES & SUPPORT APPLICATIONS
whitepaper CLOUD DEVELOPMENT BEST PRACTICES & SUPPORT APPLICATIONS - Cloud Development Best Practices and Support Applications CLOUD DEVELOPMENT BEST PRACTICES 1 Cloud-based solutions are increasingly
Social Networks and the Richness of Data
Social Networks and the Richness of Data Getting distributed Webservices Done with NoSQL Fabrizio Schmidt, Lars George VZnet Netzwerke Ltd. Content Unique Challenges System Evolution Architecture Activity
International Journal of Engineering Research & Management Technology
International Journal of Engineering Research & Management Technology March- 2015 Volume 2, Issue-2 Survey paper on cloud computing with load balancing policy Anant Gaur, Kush Garg Department of CSE SRM
Part V Applications. What is cloud computing? SaaS has been around for awhile. Cloud Computing: General concepts
Part V Applications Cloud Computing: General concepts Copyright K.Goseva 2010 CS 736 Software Performance Engineering Slide 1 What is cloud computing? SaaS: Software as a Service Cloud: Datacenters hardware
Cloud Computing Now and the Future Development of the IaaS
2010 Cloud Computing Now and the Future Development of the IaaS Quanta Computer Division: CCASD Title: Project Manager Name: Chad Lin Agenda: What is Cloud Computing? Public, Private and Hybrid Cloud.
GraySort on Apache Spark by Databricks
GraySort on Apache Spark by Databricks Reynold Xin, Parviz Deyhim, Ali Ghodsi, Xiangrui Meng, Matei Zaharia Databricks Inc. Apache Spark Sorting in Spark Overview Sorting Within a Partition Range Partitioner
Amazon EC2 Product Details Page 1 of 5
Amazon EC2 Product Details Page 1 of 5 Amazon EC2 Functionality Amazon EC2 presents a true virtual computing environment, allowing you to use web service interfaces to launch instances with a variety of
Hadoop & Spark Using Amazon EMR
Hadoop & Spark Using Amazon EMR Michael Hanisch, AWS Solutions Architecture 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda Why did we build Amazon EMR? What is Amazon EMR?
Amazon AWS in.net. Presented by: Scott Reed [email protected]
Amazon AWS in.net Presented by: Scott Reed [email protected] Objectives Cloud Computing What Amazon provides Why Amazon Web Services? Q&A Instances Interacting with Instances Management Console Command
Tuning Tableau Server for High Performance
Tuning Tableau Server for High Performance I wanna go fast PRESENT ED BY Francois Ajenstat Alan Doerhoefer Daniel Meyer Agenda What are the things that can impact performance? Tips and tricks to improve
On- Prem MongoDB- as- a- Service Powered by the CumuLogic DBaaS Platform
On- Prem MongoDB- as- a- Service Powered by the CumuLogic DBaaS Platform Page 1 of 16 Table of Contents Table of Contents... 2 Introduction... 3 NoSQL Databases... 3 CumuLogic NoSQL Database Service...
ZooKeeper. Table of contents
by Table of contents 1 ZooKeeper: A Distributed Coordination Service for Distributed Applications... 2 1.1 Design Goals...2 1.2 Data model and the hierarchical namespace...3 1.3 Nodes and ephemeral nodes...
GoGrid Implement.com Configuring a SQL Server 2012 AlwaysOn Cluster
GoGrid Implement.com Configuring a SQL Server 2012 AlwaysOn Cluster Overview This documents the SQL Server 2012 Disaster Recovery design and deployment, calling out best practices and concerns from the
This presentation covers virtual application shared services supplied with IBM Workload Deployer version 3.1.
This presentation covers virtual application shared services supplied with IBM Workload Deployer version 3.1. WD31_VirtualApplicationSharedServices.ppt Page 1 of 29 This presentation covers the shared
LARGE-SCALE DATA STORAGE APPLICATIONS
BENCHMARKING AVAILABILITY AND FAILOVER PERFORMANCE OF LARGE-SCALE DATA STORAGE APPLICATIONS Wei Sun and Alexander Pokluda December 2, 2013 Outline Goal and Motivation Overview of Cassandra and Voldemort
A programming model in Cloud: MapReduce
A programming model in Cloud: MapReduce Programming model and implementation developed by Google for processing large data sets Users specify a map function to generate a set of intermediate key/value
Reference Architecture, Requirements, Gaps, Roles
Reference Architecture, Requirements, Gaps, Roles The contents of this document are an excerpt from the brainstorming document M0014. The purpose is to show how a detailed Big Data Reference Architecture
Cloud Design and Implementation. Cheng Li MPI-SWS Nov 9 th, 2010
Cloud Design and Implementation Cheng Li MPI-SWS Nov 9 th, 2010 1 Modern Computing CPU, Mem, Disk Academic computation Chemistry, Biology Large Data Set Analysis Online service Shopping Website Collaborative
NoSQL Performance Test In-Memory Performance Comparison of SequoiaDB, Cassandra, and MongoDB
bankmark UG (haftungsbeschränkt) Bahnhofstraße 1 9432 Passau Germany www.bankmark.de [email protected] T +49 851 25 49 49 F +49 851 25 49 499 NoSQL Performance Test In-Memory Performance Comparison of SequoiaDB,
<Insert Picture Here> Oracle In-Memory Database Cache Overview
Oracle In-Memory Database Cache Overview Simon Law Product Manager The following is intended to outline our general product direction. It is intended for information purposes only,
