Scalability of Master-Worker Architecture on Heroku




Vibhor Aggarwal, Shubhashis Sengupta, Vibhu Soujanya Sharma, Aravindan Santharam
Accenture Technology Labs

Table of Contents
Synopsis
Introduction
Architecture Overview
Experiment Results
Conclusion

Copyright 2013 Accenture. All rights reserved. Accenture, its logo, and High Performance Delivered are trademarks of Accenture.

Synopsis

Accenture Technology Labs has been developing thought leadership, software tools, and frameworks for application life-cycle management in the cloud. One of the core initiatives under this umbrella is the Migration Assessment Tool (MAT), which analyzes legacy applications from the perspectives of technical services, architecture, performance, security, data, and scalability for migration to the cloud. As part of evaluating the architectures of legacy applications and refactoring them into a scalable architecture on a target platform as a service (PaaS), Accenture has been exploring various options. This experimental study with Heroku™ examines the performance scalability of applications on PaaS platforms.

Heroku™ (http://www.heroku.com) is a platform as a service for scaling web-facing and backend applications. Accenture has built a master-worker architecture that uses Heroku™ worker dynos, a message queuing service, and a NoSQL data store service, and it scaled well. The architecture was tested on industry-scale master-worker parallel-processing problems, and the results are promising. An image rendering algorithm based on Monte Carlo integration, written in Java with a complex CPU-bound workload, scaled up to 1024 worker dynos, running for 407 dyno hours with 98% processing efficiency and a total elapsed time of 41 minutes. Such a job would take days to complete on a single symmetric multi-processing (SMP) machine.

Introduction

Parallel processing based on the master-worker pattern has widespread industry applications, especially in high-performance, cluster, and grid computing. Examples of such applications are:

- Monte Carlo simulations for finance (e.g. option pricing, Value-at-Risk calculation)
- Mesh algorithms and computational fluid dynamics applications for the automotive and heavy industries (e.g. crash simulation)
- Image rendering applications for gaming and visualization
- Traditional enterprise batch applications

One of the key advantages of cloud computing is its ability to scale up, in principle without limit, to match application needs. While PaaS platforms are widely used for web-facing workloads (serving web portals, content, collaboration, and other business processes), they also hold considerable promise as elastic application infrastructure for batch-oriented jobs. For a perfectly parallel application, this elasticity means that, theoretically, any amount of workload can be completed on the cloud in almost no time, provided enough workers are added. This is a hugely attractive proposition compared to hosting the application in-house if the application load varies significantly. In-house infrastructure is usually difficult to scale, and the provisioning lag is much higher than with the near-instant scaling options available on the cloud.

The master-worker architecture is frequently employed for distributed computation, where the master acts as the central authority that drives the computation forward. The master is in charge of delegating tasks to the workers, which perform them independently and in parallel. The workers typically either do not communicate with each other at all or route their messages through the master. Therefore, the global state of the computation is generally available at the master, making it an essential entity of the system.
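The delegation pattern described above can be illustrated with a minimal single-process sketch. This is not the implementation used in the study (which was written in Java and ran on Heroku dynos): here threads stand in for worker dynos and an in-memory `queue.Queue` stands in for the message queue. The names `worker`, `run_master`, and `square` are hypothetical and chosen only for illustration.

```python
import queue
import threading


def worker(tasks, results, compute):
    """Pull tasks until a None sentinel arrives; never talk to other workers."""
    while True:
        item = tasks.get()
        if item is None:          # sentinel from the master: no more work
            break
        task_id, payload = item
        results.put((task_id, compute(payload)))


def run_master(payloads, compute, num_workers=4):
    """Delegate one task per payload and collect results in input order."""
    tasks, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results, compute))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for task_id, payload in enumerate(payloads):
        tasks.put((task_id, payload))
    for _ in threads:             # one sentinel per worker
        tasks.put(None)
    for t in threads:
        t.join()
    # Only the master sees the global state: every result, keyed by task id.
    collected = dict(results.get() for _ in range(len(payloads)))
    return [collected[i] for i in range(len(payloads))]


def square(x):
    """Stand-in for a real CPU-bound task."""
    return x * x
```

For example, `run_master([1, 2, 3], square)` returns `[1, 4, 9]`. Because idle workers pull the next task from a shared queue, faster workers naturally take on more tasks, which is one simple way to obtain load balancing; the sketch omits error handling, so a task that raises would leave the master waiting.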

Architecture Overview

The Accenture team, with the help of Heroku, Inc. engineers, tested a scalable master-worker architecture implemented in Java. It uses dynos as master and worker nodes for task processing with an inbuilt load-balancing mechanism, a message queue to facilitate communication between the system entities, and a back-end NoSQL data store where job data is kept. The system architecture is shown in Figure 1. The batch job is broken into individual tasks to be run on the dynos. RabbitMQ was used for communication between the dynos. Input and output data from the task computations were stored in MongoDB, along with timestamps to measure the timing of each task.

Figure 1 - Master-worker Architecture on Heroku using RabbitMQ and MongoDB (diagram labels: RabbitMQ, Master, MongoDB)

Two applications were used for running the experiments:

- High-fidelity rendering. This is the process of generating realistic images from a three-dimensional description of an environment, using physically based material properties of the objects and light source details. The computation is carried out by solving the Rendering Equation using Monte Carlo integration. The image can be subdivided into a set of tiles that are rendered in parallel, and the results are then combined to form the final image. Two workloads (W1, W2) with different input data were used to study the scalability of rendering.
- Prime number computation. Because the rendering computation is a randomized algorithm, another workload (W3) was studied that performs a fixed number of computations to calculate the first N prime numbers. Five types of tasks were defined by varying N, and each type was queued 3600 times.
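As a rough illustration of how such a batch job might be broken into tasks, the sketch below splits an image into tiles, defines a deterministic prime-number task similar in spirit to W3, and wraps a task so that its start and end timestamps are recorded with its output. The function names, the tile layout, the trial-division method, and the record fields are all assumptions made for illustration; the source does not describe the actual task format or the schema stored in MongoDB.

```python
import time


def split_into_tiles(width, height, tile_size):
    """Return (x, y, w, h) tiles that together cover a width x height image."""
    tiles = []
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            tiles.append(
                (x, y, min(tile_size, width - x), min(tile_size, height - y))
            )
    return tiles


def first_n_primes(n):
    """Deterministic CPU-bound task: the first n primes by trial division."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # candidate is prime if no known prime up to its square root divides it
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def run_timed(task_id, func, *args):
    """Run one task and return a record holding its output and timestamps."""
    started = time.time()
    output = func(*args)
    finished = time.time()
    # Hypothetical record layout; a dict like this could be stored as one document.
    return {
        "task_id": task_id,
        "output": output,
        "started": started,
        "finished": finished,
    }
```

For example, `split_into_tiles(100, 60, 32)` yields 8 tiles (4 columns by 2 rows, with narrower tiles at the right and bottom edges), and `first_n_primes(5)` returns `[2, 3, 5, 7, 11]`. Each tile, or each value of N, would then become one independent task on the queue.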

Experiment Results

The speedup and efficiency graphs for the three workloads are plotted in Figure 2 for a varying number of dynos. The scalability of all the workloads is almost linear up to 128 dynos (with efficiency close to 100%), after which it becomes sub-linear. The efficiency loss was mainly due to the limitations imposed on the beta version of the RabbitMQ add-on.

Figure 2 - Speedup and Efficiency graphs (x-axis: Number of Workers, log 2 scale, 4 to 512; left y-axis: Speedup, log 2 scale; right y-axis: Wall-time Efficiency; series: W1, W2, W3, Ideal Speedup, W1 - Efficiency, W2 - Efficiency, W3 - Efficiency)

The experiments were continued with up to 512 dynos using an add-on instance of RabbitMQ on the native Heroku™ platform. The team then carried out a controlled experiment by hosting the message queue component on a large Amazon EC2™ instance and ramping up the dynos to 1024 nodes in a controlled manner. The job was completed in 41 minutes with a processing efficiency of nearly 98%, showing excellent scalability of the Heroku platform architecture. The log from the Heroku™ system console (Figure 3) shows that the dynos performed smoothly and that the dyno grid ran the full load gracefully with headroom to spare. This is a very impressive achievement (ask anyone who has run and managed a fully loaded 1024-node Unix cluster).

Figure 3 - Snapshot of Heroku™ railgun server instances
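The source does not state how speedup and efficiency were computed. The sketch below assumes the standard definitions, speedup S(n) = T(1) / T(n) and efficiency E(n) = S(n) / n, where T(n) is the wall-clock time on n workers. The function names and the example timings are hypothetical and are not measurements from the study.

```python
def speedup(t_one, t_n):
    """Speedup S(n) = T(1) / T(n)."""
    return t_one / t_n


def efficiency(t_one, t_n, n):
    """Efficiency E(n) = S(n) / n; 1.0 means perfectly linear scaling."""
    return speedup(t_one, t_n) / n


def implied_speedup(eff, n):
    """Speedup implied by a given efficiency on n workers: S(n) = E(n) * n."""
    return eff * n


# Illustrative timings only (not measurements from the study):
# a job taking 1000 s on one worker and 10 s on 128 workers.
example_speedup = speedup(1000.0, 10.0)             # 100.0
example_efficiency = efficiency(1000.0, 10.0, 128)  # 0.78125

# Under these assumed definitions, the reported 98% efficiency on
# 1024 dynos would correspond to a speedup of roughly 1003.5.
reported_case = implied_speedup(0.98, 1024)
```

On a plot like Figure 2, linear scaling corresponds to the measured speedup curve lying on the ideal-speedup line and the efficiency curve staying near 100%.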

Conclusion

The experiment demonstrated that Accenture's master-worker architecture scales very well on Heroku™'s worker dyno nodes, making use of scalable Advanced Message Queuing Protocol (AMQP) servers for task management and communication. Prime-time cluster- and grid-based backend jobs can be shifted to the Heroku™ PaaS cloud while keeping the cost of ownership fairly low.

About Accenture

Accenture is a global management consulting, technology services and outsourcing company, with more than 259,000 people serving clients in more than 120 countries. Combining unparalleled experience, comprehensive capabilities across all industries and business functions, and extensive research on the world's most successful companies, Accenture collaborates with clients to help them become high-performance businesses and governments. The company generated net revenues of US$27.9 billion for the fiscal year ended Aug. 31, 2012. Its home page is www.accenture.com.

Copyright 2013 Accenture. All rights reserved. Accenture, its logo, and High Performance Delivered are trademarks of Accenture. This document makes descriptive reference to trademarks that may be owned by others. The use of such trademarks herein is not an assertion of ownership of such trademarks by Accenture and is not intended to represent or imply the existence of an association between Accenture and the lawful owners of such trademarks.