Scalability of Master-Worker Architecture on Heroku

Vibhor Aggarwal, Shubhashis Sengupta, Vibhu Soujanya Sharma, Aravindan Santharam
Accenture Technology Labs

Table of Contents
Synopsis
Introduction
Architecture Overview
Experiment Results
Conclusion

Copyright 2013 Accenture. All rights reserved. Accenture, its logo, and High Performance Delivered are trademarks of Accenture.

Synopsis

Accenture Technology Labs has been developing technology thought leadership, software tools, and frameworks for application life-cycle management for the cloud. One of the core initiatives under this umbrella is the Migration Assessment Tool (MAT), which analyzes legacy applications from the perspectives of technical services, architecture, performance, security, data and scalability for migration to the cloud. As part of evaluating the architectures of legacy applications and refactoring them into a scalable architecture on a target platform as a service (PaaS), Accenture has been exploring various options. This experimental study with Heroku™ was undertaken to examine the performance scalability of applications on PaaS platforms.

Heroku™ (http://www.heroku.com) is a platform as a service designed for scaling both web-facing and backend applications. Accenture has built a master-worker architecture with Heroku™ worker dynos, a message queuing service, and a NoSQL data store service that scaled impressively. The application architecture has been tested on industry-scale master-worker parallel-processing problems, and the results have been promising. An image rendering algorithm based on Monte Carlo integration, written in Java with a complex CPU-bound workload, scaled up to 1024 worker dynos, running for 407 dyno hours with 98% processing efficiency and a total elapsed time of 41 minutes. Such a job would take days to complete on a single symmetric multi-processing (SMP) machine.

Introduction

Parallel processing based on the master-worker pattern has widespread industry applications, especially in high-performance, cluster and grid computing. Examples of such applications are:

- Monte Carlo simulations for finance (e.g. option pricing, Value-at-Risk calculation)
- Mesh algorithms and computational fluid dynamics based applications for the automotive and heavy industries (e.g. crash simulation)
- Image rendering applications for gaming and visualization
- Traditional enterprise batch applications

One of the key advantages of cloud computing is its ability to scale up, virtually without limit, to match the application's needs. While PaaS platforms are widely used for web-facing workloads (serving web portals, content, collaboration and other business processes), they also hold immense possibilities for providing elastic application infrastructure for batch-oriented jobs. For a perfectly parallel application this means that, in theory, any amount of workload can be completed on the cloud in very little time, because the elapsed time shrinks in proportion to the number of nodes added. This is a hugely attractive proposition compared with hosting the application in-house if the application load varies significantly. In-house infrastructure is usually difficult to scale, and the provisioning lag is much higher than with the near-instant scaling options available on the cloud.

Master-worker architecture is frequently employed for distributed computation, where the master acts as the central authority that drives the computation forward. The master is in charge of delegating relevant tasks to the workers, who perform them independently in parallel. The workers typically either do not communicate with each other at all or route their messages through the master. The global state of the computation is therefore generally available at the master, making it an essential entity of the system.
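The delegation described above can be illustrated with a minimal sketch. It is not the Accenture implementation (which was written in Java and ran on separate dynos); it uses Python threads and an in-process queue purely to show the shape of the pattern. The names `run_master_worker`, `work_fn` and `square` are hypothetical, and threads here stand in for worker nodes without providing real CPU parallelism.

```python
import queue
import threading


def run_master_worker(tasks, num_workers, work_fn):
    """Master enqueues tasks; workers pull them and process them independently."""
    task_queue = queue.Queue()
    results = {}                      # global state is kept on the master side
    results_lock = threading.Lock()

    def worker():
        while True:
            item = task_queue.get()
            if item is None:          # sentinel: no more work for this worker
                return
            task_id, payload = item
            outcome = work_fn(payload)
            with results_lock:        # report the result back to the master
                results[task_id] = outcome

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    # The master delegates every task, then sends one sentinel per worker.
    for task_id, payload in enumerate(tasks):
        task_queue.put((task_id, payload))
    for _ in threads:
        task_queue.put(None)

    for t in threads:
        t.join()
    return [results[i] for i in range(len(tasks))]


def square(x):
    """Hypothetical stand-in for a CPU-bound task."""
    return x * x


if __name__ == "__main__":
    print(run_master_worker([1, 2, 3, 4], num_workers=2, work_fn=square))
```

Because workers pull from a shared queue rather than being assigned fixed shares, a faster worker naturally takes more tasks, which is one simple way to obtain load balancing in this pattern.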

Architecture Overview

The Accenture team, with the help of Heroku, Inc. engineers, has tested a scalable master-worker architecture implemented in Java. It uses dynos as master and worker nodes for task processing with an inbuilt load-balancing mechanism, a message queue to facilitate communication between the system entities, and a back-end NoSQL data store where job data is kept. The system architecture is shown in Figure 1. The batch job is broken into individual tasks to be run on the dynos. RabbitMQ was used for communication between the dynos. Input and output data from the task computations were stored in MongoDB along with timestamps to measure the timing of each task.

Figure 1 - Master-worker Architecture on Heroku using RabbitMQ and MongoDB

Two applications were used for running the experiments:

- High-fidelity rendering, the process of generating realistic images from a three-dimensional description of an environment using physically-based material properties of the objects and light source details. The computation is carried out by solving the Rendering Equation using Monte Carlo integration. The image can be subdivided into a set of tiles, which can be rendered in parallel and then combined to form the final image. Two workloads (W1, W2) with different input data were used to study the scalability of rendering.
- Prime number computation. Because the rendering computation is a randomized algorithm, another workload (W3) was studied, which performed a fixed number of computations to calculate the first N prime numbers. Five types of tasks were computed by varying N, and each type was queued 3600 times.
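A rough sketch of what a W3-style task and its timing record might look like is given below. It is an illustration only: the values of N used in the study are not stated, so `HYPOTHETICAL_N_VALUES` is invented, and the function names and record fields are assumptions rather than the schema actually stored in MongoDB. No message queue or database calls are made.

```python
import time

# Assumption: the five values of N are not given in the paper; these are placeholders.
HYPOTHETICAL_N_VALUES = [100, 200, 400, 800, 1600]


def first_n_primes(n):
    """Return the first n primes by trial division (a fixed, deterministic workload)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # Only primes up to sqrt(candidate) need to be tested as divisors.
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def run_task(task_id, n):
    """Run one task and return a record shaped like a document a data store could hold."""
    started = time.time()
    primes = first_n_primes(n)
    finished = time.time()
    return {
        "task_id": task_id,
        "input": {"n": n},
        "output": {"largest_prime": primes[-1] if primes else None},
        "started_at": started,    # timestamps allow per-task timing to be measured
        "finished_at": finished,
    }


def build_task_list(n_values, repeats):
    """Queue each task type `repeats` times, as (task_id, n) pairs."""
    return [(i * repeats + r, n)
            for i, n in enumerate(n_values)
            for r in range(repeats)]


if __name__ == "__main__":
    tasks = build_task_list(HYPOTHETICAL_N_VALUES, repeats=3600)
    print(len(tasks), "tasks; first record:", run_task(*tasks[0]))
```

With five task types queued 3600 times each, the task list holds 18,000 entries, and because each task type does a fixed amount of work, differences in measured time reflect the platform rather than the algorithm.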

Experiment Results

The speedup and efficiency graphs for the three workloads are plotted in Figure 2 for a varying number of dynos. The scalability of all the workloads is almost linear up to 128 dynos (with efficiency close to 100%), after which it becomes sub-linear. The efficiency loss was mainly due to the limitations imposed on the beta version of the RabbitMQ add-on.

Figure 2 - Speedup and Efficiency graphs (x-axis: number of worker dynos, log2 scale, 4 to 512; left y-axis: speedup, log2 scale; right y-axis: wall-time efficiency; series: W1, W2, W3, ideal speedup, and the efficiency of W1, W2 and W3)

The experiments were continued with up to 512 dynos using an add-on instance of RabbitMQ in the native Heroku™ platform. The team then carried out a controlled experiment by hosting the message queue component in a large instance of Amazon EC2™ and ramping up the dynos to 1024 nodes in a controlled manner. The job was completed in 41 minutes with a processing efficiency of nearly 98%, showing excellent scalability of the Heroku platform architecture. The log from the Heroku™ system console (Figure 3) shows that the dynos performed smoothly and that the dyno grid ran the full load gracefully with headroom to spare. This is indeed a very impressive achievement, as anyone who has run and managed a fully loaded 1024-node Unix cluster can attest.

Figure 3 - Snapshot of Heroku™ railgun server instances
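For readers unfamiliar with the two metrics, the sketch below shows one common way to derive speedup and efficiency from elapsed times. The paper does not state its exact formulas or its baseline, so this is an assumption: speedup is extrapolated from a baseline run on a small number of dynos (itself assumed to be 100% efficient), and efficiency is speedup divided by the number of dynos. The timings in the example are invented for illustration and are not the measured results.

```python
def speedup(baseline_time, baseline_dynos, elapsed_time):
    """Speedup relative to one dyno, extrapolated from a baseline run.

    Assumes the baseline run on `baseline_dynos` dynos is perfectly efficient.
    """
    return baseline_dynos * baseline_time / elapsed_time


def efficiency(speedup_value, dynos):
    """Fraction of the ideal (linear) speedup actually achieved."""
    return speedup_value / dynos


if __name__ == "__main__":
    # Hypothetical timings in seconds, keyed by number of dynos (not measured data).
    hypothetical_runs = {4: 6400.0, 128: 200.0, 512: 64.0}
    base_dynos = 4
    base_time = hypothetical_runs[base_dynos]
    for dynos, elapsed in sorted(hypothetical_runs.items()):
        s = speedup(base_time, base_dynos, elapsed)
        print(f"{dynos:4d} dynos: speedup {s:6.1f}, efficiency {efficiency(s, dynos):.0%}")
```

In this made-up example the run on 128 dynos is perfectly linear, while the run on 512 dynos reaches a speedup of 400 and therefore an efficiency of about 78%, which is the kind of sub-linear behaviour the efficiency curves are meant to expose.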

Conclusion

The experiment demonstrated that Accenture's master-worker architecture scales very well on Heroku™'s worker dyno nodes, making use of scalable Advanced Message Queuing Protocol (AMQP) servers for task management and communication. Prime-time cluster and grid-based backend jobs can be shifted to the Heroku™ PaaS cloud, keeping the cost of ownership fairly low.

About Accenture

Accenture is a global management consulting, technology services and outsourcing company, with more than 259,000 people serving clients in more than 120 countries. Combining unparalleled experience, comprehensive capabilities across all industries and business functions, and extensive research on the world's most successful companies, Accenture collaborates with clients to help them become high-performance businesses and governments. The company generated net revenues of US$27.9 billion for the fiscal year ended Aug. 31, 2012. Its home page is www.accenture.com.

Copyright 2013 Accenture. All rights reserved. Accenture, its logo, and High Performance Delivered are trademarks of Accenture. This document makes descriptive reference to trademarks that may be owned by others. The use of such trademarks herein is not an assertion of ownership of such trademarks by Accenture and is not intended to represent or imply the existence of an association between Accenture and the lawful owners of such trademarks.