COMPUTING SCIENCE

Designing a Distributed File-Processing System Using Amazon Web Services

Students of MSc SDIA 2008-09

TECHNICAL REPORT SERIES
No. CS-TR-1198, April 2010

© 2010 University of Newcastle upon Tyne. Printed and published by the University of Newcastle upon Tyne, Computing Science, Claremont Tower, Claremont Road, Newcastle upon Tyne, NE1 7RU, England.

Bibliographical details

STUDENTS OF MSC SDIA 2008-09. Designing a Distributed File-Processing System Using Amazon Web Services. Newcastle upon Tyne: University of Newcastle upon Tyne, Computing Science, 2010. (University of Newcastle upon Tyne, Computing Science, Technical Report Series, No. CS-TR-1198)

Added entries: UNIVERSITY OF NEWCASTLE UPON TYNE, Computing Science, Technical Report Series, CS-TR-1198.

About the authors

The authors are MSc SDIA students, 2008-09. They describe here the best possible design for the group project specification on which they worked as three independent groups. The reported design combines the best features of their three group designs and seeks to avoid the worst of each.

Suggested keywords: CLOUD COMPUTING, QUEUEING, MESSAGING, RELIABILITY, DESIGN, IMPLEMENTATION AND TESTING.

Designing a Distributed File-Processing System Using Amazon Web Services

MSc System Design for Internet Applications (2008-2009) Technical Report

Contributors: Matthew Clark, Qublai K. Ali Mirza, Raneesha Manoharan, Winai Wongthai, Jeremy R. Whiting, Ian Dundas, Matthew Brook, Iliana Filippou, Kristel Caisip, Olufunke John-Oluleye

Abstract

Computing on the cloud is about building Internet applications utilising the computing services and resources provided by a third party. With companies such as Amazon, Google and Microsoft offering inexpensive utility computing services, the cloud is becoming an increasingly attractive option for hosting enterprise applications. This report proposes an architecture and design for hosting one such application, a distributed file-processing system, on the Amazon Web Services (AWS) platform. The proposed design combines the best features of the work carried out by three MSc SDIA groups as part of their Group Project module. It is extensible and scalable, and it addresses the challenges of ensuring at-most-once processing semantics.

1. Problem Specification and Challenges

We will use some or all of the following Amazon Web Services to develop a distributed file-processing application:

- Simple Queue Service (SQS) - a "reliable, highly scalable, hosted queue" for messaging between distributed application components
- Simple Storage Service (S3) - "highly scalable, reliable, fast, inexpensive data storage"
- Elastic Compute Cloud (EC2) - "resizable compute capacity in the cloud"

1.1 Specification of the file-processing application

The basic file-processing application has the following structure. A controller process adds file-processing jobs to a queue. Worker processes take jobs from the queue, retrieve a file from an input store, perform the file processing and store the result in an output store. The nature of the processing is not important as long as it is reasonably compute-intensive. Examples include conversion of PDF files to PostScript, conversion of large image files to thumbnails, encryption or decryption of input files, and conversion between different audio formats. An example of PDF file conversion would be for a controller process to identify all the technical reports in PDF on the School's web site for worker processes to convert to PostScript.

There are two mandatory aspects to the application:

1. There must be a controller process that places jobs on a queue and two or more separate worker processes that perform the file processing.
2. Amazon SQS must be used as the queuing service to communicate between processes.
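To make the mandatory structure concrete, here is a minimal sketch of the controller/worker pattern in Python with boto3, a present-day AWS SDK used for brevity. The queue name, bucket names and the process_file stub are illustrative assumptions, not part of the specification.

    import boto3

    sqs = boto3.client("sqs")
    s3 = boto3.client("s3")
    QUEUE_URL = sqs.create_queue(QueueName="file-jobs")["QueueUrl"]  # hypothetical name

    def process_file(data: bytes) -> bytes:
        """Placeholder for the compute-intensive task, e.g. PDF to PostScript."""
        return data

    def controller(keys):
        """Place one file-processing job per input file key onto the queue."""
        for key in keys:
            sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=key)

    def worker():
        """Take jobs from the queue, fetch input from S3, process, store output."""
        while True:
            resp = sqs.receive_message(QueueUrl=QUEUE_URL,
                                       MaxNumberOfMessages=1, WaitTimeSeconds=20)
            for msg in resp.get("Messages", []):
                key = msg["Body"]
                obj = s3.get_object(Bucket="input-store", Key=key)   # hypothetical bucket
                result = process_file(obj["Body"].read())
                s3.put_object(Bucket="output-store", Key=key + ".out", Body=result)
                # Delete only once the result is stored; a crash before this point
                # leaves the message to reappear for another worker.
                sqs.delete_message(QueueUrl=QUEUE_URL,
                                   ReceiptHandle=msg["ReceiptHandle"])

Two or more workers would simply run worker() concurrently against the same queue.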

Guarantee at-most-once file processing

SQS guarantees at-least-once message delivery; that is, message consumers may see duplicate messages. For this extension, modify the application to detect duplicate messages and guarantee global at-most-once file processing.

Handle multiple job types

Extend the application to support multiple job types (for example, file-format conversion and encryption). To do this you may use job descriptors in the queue messages, or multiple queues, one for each job type (see the sketch below). As a further extension, use multiple queues and additional worker processes to implement pipelining (for example, file-format conversion followed by encryption). In this case an intermediate controller process could enqueue jobs for a second set of worker processes, using a queue to link two instances of the application structure shown above.

Use Amazon S3

Use the Amazon S3 service for storage of output files, and possibly for input storage.

Use Amazon EC2

Migrate the worker processes to Amazon EC2.

The remainder of the report is structured as follows. Section 1.2 describes the Amazon Web Services components and their design challenges, Section 2 presents our design and assumptions, Section 3 gives guidelines on implementing a prototype, Section 4 evaluates the design, and Section 5 concludes.
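The job-type extension can be sketched as a routing table from job type to queue (Python with boto3 again; the queue names are hypothetical). The alternative is a single queue whose message bodies carry a job-type descriptor.

    import boto3

    sqs = boto3.client("sqs")

    # One queue per job type; each worker pool services exactly one queue.
    QUEUES = {
        "convert": sqs.create_queue(QueueName="jobs-convert")["QueueUrl"],
        "encrypt": sqs.create_queue(QueueName="jobs-encrypt")["QueueUrl"],
    }

    def enqueue(job_type: str, file_key: str) -> None:
        """Route a job to the queue for its type."""
        sqs.send_message(QueueUrl=QUEUES[job_type], MessageBody=file_key)

    # Pipelining: a worker that completes a "convert" job acts as the
    # intermediate controller by enqueueing the result onto the next queue:
    # enqueue("encrypt", converted_file_key)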

Design Challenges

1.2 Amazon Web Services

All Amazon Web Services components can be referenced by a Uniform Resource Locator (URL). This allows our system to reference components dynamically at runtime, which permits arbitrary component configuration and a simpler design.

1.2.1 AWS components

Simple Queue Service (SQS)

Amazon's SQS is a distributed queue that stores messages as strings. A queue removes the requirement that client and server operate at the same time, making the system tolerant of server crashes and network delays. Because SQS is distributed, messages are stored and replicated across many servers internal to Amazon, although this complexity is hidden from the user behind an abstract public interface. Amazon uses a weighted distribution algorithm to store and retrieve messages from a subset of its servers. This immediately raises the issue of synchronising message state across multiple servers: the service is probabilistic in nature and provides at-least-once message delivery. According to Amazon, this approach is taken to achieve redundancy and high availability. To achieve at-most-once semantics within our file-processing system, a stronger message-delivery guarantee must therefore be implemented within the application. At-most-once message delivery is a common problem and, as a consequence, is well understood and relatively easy to implement.

For the following example, assume that a message M has already been put onto a queue and that the servers have synchronised, so that every server in the example holds M. (Not all servers in Amazon hold M.) The interaction sequence below, originally accompanied by a diagram, shows the resulting state after a series of interactions between a client (C) and Amazon SQS.

Interaction sequence

i) The client requests a message from the queue. /* SQS queries a subset of servers; the servers are synchronised and the message is locked according to its visibility timeout */
ii) A server returns message M with a receipt.
iii) The client deletes message M from the queue. /* SQS servers synchronise */
iv) The client requests a message from the queue.
v) SQS (server S4) sends message M to the client. /* SQS servers synchronise */

Note: events marked /* <event> */ are internal (non-visible) events.

Problem: the client receives message M twice. The query subsets are disjoint, and SQS servers cannot be guaranteed to be synchronised between state changes.

Reason: this is a consequence of the weighted distribution algorithm SQS uses to choose nodes when attempting to retrieve a message from the queue; the result is a false positive. A further consequence of this approach is the false negative, where a message is not retrieved even though it exists on the queue, because the subset of servers holding a copy of the message is disjoint from the query subset. This is less of a problem, since the client can simply keep trying; eventually the servers synchronise, with no effect on the client other than possibly wasted time. (A client-side sketch of this receive/delete interaction is given at the end of this subsection.)

Simple Storage Service (S3)

Amazon provides web-accessible, abstracted file storage in the form of buckets, each of which can contain an unbounded number of objects of up to 5 GB each.

Elastic Compute Cloud (EC2)

EC2 is a virtual dedicated-server hosting platform that runs Virtual Machine instances of Amazon Machine Images (AMIs) based on a user-generated image bundle.
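Returning to the interaction above, its visible half can be sketched client-side (same boto3 assumptions as before). The receipt handle returned in step ii) is what authorises the delete in step iii); until the delete has propagated, a later receive may return M again, as in step v), so a consumer must tolerate repeated message bodies.

    import boto3

    sqs = boto3.client("sqs")
    queue_url = sqs.create_queue(QueueName="demo-queue")["QueueUrl"]  # hypothetical

    sqs.send_message(QueueUrl=queue_url, MessageBody="M")

    # i) Request a message. SQS hides (rather than removes) the returned message
    # from other consumers for VisibilityTimeout seconds.
    resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1,
                               VisibilityTimeout=30)
    msg = resp["Messages"][0]  # ii) message M plus its receipt handle
                               # (a real consumer must handle an empty response)

    # iii) Explicit delete using the receipt handle. Before replicas synchronise,
    # a further receive_message call may still return M: the duplicate of step v).
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])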

2. Design

2.1 Proposed Architecture

Figure 1: Overall System Architecture. [Figure not reproduced.]

The system is message-oriented, in that components are directed by the messages they receive: messages in this system describe Jobs. The Controller (C) is responsible for generating XML job messages and sending them to the correct queue. The diagram shows a single conceptual Controller; in reality this could be a pool of Controllers. No inter-communication between Controllers would be required, although it might be useful to report Job progress into a shared database.

Amazon SQS queues are specific to a particular job type or task (T_i), which means there is a 1:1 relationship between queue type and worker type (W_i). There is a special queue called the result queue (R) which Controllers can poll to retrieve Job results, e.g. success or failure.

Each worker retrieves jobs from the single queue it is bound to, and executes on the EC2 platform. This means that groups of workers can be organised into clusters around a single queue, with the number of workers dynamically re-assigned by the Controller depending on queue load. Note that there is little architectural difference between the Controller and a Worker, as both service queues: workers may be capable of executing controller tasks as well as queue tasks, and only the runtime configuration differentiates them.

The final component is the Co-ordinator (Co), to which all workers are connected. Its role is to co-ordinate worker activities in order to achieve at-most-once file processing. Each of these components is discussed in more detail below.

In practice, there are some benefits to using identical workers, such as the lower cost of maintenance operations like upgrading and the AMI re-bundling this requires. When conceiving of the abstract design, however, it is better to think of workers as task-specific.

2.2 Design Assumptions

1. Amazon Web Services has inherent limitations, and so is not a suitable platform where full fault tolerance is required. By accepting that the platform will not be used in a fault-tolerant context, we can simplify the design by assuming that the Controller never fails, which makes the design cleaner: there is no need to persist Controller state.

2.3 Components

Controller

The Controller starts the Worker EC2 instances. It then uploads the files to be processed into S3, creates the Job XML and pushes it onto the correct SQS queue. The Controller polls the result queue to determine the outcome of executed jobs.

Worker

The Worker implements a single task and is bound to a single queue type. It needs to be able to check with the Co-ordinator that a job has not already been processed. An alternative approach would be a single type of worker able to perform multiple task types, but the former design gives a clearer association between task type and worker.

Co-ordinator

The Co-ordinator is accessed by all workers and is responsible for keeping track of processed jobs. This compensates for the at-least-once (ALO) message delivery semantics of SQS: it ensures that if two workers receive the same Job XML, only one of them actually processes the job. This gives at-most-once (AMO) semantics, satisfying the design requirement for job processing. If the Co-ordinator becomes unavailable, the worker processes should wait until it becomes available again.

Figure 2: A layered architecture highlighting the level of message delivery each layer provides to the layer above it. The Co-ordinator, which executes in an ALO environment, provides AMO to the workers above. [Figure not reproduced.]

Worker crashes: retry count and timeout

The retry count limits the number of times a Job is attempted, preventing a message from remaining on the queue in an indefinite cycle. If a job fails on every attempt, to the point that the retry count reaches zero, the message expires and is removed from the queue. The timeout is the amount of time that must pass before a worker may re-attempt a failed job. The Co-ordinator maintains both the retry count and the timeout for each job (a sketch follows at the end of this subsection).

The system therefore provides some failure-recovery mechanisms for the job lifecycle, but does not guarantee full fault tolerance. Problems inevitably remain with sequences of operations that one would like to perform atomically; an example is the following sequence executed at the worker: remove the message from the queue, remove the entry from the database, send the result. Ensuring consistent global state would require atomic transactions for a fully fault-tolerant solution. Transactions, however, do not fit the requirements we are trying to meet: as mentioned earlier, AWS is not suitable for such requirements, and purpose-built transactional services should be used instead. Moreover, this application does not in general need such a high level of consistency; for example, it does not matter semantically if a file is occasionally processed more than once, as this only wastes CPU cycles.
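As flagged above, here is a sketch of how the Co-ordinator might track retry counts and timeouts per job. The data model, limits and function names are assumptions; an in-memory dict stands in for the database discussed below.

    import time

    RETRY_LIMIT = 3        # assumed maximum attempts per job
    RETRY_TIMEOUT = 60.0   # assumed seconds before a failed job may be retried

    jobs = {}  # guid -> {"retries_left": int, "not_before": float}

    def may_attempt(guid: str) -> bool:
        """Called by a worker before (re-)attempting a job."""
        entry = jobs.setdefault(guid, {"retries_left": RETRY_LIMIT,
                                       "not_before": 0.0})
        if entry["retries_left"] <= 0:
            return False  # retry count exhausted: let the message expire
        if time.time() < entry["not_before"]:
            return False  # the per-job timeout has not yet elapsed
        return True

    def record_failure(guid: str) -> None:
        """Called when a worker reports that a job attempt failed."""
        entry = jobs[guid]
        entry["retries_left"] -= 1
        entry["not_before"] = time.time() + RETRY_TIMEOUT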

An example Co-ordinator implementation could use a database to handle concurrent changes to records, preventing race conditions.
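A minimal sketch of the check-and-claim step such a database makes possible, here using a DynamoDB conditional write (DynamoDB post-dates this project; at the time a relational database or SimpleDB would have played this role, and the table name and key are assumptions). The conditional put is atomic, so if two workers race to claim the same GUID, exactly one succeeds.

    import boto3
    from botocore.exceptions import ClientError

    dynamodb = boto3.client("dynamodb")
    TABLE = "processed-jobs"  # hypothetical table with string partition key "guid"

    def try_claim(guid: str) -> bool:
        """Atomically record the job as claimed; returns False if another
        worker claimed it first (i.e. this is a duplicate delivery)."""
        try:
            dynamodb.put_item(
                TableName=TABLE,
                Item={"guid": {"S": guid}},
                ConditionExpression="attribute_not_exists(guid)",
            )
            return True
        except ClientError as e:
            if e.response["Error"]["Code"] == "ConditionalCheckFailedException":
                return False
            raise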

2.4 Message format

The system is driven by Job XML, which is passed between the components and is understood by any worker; the direction and order in which messages are passed drive the flow of state transitions. The Job XML unfolds to release nested child XML elements as workers process it. These children can be added back onto a Job queue because a child has the same structure as its parent: the Job XML is in this way consumed and transformed recursively. This enables sequential and parallel tasks to be defined by the Controller that creates the original Job XML, simply by controlling the hierarchy of the Job XML.

2.4.1 Job XML

Figure 3: Structure of the Job-XML messages. [Figure not reproduced; a hypothetical example is sketched at the end of this subsection.]

When a worker completes the parent job, it discards the parent <job> and moves the child <job> into the parent's place within the XML structure, so that the child now lies directly beneath <pipeline>. A job can have multiple child <job>s, so if there are multiple children, multiple XML documents are produced during this step, each with the same pipeline Globally Unique Identifier (GUID); we call this a divergence of the pipeline. The child jobs are placed back onto the corresponding job queues, to be consumed by any free worker. In this way, tasks are performed upon resources in a sequential manner.

The GUID is a hash identifier generated by the Controller: a 40-character string that uniquely identifies each processing job. The identifier stays with the job as it progresses through the system's pipelines. Each message is set with a status value: completed, error or failure.
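The original figure being unavailable, the following is a hypothetical reconstruction of a Job XML message consistent with the description above. The element and attribute names, buckets and keys are assumptions; only the <pipeline>/<job> nesting, the version attribute and the GUID come from the report.

    <?xml version="1.0" encoding="utf-8"?>
    <!-- Hypothetical Job XML: a two-stage pipeline (convert, then encrypt). -->
    <pipeline guid="sdle5j3j3ljsef0idfmsdfsdfsldkflksdfdsfsd">
      <job type="convert" expected-task-version="1.2">
        <input bucket="input-store" key="report.pdf"/>
        <output bucket="output-store" key="report.ps"/>
        <!-- On completion the worker discards this parent <job> and promotes
             the child <job> to sit directly beneath <pipeline>. -->
        <job type="encrypt" expected-task-version="1.0">
          <input bucket="output-store" key="report.ps"/>
          <output bucket="output-store" key="report.ps.enc"/>
        </job>
      </job>
    </pipeline>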

2.4.2 Pipelining and divergence

The design allows for pipelining, with the capability of divergence. One may wonder why convergence is not included: it is a difficult problem to solve in general and is itself a major topic of research. Divergence, on the other hand, is easier to implement and makes sense in the domain of file processing. It can be implemented by simply splitting the workflow in two, generating two child XML documents from a single parent and adding them onto different queues. As an example, consider an input image that is to be converted and then compressed: if the user wishes to have two different image formats, the job diverges at that point.

2.4.3 Versioning XML

Tasks can place base requirements upon the worker; that is, Job XML can contain a task version number which demands that the worker be patched to at least that version. Each update ships with a versions.xml file which records the tasks the distribution supports and the code version of each. The expected task versions in the Job XML are checked against this list before a job is processed, to make sure that a worker does not attempt a job it cannot fully support. This has two effects: first, it prevents an older version of the code from processing a task; second, it alerts the worker that it is out of date, triggering a self-refresh of its code base. (A hypothetical versions.xml is sketched after the Result XML below.)

2.4.4 Result XML

The XML added by the workers onto the result queue:

    <?xml version="1.0" encoding="utf-8"?>
    <!-- This document is a sample demonstrating result queue message content. -->
    <!-- job_result is the root node of the document. -->
    <!-- The processed date-time uses the format yyyy-MM-dd HH:mm:ss. -->
    <job_result processed_date_time="2009-02-01 01:01:01Z">
      <!-- pipeline_stage_id identifies the stage the job has reached. -->
      <pipeline_stage_id>1</pipeline_stage_id>
      <!-- result_status indicates the result when an EC2 worker attempted to
           process the pipeline task. Values are: FAIL, ERROR, SUCCESS. -->
      <result_status>SUCCESS</result_status>
      <!-- The unique hash identifier for this job. -->
      <guid>sdle5j3j3ljsef0idfmsdfsdfsldkflksdfdsfsd</guid>
      <!-- An optional tag holding details of the message; empty if the status
           is SUCCESS. The body is wrapped in a CDATA section. -->
      <message><![CDATA[Diagnostic data in case of problems]]></message>
    </job_result>
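As promised in Section 2.4.3, a hypothetical versions.xml is sketched below; the report does not give its format, so the element names are assumptions. A worker compares each job's expected task version against this list before accepting the job.

    <?xml version="1.0" encoding="utf-8"?>
    <!-- Hypothetical versions.xml: the tasks this worker distribution
         supports, and the code version of each. -->
    <versions>
      <task name="convert" version="1.2"/>
      <task name="encrypt" version="1.0"/>
    </versions>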

2.4.5 Globally Unique Identifier

The globally unique identifier (GUID) is a hash identifier generated by the Controller. As noted above, it is a 40-character string that uniquely identifies the job and stays with the job as it progresses through the system's pipelines. Each result message is set with a status value (completed, error or failure), and Result XML is sent at each intermediate stage of the pipeline. (One plausible construction of the GUID is sketched at the end of this subsection.)

2.4.6 Amazon Elastic Compute Cloud

We used our own private Amazon Machine Image (AMI) bundle so that we could customise it with our own configuration. Customising the AMI involved installing software and adding our own components for executing tasks. To make the deployment of components easier, we configured Workers to download the latest version of the code on startup.
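The report does not say how the 40-character GUID was derived, but the length suggests a SHA-1 hex digest. The sketch below is one plausible construction; the choice of inputs to the hash is an assumption.

    import hashlib
    import time
    import uuid

    def make_guid(input_key: str) -> str:
        """Derive a 40-character job identifier as a SHA-1 hex digest over the
        input file key plus a unique salt, so two jobs on the same file get
        distinct GUIDs."""
        salt = f"{uuid.uuid4()}{time.time()}"
        return hashlib.sha1((input_key + salt).encode("utf-8")).hexdigest()

    print(len(make_guid("report.pdf")))  # prints 40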

3. Implementation of a Prototype

3.1 AWS S3

Amazon's Simple Storage Service (S3) is used in this implementation to store the files. S3 organises storage into buckets of objects, each object holding up to 5 GB of data. Input files are stored in S3; when a job needs to be performed on a particular file, the file is fetched from its bucket, and after processing the result is stored back into a bucket.

3.2 AWS SQS

Amazon's Simple Queue Service (SQS) is used in this implementation to store the job messages. It provides at-least-once message delivery, which can cause a job to be processed in duplicate.

3.3 AWS EC2

Amazon's Elastic Compute Cloud (EC2) is used in this implementation to obtain virtual server instances: Virtual Machine instances of Amazon Machine Images (AMIs) based on a user-generated image bundle.

The prototype components are therefore the Controller, Worker and Co-ordinator, built on AWS S3, AWS SQS and AWS EC2. The Controller interacts with the Workers using AWS SQS: it places messages onto the job queue for the first processing task type and receives messages from the single result queue. See Figure 1 for more details. (A worker-side sketch combining the three services follows this section.)

A note about worker processes: although in theory the separation of workers by task is more extensible and a good example of modularisation, in practice this approach has its downsides, which stem from the characteristics of the AWS platform.
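To tie the three services together, here is a sketch of a prototype worker's main loop under the same assumptions as the earlier sketches: Python with boto3, hypothetical queue and bucket names, the hypothetical Job XML of Section 2.4.1, and the try_claim helper from the Co-ordinator sketch in Section 2.3. Real processing is reduced to a stub.

    import boto3
    import xml.etree.ElementTree as ET

    sqs = boto3.client("sqs")
    s3 = boto3.client("s3")
    JOB_Q = sqs.create_queue(QueueName="jobs-convert")["QueueUrl"]  # hypothetical
    RESULT_Q = sqs.create_queue(QueueName="results")["QueueUrl"]    # hypothetical

    def work_once():
        resp = sqs.receive_message(QueueUrl=JOB_Q, MaxNumberOfMessages=1,
                                   WaitTimeSeconds=20)
        for msg in resp.get("Messages", []):
            pipeline = ET.fromstring(msg["Body"])   # Job XML (Section 2.4.1)
            guid = pipeline.get("guid")
            # try_claim is the Co-ordinator conditional write from Section 2.3;
            # if it fails, another worker already handled this duplicate delivery.
            if try_claim(guid):
                src = pipeline.find("job/input")
                obj = s3.get_object(Bucket=src.get("bucket"), Key=src.get("key"))
                result = obj["Body"].read()          # stub: real task transforms data
                dst = pipeline.find("job/output")
                s3.put_object(Bucket=dst.get("bucket"), Key=dst.get("key"),
                              Body=result)
            sqs.send_message(QueueUrl=RESULT_Q,
                             MessageBody="<job_result><result_status>SUCCESS"
                                         "</result_status><guid>" + guid +
                                         "</guid></job_result>")
            sqs.delete_message(QueueUrl=JOB_Q, ReceiptHandle=msg["ReceiptHandle"])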

4. Evaluation

This project has provided insight into designing and implementing a system using Amazon's cloud computing services. In our system we decided to keep the Job XML as a single document; the flow of this document is described in Section 2.4 (Message format). In systems such as ours, a sequential execution of operations is necessary, each operation depending on the completion of a prior task: the familiar happens-before requirement. The advantage of Job XML with combined tasks is a simplified design. A weakness, however, is the limited message size of SQS, which limits the potential size of a workflow; one solution is to compress the Job XML before sending and decompress it in the worker.

Our experience with the Amazon Web Services infrastructure highlighted the programming techniques needed to compensate for the underlying platform. Due to network latency and at-least-once messaging semantics, we had to program defensively: capturing timing faults, retrying after transient problems, and checking whether a message had already been processed.

Future work

With additional time, the system could be extended to give stronger guarantees for message processing:

- Compression of Job XML messages for faster message propagation.
- Splitting Job XML into smaller sections.
- Replacing the bespoke XML message format with a standard business-process execution language such as BPEL.
- Replacing the Controller database with an AWS database service, to allow sharing with the Co-ordinator.

5. Conclusion

The infrastructure provided by Amazon Web Services opens up opportunities to develop applications such as the distributed file-processing application proposed in this project. Deploying this application would benefit users who require file conversions, and the application also brings business value, the more so given Amazon's affordable charges for its services. The services provided by AWS offer reliability and security to a certain extent, which gives some assurance about files uploaded for conversion.

The main functionality was implemented efficiently. First, the Job XML message contains parent and child jobs, processed in such a way that when a parent job completes it is discarded and its child job is executed. Second, there are two kinds of queue: job queues, and a result queue reporting the results of executed jobs. AWS provides at-least-once message delivery, which can cause duplicate messages to be processed; this report gives a solution that ensures at-most-once processing with the help of a Co-ordinator.

This project has achieved its objectives using cloud computing infrastructure services. Using the Amazon Web Services infrastructure (SQS, S3 and EC2), applications can be created that are suitable for long-running tasks.
