A batch system for HEP applications on a distributed IaaS cloud

I. Gable, A. Agarwal, M. Anderson, P. Armstrong, K. Fransham, D. Harris, C. Leavett-Brown, M. Paterson, D. Penfold-Brown, R.J. Sobie and M. Vliet
Department of Physics and Astronomy, University of Victoria, Victoria, Canada V8W 3P6

A. Charbonneau, R. Impey and W. Podaima
National Research Council Canada, 100 Sussex Drive, Ottawa, Canada

E-mail: igable@uvic.ca

Abstract. The emergence of academic and commercial Infrastructure-as-a-Service (IaaS) clouds is opening access to new resources for the HEP community. In this paper we describe a system we have developed for creating a single dynamic batch environment spanning multiple IaaS clouds of different types (e.g. Nimbus, OpenNebula, Amazon EC2). A HEP user interacting with the system submits a job description file with a pointer to their virtual machine (VM) image. Images can either be created by the users directly or provided to them. We have created a new software component called Cloud Scheduler that detects waiting jobs and boots the required user VM on any one of the available cloud resources. As the user VMs appear, they are attached to the job queues of a central Condor job scheduler, which then submits the jobs to the VMs. The number of VMs available to the user is expanded and contracted dynamically depending on the number of user jobs. We present the motivation and design of the system, with particular emphasis on Cloud Scheduler, and show that it provides the ability to exploit academic and commercial cloud sites in a transparent fashion.

1. Introduction

Grid computing, although successful for many High Energy Physics (HEP) projects, has some well understood disadvantages [1] when compared to traditional cluster High Throughput Computing (HTC). One of the more difficult to overcome is the need to deploy applications to remote resources. Applications written for HEP experiments are complex software systems, usually requiring an application specialist to carry out the deployment. It can often take a significant amount of time to install and configure the software on a new cluster, making it difficult to quickly take advantage of computing resources as they become available.

Virtual machine software such as Xen [2] and KVM [3] can be used to wrap the application code in a complete software environment [1]. Both Xen and KVM have been shown to present insignificant execution overhead for HEP applications [4]. VMs can be copied and deployed to various remote Infrastructure-as-a-Service (IaaS) cloud computing platforms, such as Nimbus [5], OpenNebula [6] and Eucalyptus [7].

However, having efficient VMs and the means to deploy them to remote IaaS sites is not enough to make a usable platform for a typical HEP user with many jobs to run. Most HEP users are already familiar with cluster batch computing, as it has been a standard technology for at least the last decade. However, few of these users have experience constructing a batch system, as this is typically a task handled by a site system administrator. We have implemented a dynamic multi-cloud batch system using a simple software component we developed called Cloud Scheduler [8]. Our goal is to produce a system that requires little more effort from the users than typical batch job submission. In this paper we describe this system and show how it can be used effectively with the BaBar application software [9].

2. System Architecture

Cloud resources are typically provided at discrete sites with no mechanism for bridging multiple sites or submitting to a central resource broker. In order to use multiple sites, requests for VMs must be submitted individually to each site. As the number of sites grows this task becomes increasingly onerous for the users. We solve this problem by building Condor [10] resource pools across multiple clouds. The component which allows us to accomplish this is the aforementioned Cloud Scheduler (for a detailed description of Cloud Scheduler's internal design see [8]). Cloud Scheduler is configured with a simple text file that lists the available cloud resources and their properties, such as available CPU architectures, number of available slots and memory per slot.

In order to use the system, users are provided with a VM configured with the base HEP application software, or a VM with an unmodified installation of an operating system. This serves both typical users and sophisticated users who wish to substantially modify their application software. Users create a valid X.509 grid proxy [13], then submit a Condor job with a set of custom Condor attributes (see Figure 1). The Condor attributes define the type of VM image which their job requires, in addition to other requirements such as memory, scratch space and CPU architecture.

    # Regular Condor Attributes
    Universe   = vanilla
    Executable = script.sh
    Arguments  = one two three
    Log        = script.log
    Output     = script.out
    Error      = script.error
    should_transfer_files   = YES
    when_to_transfer_output = ON_EXIT

    # Cloud Scheduler Attributes
    Requirements = VMType =?= "vm-name"
    +VMLoc      = "http://repo.tld/vm.img.gz"
    +VMAMI      = "ami-dfasfds"
    +VMCPUArch  = "x86"
    +VMCPUCores = "1"
    +VMNetwork  = "private"
    +VMMem      = "2048"
    +VMStorage  = "20"
    Queue

Figure 1. A typical Condor job description file used with Cloud Scheduler. Several custom attributes are required which define the type of VM image needed to run the job.
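To make the matching step concrete, the sketch below shows how a scheduler of this kind might compare the custom attributes of Figure 1 against one entry of the cloud resource configuration described above. It is an illustration only, not the actual Cloud Scheduler code: the attribute names follow Figure 1, while the CloudResource fields and the job_fits function are hypothetical.

    from dataclasses import dataclass

    @dataclass
    class CloudResource:
        # Hypothetical representation of one entry in the Cloud Scheduler
        # resource configuration file (name, interface, capacity limits).
        name: str
        cloud_type: str      # e.g. "Nimbus" or "AmazonEC2"
        cpu_archs: tuple     # architectures offered, e.g. ("x86", "x86_64")
        free_slots: int
        memory_mb: int       # memory available per slot
        storage_gb: int      # scratch space available per slot
        networks: tuple      # e.g. ("private", "public")

    def job_fits(job_ad: dict, res: CloudResource) -> bool:
        """Return True if this cloud could boot a VM satisfying the job's
        custom attributes (VMCPUArch, VMMem, VMStorage, VMNetwork)."""
        return (res.free_slots > 0
                and job_ad.get("VMCPUArch", "x86") in res.cpu_archs
                and int(job_ad.get("VMMem", 0)) <= res.memory_mb
                and int(job_ad.get("VMStorage", 0)) <= res.storage_gb
                and job_ad.get("VMNetwork", "private") in res.networks)

    # Example: the job of Figure 1 checked against a Nimbus-style site.
    job = {"VMType": "vm-name", "VMCPUArch": "x86", "VMCPUCores": "1",
           "VMNetwork": "private", "VMMem": "2048", "VMStorage": "20"}
    site = CloudResource("uvic-hep", "Nimbus", ("x86", "x86_64"),
                         free_slots=40, memory_mb=2048, storage_gb=40,
                         networks=("private",))
    print(job_fits(job, site))   # True: a VM for this job can be booted here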

[Figure 2 is a schematic showing Cloud Scheduler, the Condor Scheduler and the user, with status communication between Cloud Scheduler and four clouds: the UVic HEP cloud (Nimbus, EC2, Other; 40 cores), the UVic science cloud (Nimbus, EC2, Other; 20 cores), the NRC science cloud (Nimbus, EC2, Other; 30 cores) and Amazon EC2 (10 m1.small instances).]

Figure 2. The distributed cloud system used in this work. The number of slots (or cores) shown at each site is the number made available to this project.

Once submitted to Condor, Cloud Scheduler detects the waiting idle jobs in the queue and boots virtual machines on any one of the configured IaaS cloud types, such as Nimbus, Amazon EC2 or Eucalyptus. Prior to booting, the VM is contextualized (modified by the IaaS software) to connect back to the central Condor Scheduler and begin draining jobs. Users initially see a Condor Scheduler with no available resources, and then VMs spring to life that meet the requirements of their jobs. When all jobs are completed the idle VMs are shut down.

In this work we use a system consisting of four clouds, as shown in Figure 2. Both the UVic HEP cloud and the UVic science cloud are located in Victoria, Canada. The NRC cloud is located at the National Research Council in Ottawa, Canada, and Amazon EC2 is located in Virginia. All sites are connected with 1 Gbps or better research network connections, except Amazon EC2, which is reached via a slower and more congested commodity internet connection. Each of the three research cloud sites consists of a Linux cluster with Intel Nehalem Xeon processors and Gigabit Ethernet or bonded Gigabit Ethernet connections. These clusters are typical of those purchased for HEP HTC use. The three research sites run Nimbus and are accessed via the Nimbus IaaS API; resources provided by Amazon are accessed via its REST API. Each Nimbus site uses X.509 grid proxies for authentication, which allows HEP users to reuse their existing grid credentials; grid credentials are also a well trusted means of authentication at other research sites. Amazon EC2 uses its own access and secret keys for authentication.
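The VM lifecycle described above (boot on demand, contextualize to join the central Condor pool, retire when idle) can be summarized as a polling loop. The sketch below is a minimal illustration under assumed helpers, not the real Cloud Scheduler implementation: idle_jobs, boot_vm, shutdown_vm and contextualize are hypothetical stand-ins for the Condor queue query and the Nimbus or EC2 calls.

    def schedule_cycle(clouds, idle_jobs, boot_vm, shutdown_vm,
                       contextualize, running_vms):
        # Boot a VM for each idle job that some configured cloud can satisfy.
        for job in idle_jobs():
            for cloud in clouds:
                if cloud.can_run(job):
                    ctx = contextualize(job)   # e.g. address of the central Condor scheduler
                    running_vms.append(boot_vm(cloud, job, ctx))
                    break
        # Retire VMs that no longer have matching jobs to run.
        for vm in list(running_vms):
            if vm.is_idle():
                shutdown_vm(vm)
                running_vms.remove(vm)
        return running_vms

    # The scheduler would repeat this cycle on a fixed polling interval, so
    # the number of VMs grows and shrinks with the number of queued user jobs.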

3. Results

We present an analysis run comprising 255 jobs submitted by a BaBar user who is a member of this project. The submitted jobs run a user analysis over three BaBar data sets: Tau1N-data (total size 1158 GB), Tau1N-MC (total size 615 GB) and Tau11-MC (total size 1386 GB).

[Figure 3 consists of two RRDtool time-series plots spanning Jan 08-12h to Jan 10-00h: (a) "Cloud Jobs", the number of running and queued jobs (0-250); (b) "Cloud Virtual Machines", the number of running, starting and error-state VMs (0-100).]

Figure 3. (a) The state of the 255 jobs executed on the cloud system over a two day period in January 2010. (b) The VMs automatically instantiated to meet the needs of those jobs over the same period. The blue curve shows the number of VMs being booted, the black curve the running VMs, and the red curve VMs in an error state. The number of VMs being booted shows a short spike for VMs booted on the NRC and Amazon EC2 clouds (where the VM images are stored locally). The VM images required at the two UVic clouds had to be transferred from Ottawa to Victoria, which took 4-5 hours on the research network. The dip in the number of running jobs is explained in the accompanying text.
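As a rough cross-check of the 4-5 hour start-up delay quoted in the Figure 3 caption, the short calculation below estimates the time needed to stage the VM images for the two Victoria clouds over the WAN. The image size (16 GB) and the sustained transfer rate (about 400 Mbps) are taken from the discussion that follows; the assumption that each of the roughly 60 UVic slots pulls its own copy of the image is ours.

    # Back-of-envelope estimate of the VM image staging time to Victoria.
    images    = 60     # assumed one image transfer per UVic slot (40 + 20 cores)
    size_gb   = 16     # size of each VM image, from the text
    rate_gbps = 0.4    # sustained WAN rate to Victoria, from Figure 4

    total_gbit = images * size_gb * 8         # total volume in gigabits
    hours = total_gbit / rate_gbps / 3600.0   # seconds at that rate, converted to hours
    print(f"{hours:.1f} hours")               # about 5.3 hours, consistent with the 4-5 h observed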

Tau1N-data and Tau1N-MC have event sizes of 4 KB, while Tau11-MC has an event size of 3 KB. All the data is stored on a Lustre filesystem hosted at UVic and is accessed by the VMs through the Xrootd [12] protocol. Prior to the submission the user prepared a Xen VM image containing their BaBar application and made it available on an HTTP server at NRC in Ottawa. This server functions as the central virtual machine image repository, which can be accessed by any one of the research clouds shown in Figure 2. In addition, the image must be uploaded to Amazon EC2, as Amazon requires that images pre-exist on its cloud. Each image is roughly 16 GB in size. At present there is no user authentication on the central repository; however, we have an image repository under development which can use X.509 proxies to authenticate over HTTPS.

Figure 3(a) shows the job submission occurring at 13:00 on Jan 8, with 255 jobs entering the batch queue. Initially no jobs are able to run because no suitable resources have yet been instantiated in the cloud. Cloud Scheduler detects the waiting jobs and begins to instantiate the VMs necessary to meet the requirements of the jobs, as listed in the Condor job description files (see Figure 1). We initially see VMs rapidly starting on both the NRC cloud and EC2, as shown in Figure 3(b). There is a delay of approximately 5 hours before we see VMs booting on the two clouds at the University of Victoria. Because each 16 GB image is transferred over the WAN from Ottawa to Victoria, we see a substantial penalty in the boot-up times on the Victoria-based clouds. To avoid this penalty the images could have been preloaded onto those clouds; however, this would require extra work on the part of the user and knowledge of the clouds in question. In the future we intend to address this problem by implementing a caching system such that identical images will only be transferred over the WAN once.

At roughly 20:00 on Jan 8 we see the full distributed cloud reaching its capacity of 100 VMs. After two hours of stable running, two of the physical worker nodes on the UVic HEP cloud experienced a known, intermittently occurring Xen networking bug which caused the VMs on those nodes to drop their network connection to the central Condor Scheduler. Once this occurs those machines drop out of the available Condor resources and their jobs are listed as evicted, in Condor terms. Cloud Scheduler immediately detects that these VMs are booted but not connected to the job queues. It then terminates those VMs via the cluster's IaaS interface and attempts to instantiate new VMs to replace them.
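The recovery behaviour just described amounts to reconciling the set of VMs Cloud Scheduler has booted with the set of workers actually registered in the central Condor pool. The sketch below illustrates that idea only; registered_workers, terminate and replace are hypothetical stand-ins for the Condor collector query and the site's IaaS calls, and the grace period is an assumed value.

    GRACE_SECONDS = 900   # assumed time allowed for a freshly booted VM to register

    def reconcile(booted_vms, registered_workers, terminate, replace, now):
        # Retire VMs that booted but never connected to the Condor pool
        # (e.g. after the Xen networking failure described above) and
        # request replacements so the queued jobs can still run.
        registered = set(registered_workers())    # worker names known to Condor
        for vm in list(booted_vms):
            if vm.hostname in registered:
                continue                           # healthy: attached to the pool
            if now - vm.boot_time < GRACE_SECONDS:
                continue                           # still booting: give it time
            terminate(vm)                          # destroy via the cluster IaaS interface
            booted_vms.remove(vm)
            replace(vm)                            # boot a fresh VM of the same type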

The cluster soon recovers and returns to full capacity, and Condor resubmits the jobs. In this fashion the user is insulated from instabilities in the cloud system; this is particularly important in the case of these new technologies. Near 15:00 on Jan 9 we see that there are fewer than 100 running jobs remaining, and Cloud Scheduler begins to shut down the unused resources as the jobs finish. As the job queue drains, the number of running VMs is reduced in step.

The total network I/O usage of the distributed cloud can be seen in Figure 4. Early in the run the demands on the image repository are substantial, with the half-hour average network I/O peaking at 2.2 Gbps when all three research clouds are pulling from the image repository simultaneously. After the NRC cloud, which is co-located with the image repository, finishes pulling its VM images, the I/O drops to 400 Mbps as the images are transferred across the WAN to Victoria. Amazon EC2 has no impact on the image repository I/O, as the EC2 image is already stored at the Amazon site. The Xrootd I/O for data access by the VMs running across all sites peaks at a modest 400 Mbps.

[Figure 4 is an RRDtool time-series plot, "Cloud IO", of network traffic in bits per second (0-2.5 Gbps) from Jan 08-12h to Jan 10-00h, showing the Xrootd I/O and the image repository I/O.]

Figure 4. The total network I/O of the distributed cloud system. The blue line shows the I/O out of the image repository; the black line shows the data I/O from the jobs in the cloud VMs.

4. Conclusion

We have constructed a multi-site distributed cloud using Condor and a new component called Cloud Scheduler. This distributed cloud was able to analyze roughly 5 TB of BaBar data using 100 VMs booted across four cloud sites, with no work required from the user beyond job submission and VM image preparation. Furthermore, we have shown that the system is resilient to failures in individual cloud resources, a feature essential to any system running on new IaaS technology.

Acknowledgment

The support of CANARIE, the Natural Sciences and Engineering Research Council, the National Research Council of Canada and Amazon is acknowledged.

References

[1] Agarwal A, Charbonneau A, Desmarais R, Enge R, Gable I, Grundy D, Penfold-Brown D, Seuster R, Sobie R and Vanderster D C 2008 Deploying HEP applications using Xen and Globus Virtual Workspaces Proc. of Computing in High Energy and Nuclear Physics 2007, J. Phys.: Conf. Ser. 119 062002 doi:10.1088/1742-6596/119/6/062002
[2] Barham P, Dragovic B, Fraser K, Hand S, Harris T, Ho A, Neugebauer R, Pratt I and Warfield A 2003 Xen and the art of virtualization Proc. of the 19th ACM Symp. on Operating Systems Principles (Bolton Landing, NY, USA) pp 164-177
[3] The Kernel Virtual Machine (KVM) http://www.linux-kvm.org/
[4] Alef M and Gable I 2009 HEP specific benchmarks of virtual machines on multi-core CPU architectures J. Phys.: Conf. Ser. 219 052015 doi:10.1088/1742-6596/219/5/052015
[5] Nimbus Project http://nimbusproject.org/
[6] OpenNebula Project http://www.opennebula.org/
[7] Eucalyptus Software http://www.eucalyptus.com/
[8] Armstrong P, Agarwal A, Bishop A, Charbonneau A, Desmarais R, Fransham K, Hill N, Gable I, Gaudet S, Goliath S, Impey R, Leavett-Brown C, Ouellete J, Paterson M, Pritchet C, Penfold-Brown D, Podaima W, Schade D and Sobie R J 2010 Cloud Scheduler: a resource manager for a distributed compute cloud Preprint arXiv:1007.0050v1 [cs.DC]
[9] Aubert B et al. [BaBar Collaboration] 2002 The BaBar detector Nucl. Instrum. Meth. A 479 1 arXiv:hep-ex/0105044
[10] Thain D, Tannenbaum T and Livny M 2005 Distributed computing in practice: the Condor experience Concurrency and Computation: Practice and Experience 17 323
[11] The Scientific Linux Distribution http://www.scientificlinux.org/
[12] Dorigo A, Elmer P, Furano F and Hanushevsky A 2005 XROOTD/TXNetFile: a highly scalable architecture for data access in the ROOT environment Proc. of the 4th WSEAS International Conference on Telecommunications and Informatics (TELE-INFO '05)
[13] Tuecke S, Welch V, Engert D, Pearlman L and Thompson M 2004 Internet X.509 public key infrastructure (PKI) proxy certificate profile IETF RFC 3820, June 2004, http://www.ietf.org/rfc/rfc3820.txt