Image Distribution Mechanisms in Large Scale Cloud Providers



2nd IEEE International Conference on Cloud Computing Technology and Science

Romain Wartel, Tony Cass, Belmiro Moreira, Ewan Roche, Manuel Guijarro, Sebastien Goasguen, Senior Member, IEEE, and Ulrich Schwickerath

R. Wartel, T. Cass, B. Moreira, E. Roche, M. Guijarro and U. Schwickerath are with the European Organization for Nuclear Research (CERN), Geneva, Switzerland (e-mail: romain.wartel@cern.ch). S. Goasguen is with Clemson University, School of Computing, SC, USA (e-mail: sebgoa@clemson.edu).

Abstract -- This paper presents mechanisms for distributing virtual machine images within a large batch farm and between sites that offer cloud computing services. The work is presented in the context of the Large Hadron Collider Computing Grid (LCG) and has two main goals. First, it presents the CERN-specific mechanisms put in place to test the pre-staging of virtual machine images within a large cloud infrastructure of several hundred physical hosts. Second, it introduces the basis of a policy for trusting and distributing virtual machine images between sites of the LCG. Finally, experimental results are shown for the distribution of a 10 GB virtual machine image to over 400 physical nodes using a binary tree algorithm and BitTorrent. The results show that images can be pre-staged within 30 minutes.

Index Terms -- virtualization, image, cloud, trust, grid

I. INTRODUCTION

During the last year, CERN has started to design and set up an internal cloud infrastructure called lxcloud [1], with the aim of providing an internal IaaS service to the batch processing service managers. Lxcloud has the potential to evolve into a public cloud [2]. While such an infrastructure presents many challenges, a crucial one is to make it scale to a large number of nodes, typically up to several thousand hypervisors. A key component of such a scalable infrastructure is the image distribution system, which is described in this paper.

Managing a large batch resource on the grid is extremely challenging [3]. Virtualization has the potential to ease the management of such large resources by offering more flexibility in the provisioning of machines, the mixing of operating systems and the customization of worker nodes for particular users [4]. The separation of concerns provided by the virtualization layer empowers the system administrator while offering users what they want [5]. To that end, several virtualization management tools have been created. Among them, Nimbus [6], Eucalyptus [7] and OpenNebula [2] emerged from academic research, while ISF [8] and the VMware tools emerged from the corporate world. These tools present virtualized infrastructures with cloud computing [9] characteristics. However, image distribution is not directly addressed and often relies on a shared file system to make virtual machine images available on all hypervisors. This paper deals with mechanisms that fill this gap in existing provisioning systems and presents the implementation of efficient image distribution mechanisms as well as policies to trust these images.

The main contribution of this paper is twofold. First, it presents two mechanisms to distribute virtual machine (VM) images to a set of hypervisors: an scp-based system using a binary tree, which offers a logarithmic speedup compared to a sequential model, and a BitTorrent setup whose use for image distribution is quite novel.
Second, a policy for the trust and transfer of virtual machine images between sites is presented. This work, still in progress, is seen as a key component to enable the use of virtual machines across the LCG. The policy could also be adopted by other grid infrastructures such as EGI, TeraGrid and the Open Science Grid.

The paper is organized as follows. Section II summarizes [1], where the cloud infrastructure implemented at CERN was first introduced. Local image production and storage in a trusted repository are described in Section III, while Section IV presents two mechanisms to distribute images stored in a repository to a local infrastructure, namely a set of hypervisors. Section V presents the key concepts currently driving a policy to share images among organizations. Finally, conclusions are drawn in Section VI.

II. LXCLOUD INFRASTRUCTURE

At the lowest level, a provider of Infrastructure as a Service is implemented as a server farm capable of instantiating virtual machines on demand. It must therefore consist of:

1) Servers running a hypervisor of choice (e.g. Xen, KVM, VMware ESX).
2) A virtual machine provisioning system, capable of placing virtual machine instances on the hypervisors (e.g. OpenNebula [2], Nimbus [6]).
3) An image repository, which holds the images used for instantiation, together with mechanisms to distribute the images to the hypervisors. This is the main subject of this paper and is described in detail in Sections III and IV.
4) A set of processes for networking, configuration and scheduling of the instances.

The following paragraphs explain these components in more detail. Figure 1 shows how they are integrated within lxcloud. The diagram also shows two use cases: the public cloud use case, where users request the instantiation of an image, and the private cloud use case, where the service managers use the cloud to create a batch farm accessed via standard grid computing job submission mechanisms.

Figure 1. Lxcloud infrastructure diagram. A user can either use the cloud computing interface to request a new instance (public cloud mode) or use the standard grid computing job submission mechanisms. When a job is submitted, lxcloud automatically starts a new VM instance that joins the batch system and runs the job; the service managers of the CERN batch system therefore represent the users of the internal cloud (private cloud mode). A provisioning system is used (e.g. OpenNebula [2] or ISF [8]) and an image repository provides an index of all virtual machines that can be instantiated. The golden nodes, which represent virtual machines produced locally, are directly tied to the data center management system, while the virtual machine instances are ephemeral and are destroyed automatically once the batch jobs finish.

A. Farm of hypervisors

A key aspect of lxcloud is the setup of a farm of hypervisors. Xen [10] is considered a robust and well known hypervisor, currently supported by Red Hat, and has been the de facto hypervisor in most virtualization products. A newer hypervisor technology is the Kernel-based Virtual Machine (KVM [11]). Developed by Qumranet, it is a kernel module that makes use of the hardware virtualization extensions of recent processors and presents guests as user space processes. KVM is becoming more mature and Red Hat is starting to support it as of RHEL6. Additional hypervisors are available, such as VMware ESX, Microsoft Hyper-V and VirtualBox; for our large Linux resources, however, Xen and KVM are the only viable solutions in terms of cost and performance. In lxcloud, Quattor [12] templates have been created to set up physical servers as Xen or KVM hypervisors.

B. The virtual machine provisioning system

Images managed in an image repository are used to instantiate the VMs placed on the hypervisors by the provisioning system. The provisioning system takes an image and decides where to create an instance of that image. The process of selecting a hypervisor to start an instance is called virtual machine placement. It is a scheduling decision based on available resources or on scheduling heuristics such as "stack them", which puts as many instances per hypervisor as possible, or "spread them", which does the opposite. In lxcloud, both OpenNebula and ISF are being investigated as possible provisioning systems. In OpenNebula, requests for VM instantiation are made either via a cloud computing interface, via a command line client or via an XML-RPC server. These mechanisms offer great flexibility for integration with other data center mechanisms. ISF offers a versatile web client. Both systems support the Xen and KVM hypervisors. Details and performance of the two systems will be presented in a separate publication.
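To illustrate the two placement heuristics, the short Python sketch below is not taken from OpenNebula, ISF or lxcloud; it only mimics, with a hypothetical Hypervisor record, how a "stack them" policy keeps filling one host while a "spread them" policy balances instances across hosts.

    from dataclasses import dataclass

    @dataclass
    class Hypervisor:
        name: str
        capacity: int      # maximum number of VM instances, e.g. one per core
        running: int = 0   # instances currently placed on this host

        def free(self):
            return self.capacity - self.running

    def place(hypervisors, policy="stack"):
        # Return the hypervisor that should receive the next VM instance.
        candidates = [h for h in hypervisors if h.free() > 0]
        if not candidates:
            raise RuntimeError("no free slot on any hypervisor")
        if policy == "stack":
            # "stack them": keep filling the host with the fewest free slots,
            # so as few hypervisors as possible are partially used.
            return min(candidates, key=lambda h: h.free())
        # "spread them": pick the host with the most free slots.
        return max(candidates, key=lambda h: h.free())

    farm = [Hypervisor("hv001", 8), Hypervisor("hv002", 8, running=3)]
    target = place(farm, policy="spread")
    target.running += 1
    print("next instance goes to", target.name)

A real scheduler would of course weigh CPU, memory and image availability rather than a single slot count; the sketch only shows where the two heuristics diverge.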
C. Networking

Networking of the guests presents several challenges. Various network configurations are possible depending on the combination of constraints set by site policies and application requirements. The first is the possible need for public IP addresses. The second is the type of network on the hypervisor, namely the use of bridged networking or NAT. The third is the management of the set of IPs allocated to the guests and the need to keep track of the network topology. Site policies for networking, security and auditing, as well as the usage model, dictate which combination is used. At one end of the spectrum is the use of public IP addresses with a very strict topology where the guests can only be started on specific hypervisors (e.g. the current CERN setup). At the other end of the spectrum is a fully NATed approach where each guest is networked behind its own NAT (e.g. the Clemson University setup [13]). An intermediate configuration is the use of private IP addresses and a bridged network on each hypervisor.

D. Contextualization

A key aspect of the image lifecycle is contextualization [6]. Contextualization refers to the self-configuration of the guest when it is provisioned on a hypervisor. While an image can be contextualized ahead of instantiation (so-called image contextualization), instance contextualization is an automated process. Simple contextualization tasks are, for example, the networking setup and the startup of any services running within the guests. The contextualized services can be self-contained services (e.g. cron jobs) or services needing access to resources in the physical infrastructure (e.g. the batch system or a network file system). In the case of lxcloud, contextualization of the guests involves joining the local LSF cluster so that the guests can start running batch jobs. Contextualization mechanisms should be a feature of the provisioning system used. In most cases, this is currently achieved by attaching a contextualization disk to the guest to pass data and scripts. Both OpenNebula and ISF offer a contextualization mechanism, which has been used to configure the network of the guests, join the LSF cluster, turn the AFS service on or off, and set the lifetime of the guest.
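As an illustration of instance contextualization, the following Python sketch shows how a guest could parse simple KEY=VALUE pairs from a mounted contextualization disk and act on them. The mount point, file name, keys and service name are hypothetical; the actual lxcloud, OpenNebula and ISF contextualization contracts differ in their details.

    import subprocess
    from pathlib import Path

    # Hypothetical mount point and file name for the contextualization disk.
    CONTEXT_FILE = Path("/mnt/context/context.sh")

    def read_context(path=CONTEXT_FILE):
        # Parse simple KEY=VALUE lines passed in on the contextualization disk.
        context = {}
        for line in path.read_text().splitlines():
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, value = line.split("=", 1)
                context[key.strip()] = value.strip().strip('"')
        return context

    def contextualize(ctx):
        # Apply a few illustrative settings: batch cluster, AFS, lifetime.
        if "LSF_MASTER" in ctx:
            # The actual join would be delegated to site-specific LSF tooling.
            print("joining the LSF cluster managed by", ctx["LSF_MASTER"])
        if ctx.get("AFS_ENABLED", "no") == "yes":
            # Service name is illustrative; check=False keeps the sketch harmless.
            subprocess.run(["service", "afs", "start"], check=False)
        if "VM_LIFETIME_HOURS" in ctx:
            print("guest will stop accepting jobs after",
                  ctx["VM_LIFETIME_HOURS"], "hours")

    contextualize(read_context())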

E. Autonomic provisioning

Given the order of magnitude increase in worker nodes (one physical server with eight cores can provide eight VM instances), a key requirement for a large cloud is that it manages itself. This self-management property covers autonomic capabilities such as self-configuration, self-discovery and self-provisioning. The automated contextualization described in the previous section is a good example of self-configuration. In lxcloud, self-discovery refers to the ability of the hypervisors to advertise themselves to the provisioning system, and self-provisioning refers to the ability of the system to create instances on its own, based on policies that dictate the actions performed by the provisioning system. Finally, a requirement of lxcloud is for the VM instances to have a built-in limited lifetime and to terminate themselves after the last job they run has completed. This limited-lifetime requirement stems from the fact that the virtual worker nodes are expected to be dynamic hosts not managed by the data center management system, and would therefore get out of date if they were not automatically shut down and replaced.

III. LOCAL IMAGE PRODUCTION

A. Image Production

Images can only be as secure as their source. CERN already has a large batch farm, a sophisticated system to manage a large number of machines [12], and policies and procedures to deploy security and other software updates in a timely manner. The natural choice is therefore to hook into this system and use it as the source for creating virtual machine images. For this reason, the concept of so-called golden nodes has been created. These are statically allocated, permanently running virtual machines which are centrally managed. They differ from the corresponding physical machines only by the software extensions necessary for virtual machine management; such packages can include local customizations for contextualization, or driver updates to increase performance. When a critical update is received, the golden nodes are updated along with all other machines in the computer center. Once this is done, a snapshot is taken, any remaining confidential information, such as the encrypted root password, is removed, and the resulting image is moved to the local image repository for distribution to the hypervisors.

Using a snapshot of a golden node solves the problem of managing tens of thousands of virtual machines in addition to the thousands of physical machines already in the data center. The VM instances are seen as transient: they die as soon as they are no longer needed and are not managed by the data center utilities. On the hypervisors, the instances are started from snapshots of the golden node images that are pre-staged using the mechanisms described in this paper.

B. Image Repository

The image repository is a central index of virtual machine images. It is populated by images produced locally or by images produced by third parties, endorsed and trusted according to the policy described in Section V. One of the challenges of maintaining a large number of hypervisors, possibly running different virtual images, is to ensure that a coherent set of images is maintained. In addition, each virtual image may be revised, for example to integrate new security patches. As a result, each hypervisor needs to be able to identify and obtain the latest version of a given image, as well as remove deprecated images.

One way to achieve this consists of maintaining an index of all the currently supported virtual images. For example, a central server can be configured to provide access to a list of all the images currently supported. Such a list provides a view of the different images available, along with a set of metadata describing each virtual image, including its content, purpose, version, the location of the actual virtual image file, and the relevant digital signatures. The resulting list can then be signed by an entity trusted by all the hypervisors. Each hypervisor is then provided with the identity of the trusted signatory, along with a pointer to the virtual image index. By querying the central index on a regular basis, every hypervisor obtains a clear and trusted view of the VM images that are currently supported; in particular, the hypervisors do not instantiate VM images that are no longer included in the index. The index is fetched by the hypervisors on a regular basis and verified by checking its digital signature. At CERN, such a solution has been implemented: the index is provided as a YAML-formatted list, served over HTTPS from an Apache server.
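The sketch below is an assumption for illustration, not the actual CERN implementation: it shows how a hypervisor could periodically fetch such a YAML index over HTTPS, check a detached GPG signature against a locally configured keyring, and read the image metadata. The URL, the signature scheme and the field names are illustrative only.

    import subprocess
    import tempfile
    import urllib.request

    import yaml  # PyYAML

    # Hypothetical endpoints; the actual CERN index location and signing
    # arrangements are site specific.
    INDEX_URL = "https://imagerepo.example.org/images.yaml"
    SIGNATURE_URL = INDEX_URL + ".sig"

    def fetch(url):
        with urllib.request.urlopen(url) as response:
            return response.read()

    def load_trusted_index():
        # Fetch the image index, verify its detached signature, return the entries.
        index_bytes = fetch(INDEX_URL)
        signature = fetch(SIGNATURE_URL)
        with tempfile.NamedTemporaryFile() as data, tempfile.NamedTemporaryFile() as sig:
            data.write(index_bytes)
            data.flush()
            sig.write(signature)
            sig.flush()
            # Relies on a local keyring that already holds the trusted signer's key.
            result = subprocess.run(["gpg", "--verify", sig.name, data.name])
            if result.returncode != 0:
                raise RuntimeError("image index signature verification failed")
        # Assumed structure: a list of entries with name, version, location, checksum.
        return yaml.safe_load(index_bytes)

    for image in load_trusted_index():
        print(image["name"], image["version"], image["location"])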
IV. DISTRIBUTING IMAGES LOCALLY

A. Environment

Once the hypervisors have obtained the list of currently supported VM images from the index, they need to remove outdated virtual images and fetch the new or updated ones. Distributing large files (from two to ten gigabytes) simultaneously across a large number of hypervisors has clear implications for the performance of the infrastructure and of the network. If a sequential file transfer is used, distributing the image to hundreds of nodes takes days. Alternatively, a shared file system can be used to make simultaneous copies from the shared file system onto the local disks of the hypervisors, or parallel file transfer mechanisms can be used to speed up the transfer without a shared file system. Such transfers, however, could use all the available network bandwidth and cripple the running virtual machines and their associated network operations. It is important to ensure that the hypervisors can all obtain the new VM images as fast as possible, so that the instantiation of new machines can resume, while maintaining reasonable network performance to address the needs of the virtual images currently running.
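A minimal sketch of the resulting housekeeping step on a hypervisor is shown below: given the entries of the trusted index, it computes which image files still need to be fetched (for example via scp-wave or BitTorrent, as described next) and which locally cached files are deprecated and can be removed. The cache directory, the 'filename' field and the example file names are assumptions for illustration.

    from pathlib import Path

    # Hypothetical local image cache on a hypervisor; the real path differs.
    IMAGE_DIR = Path("/var/lib/lxcloud/images")

    def sync_plan(index_entries, image_dir=IMAGE_DIR):
        # index_entries: entries from the trusted index; 'filename' is an
        # assumed field name. Returns (to_fetch, to_delete) so that only
        # currently supported images remain available for instantiation.
        wanted = {entry["filename"] for entry in index_entries}
        present = {p.name for p in image_dir.glob("*.img")} if image_dir.is_dir() else set()
        return sorted(wanted - present), sorted(present - wanted)

    index = [{"filename": "slc5-batch-v12.img"}, {"filename": "slc5-batch-v13.img"}]
    to_fetch, to_delete = sync_plan(index)
    print("fetch via scp-wave or BitTorrent:", to_fetch)
    print("remove deprecated images:", to_delete)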

B. Downloading Infrastructure

The approach consisting of downloading all the images from a single source was quickly eliminated, as it would constitute a significant performance bottleneck, become a single point of failure, and overload a specific part of the network infrastructure. Instead, efforts have been directed towards distributed downloads. Two technologies have been evaluated to support a more distributed approach: one is based on scp-wave (described in Section IV-C), an SCP-based algorithm designed to use completed target destinations as sources for subsequent transfers, and the other is based on the BitTorrent protocol.

C. Binary tree based distribution

In order to pre-stage an image as fast as possible, it is necessary to transfer it in parallel from the image repository to all the hosts. Most provisioning systems implement a naive sequential file transfer method, which drastically slows down the provisioning of virtual machines. To speed up the process, a low-cost parallel transfer has been implemented based on a binary tree structure (more specifically a Fibonacci tree), which offers a logarithmic speedup. The resulting tool, scp-wave, has been implemented in Python and is available for download at [14]. The parent node of the tree initiates the first transfer, much like a straightforward sequential transfer. Once the first transfer is complete, the node that received the image becomes a parent itself. The transfer is slow at first, but the number of parallel transfers doubles as the tree is traversed towards its leaves, so the overall transfer time grows only logarithmically with the number of hosts. Simulation results have been presented independently in [15].
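The following Python sketch is not the scp-wave code itself (which is available at [14]); it is a minimal illustration of the wave idea, under the assumption of password-less SSH between the hosts: every host that has finished receiving the image immediately becomes an additional source, so the number of concurrent scp transfers roughly doubles with each wave.

    import subprocess
    import time
    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical image path; assumes the source host can authenticate to the target.
    IMAGE = "/var/lib/lxcloud/images/slc5-batch-v13.img"

    def copy(source, target, path=IMAGE):
        # One scp hop from a host that already holds the image to one that does not.
        subprocess.run(["scp", f"{source}:{path}", f"{target}:{path}"], check=True)

    def wave_transfer(seed, targets, max_parallel=200):
        # Tree-style wave: every host that finishes receiving the image
        # immediately becomes an additional source for the remaining hosts.
        sources, pending, in_flight = [seed], list(targets), {}
        with ThreadPoolExecutor(max_workers=max_parallel) as pool:
            while pending or in_flight:
                while pending and sources:
                    src, dst = sources.pop(), pending.pop(0)
                    in_flight[pool.submit(copy, src, dst)] = (src, dst)
                for future in [f for f in in_flight if f.done()]:
                    src, dst = in_flight.pop(future)
                    future.result()                # raises if the scp hop failed
                    sources.extend([src, dst])     # both hosts can now seed others
                time.sleep(0.1)

    wave_transfer("imagerepo", [f"hv{i:03d}" for i in range(1, 463)])

The max_parallel cap mirrors the limit of 200 open sockets mentioned in the benchmarks below; error handling and retries, which the real tool needs, are omitted.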
D. BitTorrent

BitTorrent is a widely adopted protocol for sharing files across wide area networks. It enables a load balanced, distributed and resilient file sharing infrastructure to be implemented. However, it requires some tuning to achieve good performance in a high bandwidth, low latency, highly interconnected environment. An additional, topological, challenge is that it may not be acceptable to add unwanted peers to the network or to become part of the public BitTorrent network. This has implications for the configuration of the tracker, but also for the use of Distributed Hash Tables (DHT).

1) Implementation: At CERN, it has been difficult to find a suitable BitTorrent command line client for Linux; after several common implementations were reviewed, rtorrent was selected. The client configuration has been modified to control and limit the memory usage of the downloads, PEX [16] has been enabled, and the number of connected peers per hypervisor has been restricted to allow a maximum of five parallel data streams. It was initially planned to deploy a trackerless infrastructure based on DHT, but bootstrapping the DHT network proved problematic. In the final implementation, a traditional approach based on a central tracker, complemented by DHT, has been chosen. The tracker, based on OpenTracker, also runs an rtorrent client and acts as the initial seeder.

2) Benchmarks: Benchmarks were conducted to compare the relative performance of BitTorrent and scp-wave by transferring virtual image files of different sizes to a large number of hypervisors. Figure 2 shows the results of the scp-wave benchmark for the transfer of three image files of different sizes. The expected staircase-like behavior is clearly visible: every time a transfer completes, two transfers start, so the size of each step doubles. In the implementation, the maximum number of open sockets was set to 200, capping the size of the largest step. Figure 3 shows the results of the BitTorrent benchmark for the same image files as in the scp-wave tests. BitTorrent exhibits a smoother, continuous transfer since all peers are involved. For clarity, the result for a 10 GB image file transfer is shown in Figure 4. BitTorrent transfers the image to 462 hypervisors in 23 minutes, while scp-wave transfers it in 50 minutes. BitTorrent thus appears to offer better performance and a slightly better success rate for transferring virtual images. The design of the BitTorrent protocol enables the client to deal efficiently with temporary host or network unavailability and leads to a more homogeneous bandwidth consumption over time. Moreover, on hypervisors with scarce network bandwidth, a BitTorrent client can be tuned to limit the amount of bandwidth used for file transfers.

Figure 2. Performance of scp-wave for three different image files (4 GB, 8 GB and 10 GB). Of interest is the staircase-like behavior of scp-wave, typical of a logarithmic binary tree based system: at each step the height of the step doubles, until the maximum number of parallel transfers is reached.

Figure 3. Performance of the BitTorrent transfer for the same three image files (4 GB, 8 GB and 10 GB). All nodes get involved in the transfer faster than with scp-wave, maximizing the bandwidth usage.

Figure 4. Performance benchmark of BitTorrent vs. scp-wave for a 10 GB image file transferred to 462 physical nodes. Scp-wave transfers the file in 50 minutes while BitTorrent transfers it in 23 minutes. While BitTorrent is clearly faster, the ease of use and ease of implementation of scp-wave may be of interest to some grid sites.
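As a back-of-the-envelope check, not reported in this form in the paper, the aggregate data rates implied by the 10 GB benchmark can be computed directly from the numbers above, ignoring protocol overhead:

    # Rough aggregate throughput implied by the 10 GB benchmark
    # (assumes every node receives the full file; overhead is ignored).
    image_gb = 10
    nodes = 462
    total_gb = image_gb * nodes            # about 4620 GB delivered in total

    for tool, minutes in [("BitTorrent", 23), ("scp-wave", 50)]:
        gb_per_s = total_gb / (minutes * 60)
        print(f"{tool}: ~{gb_per_s:.1f} GB/s aggregate (~{gb_per_s * 8:.0f} Gbit/s)")

This gives roughly 3.3 GB/s aggregate for BitTorrent and 1.5 GB/s for scp-wave, which illustrates why throttling the per-client bandwidth matters for the machines still running production workloads.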

V. SHARING IMAGES IN THE COMMUNITY

Sites participating in the LCG project are distributed all over the world. They are independent of each other and often support, besides the LHC experiments, additional local communities. All of them are subject to different local laws and site policies, which makes sharing images between these sites a challenge in itself.

A. The HEPiX virtualization working group

The HEPiX [17] community includes a number of High Energy Physics institutes and aims at fostering a strong collaboration at the technical level between its participants. Virtualization is a topic of interest for HEPiX, and a dedicated working group has been established to propose a common approach to sharing virtual machine images in the community. The objective is to produce virtual images that can easily and securely be used at different sites in the community. This raises a number of challenges, not only on the technical side but also with regard to trust. In order to address the trust issues, the HEPiX virtualization working group is collaborating with the Joint Security Policy Group [18] to establish a common security policy [19]. The following paragraphs include some of the concepts described in the draft policy as well as the ongoing discussions in the working group. However, at this stage no formal decision has been made in HEPiX, and the following represents solely the view of the authors.

B. Trust Issues

1) Image Production: Virtual image production involves a number of responsibilities, in particular with regard to security. It is, for instance, essential to produce virtual images that are fully patched, with an appropriate degree of access control and the relevant security configuration and tools in place. In addition, the produced virtual images must be designed to be world-readable.

Figure 5. Several scenarios of image trust. Two endorsers, from Org 1 and Org 2, have each published a VMIC. A site has approved those VMICs as well as its local one, published by a local endorser. This populates a list of virtual machines that can be used at the site. The local image distribution system can be hooked onto the local list of approved VM images to automatically pre-stage them.

2) Responsibilities: The virtual images produced must satisfy a number of security requirements before they can be used in the community. However, the current policy does not place any responsibility on the actual image producer. Instead, the policy defines the role of the endorser as "an individual who confirms that a particular VM complete image has been produced according to the requirements of this policy and states that the image can be trusted" [19]. The policy then defines a number of technical and procedural criteria that must be verified by the endorser before a virtual image can be trusted. This allows a significant degree of flexibility in choosing the virtual image producers; in particular, it enables the use of virtual images produced by third parties over which HEPiX has no authority, as long as the resulting images can be vetted by an HEPiX-appointed endorser.
3) Incident Handling: While significant efforts are dedicated to preventing attackers from gaining unauthorized access to the virtual machines, it is essential to retain the ability to handle security incidents and to conduct in-depth forensics on affected virtual machines.

This means that a sufficient degree of traceability must be available, as well as the ability to run standard forensics tools on the affected disk image. Both the metadata and the technologies used by HEPiX have been chosen and designed to fulfil this requirement.

C. The Virtual Machine Image Catalogue (VMIC)

A given endorser can maintain a set of valid virtual machine images serving different purposes. This set is kept up to date with security patches and updated on a regular basis. The set of endorsed virtual images is made available to other sites via a Virtual Machine Image Catalogue (VMIC); each endorser publishes and maintains a VMIC. A site may decide to trust an endorser and all the virtual images in its VMIC. If an endorser is trusted by a particular group or community, all the virtual images from this endorser should indeed be trusted. However, not all of them may be suitable to run at a particular site, and sites may want to retain fine grained control over the images they decide to run. As a result, from a site's point of view, each virtual machine image must not only be endorsed but also approved to run at the site.

D. A Sample VMIC Implementation

1) Metadata: A VMIC consists essentially of metadata about the virtual images it contains. Typically, it should contain the actual virtual image location as well as information on its endorser and content, including its identifier, version, checksum, date of endorsement, architecture, supported hypervisor and purpose. In HEPiX, a sample implementation has been prepared based on a simple Django application. Django models provide the metadata about the endorsers and their VMICs, and simple views can be prepared to publish the VMIC and make it available to other sites. Different endorsers can be managed, so that virtual images from different endorsers can be added.
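As an illustration of such metadata, a possible set of Django models in the spirit of the sample implementation is sketched below; the field names, lengths and the site-approval flag are assumptions, not the actual HEPiX or CERN schema.

    # Illustrative Django models for a VMIC-like catalogue; field names are
    # assumptions rather than the actual HEPiX/CERN schema.
    from django.db import models

    class Endorser(models.Model):
        name = models.CharField(max_length=200)
        organisation = models.CharField(max_length=200)
        certificate_dn = models.CharField(max_length=400)  # identity used to endorse

    class VirtualMachineImage(models.Model):
        endorser = models.ForeignKey(Endorser, on_delete=models.CASCADE)
        identifier = models.CharField(max_length=200)
        version = models.CharField(max_length=50)
        checksum = models.CharField(max_length=128)         # e.g. a SHA-256 digest
        location = models.URLField()                        # where the image file lives
        hypervisor = models.CharField(max_length=50)        # e.g. "kvm" or "xen"
        architecture = models.CharField(max_length=20)      # e.g. "x86_64"
        endorsed_on = models.DateTimeField()
        purpose = models.TextField(blank=True)
        approved_for_site = models.BooleanField(default=False)  # site-level approval

A simple Django view over such models could then serve the approved subset to the local image index described in Section III.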
2) Connection with Local Distribution: A VMIC is a description of the virtual images it contains, but it also needs to be connected to the local virtual image distribution system. At CERN, an operator can browse the list of existing virtual images in the CERN VMIC via a Django web interface, which contains virtual images produced at CERN (or possibly at other sites), and can mark specific virtual machines as approved or not approved to run at CERN. Then, using the CERN-specific Django view, the operator can trigger an automatic update of the central index of VM images stored in the repository and presented earlier in Section III. Shortly thereafter, all the BitTorrent clients download any newly approved virtual images. An additional set of CERN-specific metadata has also been implemented in the VMIC itself to match the needs of the local teams.

VI. CONCLUSIONS

This short paper introduces the basis of a common security policy to trust virtual machines in a large collaboration of cloud providers. This new policy could be adopted by the LCG to enable its grid sites to share images and offer more flexibility to its users via a cloud computing model. The paper also presents two mechanisms used inside the CERN data center to pre-stage images on its new cloud resource, termed lxcloud. While a binary tree solution based on scp offers a great improvement compared to a naive sequential transfer, a P2P setup based on BitTorrent has been shown to provide the best performance. Early results show that a 10 GB image file can be pre-staged on approximately 500 hypervisors in around 20 minutes. This staging time is well within the currently foreseen requirements of lxcloud.

REFERENCES

[1] T. Cass, S. Goasguen, B. Moreira, E. Roche, U. Schwickerath, and R. Wartel, "CERN's virtual batch farm," in Proc. of the Second Cloud Computing International Conference. Porto, Portugal: EuroCloud Portugal Association, May 2010, pp. 21-32.
[2] B. Sotomayor, R. Montero, I. Llorente, and I. Foster, "Virtual infrastructure management in private and hybrid clouds," IEEE Internet Computing, vol. 13, no. 5, pp. 14-22, 2009.
[3] T. Roblitz, F. Schintke, A. Reinefeld, O. Barring, M. Barroso Lopez, G. Cancio, S. Chapeland, K. Chouikh, L. Cons, P. Poznanski et al., "Autonomic management of large clusters and their integration into the grid," Journal of Grid Computing, vol. 2, no. 3, pp. 247-260, 2004.
[4] M. A. Murphy, L. Abraham, M. Fenn, and S. Goasguen, "Autonomic clouds on the grid," Journal of Grid Computing, vol. 8, no. 1, pp. 1-18, March 2010.
[5] R. J. Figueiredo, P. A. Dinda, and J. A. B. Fortes, "A case for grid computing on virtual machines," in 23rd International Conference on Distributed Computing Systems, 2003.
[6] P. Marshall, K. Keahey, and T. Freeman, "Elastic Site: Using clouds to elastically extend site resources," in 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing. IEEE, 2010, pp. 43-52.
[7] D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff, and D. Zagorodnov, "The Eucalyptus open-source cloud-computing system," in Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid. IEEE Computer Society, 2009, pp. 124-131.
[8] Platform Computing, ISF. [Online]. Available: http://www.platform.com/private-cloud-computing/private-cloudplatform-isf
[9] L. Vaquero, L. Rodero-Merino, J. Caceres, and M. Lindner, "A break in the clouds: towards a cloud definition," ACM SIGCOMM Computer Communication Review, vol. 39, no. 1, pp. 50-55, 2008.
[10] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, "Xen and the art of virtualization," in Nineteenth ACM Symposium on Operating Systems Principles, 2003.
[11] Red Hat, Kernel-based Virtual Machine. [Online]. Available: http://www.linux-kvm.org
[12] R. García Leiva, M. Barroso López, G. Cancio Meliá, B. Chardi Marco, L. Cons, P. Poznański, A. Washbrook, E. Ferro, and A. Holt, "Quattor: Tools and techniques for the configuration, installation and management of large-scale grid computing fabrics," Journal of Grid Computing, vol. 2, no. 4, pp. 313-322, 2004.
[13] L. Stout, M. Murphy, and S. Goasguen, "Kestrel: an XMPP-based framework for many task computing applications," in Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers. ACM, 2009, pp. 1-6.
[14] B. Bauer and S. Goasguen, scp-wave, 2010. [Online]. Available: http://code.google.com/p/scp-wave
[15] M. Schmidt, N. Fallenbeck, M. Smith, and B. Freisleben, "Efficient distribution of virtual machines for cloud computing," in 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing. IEEE, 2010, pp. 567-574.
[16] Peer Exchange (PEX). [Online]. Available: http://en.wikipedia.org/wiki/peer_exchange
[17] HEPiX. [Online]. Available: http://www.hepix.org
[18] Joint Security Policy Group (JSPG). [Online]. Available: http://www.jspg.org
[19] JSPG policy on trusted virtual machines. [Online]. Available: http://www.jspg.org/wiki/policy_trusted_virtual_machines