Scientific Cloud Computing: Early Definition and Experience




The 10th IEEE International Conference on High Performance Computing and Communications
978-0-7695-3352-0/08 $25.00 © 2008 IEEE, DOI 10.1109/HPCC.2008.38

Lizhe Wang, Jie Tao, Marcel Kunze
Institute for Scientific Computing, Research Center Karlsruhe
Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany

Alvaro Canales Castellanos, David Kramer, Wolfgang Karl
Department of Computer Science, University Karlsruhe (TH)
76128 Karlsruhe, Germany

Abstract

Cloud computing is emerging as a new computing paradigm that aims to provide reliable, customized and QoS-guaranteed dynamic computing environments for end users. This paper reviews recent advances in Cloud computing, identifies the concepts and characteristics of scientific Clouds, and presents an example of a scientific Cloud for data centers.

1. Introduction

Cloud computing has been a hot topic since late 2007 because it offers flexible, dynamic IT infrastructures, QoS-guaranteed computing environments and configurable software services. As reported in Google Trends (Figure 1), Cloud computing (blue line), which is enabled by virtualization technology (yellow line), has outpaced Grid computing [7] (red line).

Figure 1. Cloud computing in Google trends

Numerous projects from industry and academia have been proposed, for example the RESERVOIR project [23] (a joint research initiative of IBM and the European Union for Cloud computing), Amazon Elastic Compute Cloud [16], IBM's Blue Cloud [14], scientific Cloud projects such as Nimbus [17] and Stratus [26], and OpenNEbula [19].

There is still no widely accepted definition of Cloud computing, although Cloud computing practice has attracted much attention. Several reasons have led to this situation:

- Cloud computing involves researchers and engineers from various backgrounds, e.g., Grid computing, software engineering and data storage. They work on Cloud computing from different viewpoints.
- The technologies that enable Cloud computing, for example Web 2.0 and SOA, are still evolving.
- Existing computing Clouds still lack large-scale deployment and usage, which would ultimately justify the concept of Cloud computing.

In this paper, we try to give an early definition of Cloud computing based on recent advances in academia and industry as well as on our own experience. The paper also introduces Cumulus, a proof-of-concept computing Cloud that is deployed at our site.

The paper is organized as follows. Section 2 introduces current Cloud computing projects. Section 3 defines the concept of Cloud computing. The Cumulus project, our experience with Cloud computing, is presented in Section 4. Section 5 concludes the paper.

2. Recent advances of Cloud computing

This section discusses several projects that are currently devoted to Cloud computing.

2.1. Globus virtual workspace service and Nimbus

A virtual workspace [10, 9] is an abstraction of an execution environment that can be made dynamically available to authorized clients by using well-defined protocols. The abstraction captures the resource quota assigned to such an execution environment at deployment (such as CPU or memory) as well as software configuration aspects of the environment (such as operating system installation or provided services). The workspace service allows a Globus Toolkit client to dynamically deploy and manage workspaces.

The virtual workspace services consist of the following interfaces [10, 9]:

- The Workspace Factory Service has one operation, create, which takes two required parameters: workspace metadata and a deployment request for that metadata.
- Once created, a workspace is represented as a WSRF resource and can be inspected and managed through operations of the Workspace Service.
- The Group Service allows an authorized client to manage a group of workspaces as a whole.
- The Status Service offers the interface through which a client can query the usage data that the service has collected about it.

Based on the Globus virtual workspace services, a cloudkit named Nimbus [17] has been developed to build scientific Clouds. With the Nimbus client, users can:

- browse virtual machine images inside the cloud,
- submit their own virtual machine images to the cloud,
- deploy virtual machines,
- query virtual machine status, and finally
- access the virtual machines.

2.2. OpenNEbula

OpenNEbula (formerly GridHypervisor) is a virtual infrastructure engine that enables the dynamic deployment and re-allocation of virtual machines in a pool of physical resources. OpenNEbula extends the benefits of virtualization platforms from a single physical resource to a pool of resources, decoupling the server not only from the physical infrastructure but also from the physical location [19].

OpenNEbula contains one frontend and multiple backends. The frontend provides users with access interfaces and management functions.
The backends are installed on Xen servers, where Xen hypervisors run and host the virtual machines. The frontend and the backends communicate over SSH. OpenNEbula gives users a single access point for deploying virtual machines on a locally distributed infrastructure.

Figure 2. OpenNebula architecture

2.3. Amazon Elastic Compute Cloud

Amazon Elastic Compute Cloud (EC2) [16] is a Web service that provides resizable compute capacity in the cloud. It is designed to make Web-scale computing easier for developers. Amazon EC2's simple Web service interface allows users to obtain and configure capacity with minimal friction. It provides users with complete control of computing resources and lets them use Amazon's proven computing environment. Amazon EC2 reduces the time required to obtain and boot new server instances to minutes, allowing users to quickly scale capacity, both up and down, as computing requirements change. Amazon EC2 changes the economics of computing by allowing users to pay only for the capacity that they actually use.

With Amazon EC2, users can:

- create an Amazon Machine Image (AMI) containing the applications, libraries, data and associated configuration settings, or use Amazon's pre-configured, templated images to get up and running immediately;
- upload the AMI into Amazon Simple Storage Service (S3). Amazon EC2 provides tools that make storing the AMI simple, and Amazon S3 provides a safe, reliable and fast repository for users' images;
- use the Amazon EC2 Web service to configure security and network access;
- choose the type of instance they want to run;
- start, shut down and monitor as many instances of their AMI as needed, using the Web service APIs;
- pay for the CPU time and bandwidth that they actually consume.
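The pay-per-use model described above can be illustrated with a small calculation. The following Python sketch is only an illustration: the `Usage` type, the `usage_charge` function and both rates are hypothetical and are not taken from Amazon's API or price list. It shows that the charge depends only on metered instance hours and transferred data, so scaling down reduces cost immediately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """Metered usage for one billing period (hypothetical units)."""
    instance_hours: float  # total hours during which instances were running
    gb_transferred: float  # total network traffic in gigabytes


def usage_charge(usage: Usage, rate_per_hour: float, rate_per_gb: float) -> float:
    """Charge only for what was consumed; capacity that is not running costs nothing."""
    return usage.instance_hours * rate_per_hour + usage.gb_transferred * rate_per_gb


# Made-up rates for illustration only; these are not Amazon's prices.
RATE_PER_HOUR = 0.10
RATE_PER_GB = 0.20

# Three instances for 8 hours each, then scaled down to one instance for 16 hours.
burst = Usage(instance_hours=3 * 8 + 1 * 16, gb_transferred=5.0)
print(round(usage_charge(burst, RATE_PER_HOUR, RATE_PER_GB), 2))
```

Under these assumed rates, keeping all three instances running for the full 24 hours would be billed for 72 instance hours instead of 40, which is the economic argument for scaling capacity down as soon as demand drops.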

2.4. Discussion

We have studied the solutions for network configuration, data management and virtual machine infrastructure deployment inside the cloud.

Nimbus and the Globus virtual workspace service provide three network configurations:

- public mode picks a public IP address from a pool for the virtual machine,
- private mode picks a private IP address from a pool for the virtual machine, and
- advisory mode gives the virtual machine a static IP address.

These solutions, however, do not cover some user scenarios. For example, a data center might employ a central DHCP service, which allocates dynamic IP addresses for all virtual machines.

The Globus virtual workspace service in addition needs to contact all the backends of the local infrastructure. Sometimes a computer center employs a local virtualization management system, such as VMware Infrastructure [29], to manage local hosting resources. In our view, it would pay off to let the Globus virtual workspace service talk to such a local management system. The same scenario arises when the Globus Toolkit works together with a local resource scheduler such as OpenPBS [20] or Condor [12].

OpenNEbula employs NIS (Network Information System) to manage a common user system and NFS (Network File System) for virtual machine image management. However, it is widely recognized that NIS has a major security flaw: it leaves the users' password file accessible to anyone in the entire network. To employ OpenNEbula in a professional way, it is better to combine it with modern infrastructure solutions, e.g., LDAP [11] and Oracle Cluster File System [21].

3. Cloud computing: definition, characterization and enabling technologies

3.1. Functionalities

Computing Clouds provide users with services to access hardware, software and data resources. Furthermore, configurable integrated platforms can be offered to users:

- HaaS: Hardware as a Service
  The term Hardware as a Service was coined possibly in 2006. As a result of rapid advances in hardware virtualization, IT automation, and usage metering and pricing, users can buy IT hardware, or even an entire data center or computer center, as a pay-as-you-go subscription service. HaaS can be flexible, scalable and manageable to meet the user's needs [2].

- SaaS: Software as a Service
  Software or an application is hosted as a service and provided to customers across the Internet. This mode eliminates the need to install and run the application on the customer's local computer. SaaS therefore relieves the customer of the burden of software maintenance and reduces the expense of software purchases through on-demand pricing.

- DaaS: Data as a Service
  Data in various formats and from various sources can be accessed via services by users on the network. Users can, for example, manipulate remote data as if it were on a local disk, or access data in a semantic way on the Internet.

Based on the support of HaaS, SaaS and DaaS, Cloud computing can in turn deliver Platform as a Service (PaaS) to users. Users can thus subscribe on demand to a computing platform by specifying requirements for hardware configuration, software installation and data access. Figure 3 shows the relationship between these services.

Figure 3. Cloud functionalities (HaaS, SaaS, DaaS and PaaS)

3.2. Key features

Cloud computing distinguishes itself from other computing paradigms, such as Grid computing [7], Global computing [6] and Internet computing [13], in the following aspects:

User-centric interfaces
Cloud services can be accessed through user-centric interfaces, which means:

- The Cloud interfaces do not force users to change their working habits, e.g., programming language, compiler or operating system.
- The Cloud client that must be installed locally is lightweight; for example, the Nimbus cloudkit client is around 15 MB in size.
- Cloud interfaces are location independent and can be accessed through well-established interfaces such as Web services and Internet browsers.
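The platform subscription described in Section 3.1, in which a user states hardware, software and data requirements in one request, can be pictured as a small data structure. The Python sketch below is only an illustration under assumed names: `HardwareSpec`, `PlatformRequest`, `services_used` and the example package and data set names are hypothetical and do not come from any Cloud toolkit discussed in this paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class HardwareSpec:
    """HaaS part of a request: the virtual hardware to lease."""
    cpus: int
    memory_mb: int


@dataclass(frozen=True)
class PlatformRequest:
    """Hypothetical PaaS subscription combining the three service types."""
    hardware: HardwareSpec               # HaaS: hardware configuration
    software: Tuple[str, ...] = ()       # SaaS: software to have installed
    data_sources: Tuple[str, ...] = ()   # DaaS: logical names of data sets


def services_used(request: PlatformRequest) -> List[str]:
    """List which underlying service types a platform request draws on."""
    used = ["HaaS"]  # every platform needs hardware
    if request.software:
        used.append("SaaS")
    if request.data_sources:
        used.append("DaaS")
    return used


request = PlatformRequest(
    hardware=HardwareSpec(cpus=4, memory_mb=8192),
    software=("gcc", "openmpi"),
    data_sources=("climate-run-2008",),
)
print(services_used(request))
```

Modelling the request as one immutable value mirrors the idea that the platform is subscribed to as a whole, while the three fields keep the HaaS, SaaS and DaaS contributions separate.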

On-demand service provision
Computing Clouds provide resources and services to users on demand. Users can customize the required computing environments afterwards, for example through software installation and network configuration, as users normally have root privileges.

QoS guaranteed offer
The computing environments provided by computing Clouds can guarantee QoS for users, e.g., hardware performance such as CPU bandwidth and memory size.

Autonomous system
The computing Cloud is an autonomous system that is managed transparently to users. Hardware, software and data inside Clouds can be automatically reconfigured, orchestrated and consolidated into a single platform image, which is finally rendered to users.

3.3. Enabling technologies

Many enabling technologies contribute to Cloud computing. Here we identify several state-of-the-art techniques:

Virtualization
Virtualization technologies multiplex hardware and thus provide flexible and scalable platforms. Virtual machine techniques, such as VMware [29] and Xen [1], offer virtualized IT infrastructures on demand. Virtual network advances, such as VPN [5], provide users with a customized network environment for accessing Cloud resources.

Serviceflow and workflow orchestration
Computing Clouds offer a complete set of service images on demand, which can be composed from services inside the Cloud. A Cloud should be able to automatically orchestrate services from different sources and of different types to form a serviceflow or workflow for users.

Web service and SOA
Cloud services are normally exposed as Web services, which follow industry standards such as WSDL [28], SOAP [24] and UDDI [18]. The organization and orchestration of services inside Clouds can be managed in a Service Oriented Architecture (SOA). A set of Cloud services can furthermore be used in an SOA, which makes them available on various distributed platforms and accessible across networks.

Web 2.0 and Mashup
Web 2.0 describes the trend in the use of World Wide Web technology and Web design to enhance creativity, information sharing and, most notably, collaboration among users. These concepts have led to the development and evolution of Web-based communities and hosted services [4]. A Mashup is a Web application that combines data from more than one source into a single integrated storage tool [3]. SmugMug [25] is an example of a Mashup: it is a digital photo sharing website that allows the upload of an unlimited number of photos for all account types, provides a published API with which programmers can create new functionality, and supports XML-based RSS and Atom feeds.

Figure 4. Cumulus architecture (access point with Globus Virtual Workspace Service and OpenNEbula frontend; Xen hypervisor hosts reached via SSH; Oracle File System; virtual and physical network domains)

World-wide distributed storage system
A Cloud storage model should foresee:

- A network storage system, which is backed by distributed storage providers (e.g., data centers) and offers storage capacity for users to lease. The stored data can be migrated, merged and managed transparently to end users, whatever the data format. Examples are Google File System [8] and Amazon Elastic Storage [16].

- A distributed data system, which provides data sources that can be accessed in a semantic way. Users can locate data sources in a large distributed environment by logical name instead of physical location. The Virtual Data System (VDS) [27] is a good reference.

4. Cumulus: a scientific Cloud for data center

Cumulus is an ongoing Cloud computing project at our site. We design Cumulus in a layered architecture (see also Figure 4):

- The Globus virtual workspace service resides on the access point of Cumulus and accepts users' requests for virtual machine operations.
- OpenNEbula works as the Local Infrastructure Virtualization Manager (LIVM). The OpenNEbula frontend runs on the Cumulus access point and receives messages from the Globus virtual workspace service.
- The OpenNEbula frontend communicates with its backends and the Xen hypervisors on the hosts via SSH for virtual machine manipulation.
- Virtual machines stay in a separate network domain. The network solution for virtual machines can be one of the following:
  - Virtual machines start with a Xen virtual network interface and are configured like physical machines. For example, virtual machines might obtain dynamic IP addresses from a central DHCP service in the network domain.
  - Virtual network technologies are used for the virtual machine network space. For example, VNET [15] ties virtual machines together efficiently and makes them appear to users.

To reach production quality, we build the local infrastructure using IBM BladeCenter as the backend and Oracle VM Server [22] as the operating system. All hosts and virtual machines are backed by the Oracle Cluster File System [21]. Virtual machine images and templates are stored in an Oracle File System, which can be mounted by all hosts. Application-level software is also stored on the Oracle data server, so virtual machines can automatically mount the software installation packages required by users. The results so far look promising.

5. Conclusion

This paper reviews the recent advances in Cloud computing and presents our early definition of Cloud computing, its interfaces and its features. We also discuss our experience of building a scientific Cloud for a data center.

References

[1] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. L. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield. Xen and the art of virtualization. In Proceedings of the 19th ACM Symposium on Operating Systems Principles, pages 164-177, New York, U.S.A., Oct. 2003.
[2] Here comes HaaS [URL]. http://www.roughtype.com/archives/2006/03/here_comes_haas.php
[3] Web 2.0 definition [URL]. http://en.wikipedia.org/wiki/mashup_(web_application_hybrid)/
[4] Web 2.0 definition [URL]. http://en.wikipedia.org/wiki/web_2/, access on June 2008.
[5] B. Gleeson et al. A framework for IP based virtual private networks. RFC 2764, The Internet Engineering Task Force, Feb. 2000.
[6] G. Fedak, C. Germain, V. Néri, and F. Cappello. XtremWeb: A generic global computing system. In Proceedings of the 1st IEEE International Symposium on Cluster Computing and the Grid, pages 582-587, 2001.
[7] I. Foster and C. Kesselman. The grid: blueprint for a new computing infrastructure. Morgan Kaufmann, 1998.
[8] S. Ghemawat, H. Gobioff, and S. Leung. The Google file system. In Proceedings of the 19th ACM Symposium on Operating Systems Principles, pages 29-43, 2003.
[9] K. Keahey, K. Doering, and I. Foster. From sandbox to playground: dynamic virtual environments in the grid. In Proceedings of the 5th International Workshop on Grid Computing, pages 34-42, 2004.
[10] K. Keahey, I. Foster, T. Freeman, and X. Zhang. Virtual workspaces: achieving quality of service and quality of life in the grid. Scientific Programming, 13(4):265-275, 2005.
[11] V. A. Koutsonikola and A. Vakali. LDAP: Framework, practices, and trends. IEEE Internet Computing, 8(5):66-72, 2004.
[12] M. Litzkow, M. Livny, and M. W. Mutka. Condor - a hunter of idle workstations. In Proceedings of the 8th International Conference on Distributed Computing Systems, pages 104-111, 1988.

[13] M. Milenkovic, S. H. Robinson, R. C. Knauerhase, D. Barkai, S. Garg, V. Tewari, T. A. Anderson, and M. Bowman. Toward internet distributed computing. IEEE Computer, 36(5):38-46, 2003.
[14] IBM Blue Cloud project [URL]. http://www-03.ibm.com/press/us/en/pressrelease/22613.wss/, access on June 2008.
[15] A. I. Sundararaj and P. A. Dinda. Towards virtual networks for virtual machine grid computing. In Proceedings of the 3rd Virtual Machine Research and Technology Symposium, pages 177-190, 2004.
[16] Amazon Elastic Compute Cloud [URL]. http://aws.amazon.com/ec2, access on Nov. 2007.
[17] Nimbus Project [URL]. http://workspace.globus.org/clouds/nimbus.html/
[18] OASIS UDDI Specification [URL]. http://www.oasis-open.org/committees/uddi-spec/doc/tcspecs.htm, access on June 2008.
[19] OpenNEbula Project [URL]. http://www.opennebula.org/, access on Apr. 2008.
[20] OpenPBS [URL]. http://www.pbsgridworks.com/, access on Nov. 2007.
[21] Oracle Cluster File System [URL]. http://oss.oracle.com/projects/ocfs/, access on June 2008.
[22] Oracle Virtual Machine [URL]. http://www.oracle.com/technologies/virtualization/index.html/
[23] Reservoir Project [URL]. http://www-03.ibm.com/press/us/en/pressrelease/23448.wss/
[24] Simple Object Access Protocol (SOAP) [URL]. http://www.w3.org/tr/soap/, access on Nov. 2007.
[25] SmugMug [URL]. http://www.smugmug.com/, access on June 2008.
[26] Stratus Project [URL]. http://www.acis.ufl.edu/vws/
[27] Virtual Data System [URL]. http://vds.uchicago.edu/, access on Nov. 2007.
[28] Web Service Description Language (WSDL) [URL]. http://www.w3.org/tr/wsdl/, access on Nov. 2007.
[29] VMware virtualization technology [URL]. http://www.vmware.com, access on Nov. 2007.