A STUDY ON OPEN SOURCE CLOUD COMPUTING PLATFORMS

PROF. ANITA S. PILLAI*; PROF. L.S. SWASTHIMATHI**
*Faculty, Prin. L. N. Welingkar Institute of Management Development & Research, Bengaluru, India.
**Faculty, S.I.E.S. College of Management Studies, Navi Mumbai, India.

ABSTRACT

Cloud computing is the convergence and evolution of several concepts from virtualization, distributed application design, grid, and enterprise IT management to enable a more flexible approach for deploying and scaling applications. Cloud computing, as a utility, has the potential to transform the IT industry by providing software as a service and by changing the way IT hardware is designed and purchased. Several cloud computing platforms have been launched in the last few years. With the advent of several open source cloud computing platforms, each guaranteeing performance and uptime, it is not easy for non-expert users to choose among them without understanding the characteristics and advantages of each platform. This paper analyses the characteristics, architecture and applications of some of the most popular open source cloud computing platforms, namely EUCALYPTUS, OpenStack, Nimbus and OpenNebula, to help novice users select among the different cloud platforms.

KEYWORDS: Open Source Cloud platforms, EUCALYPTUS, OpenStack, Nimbus, OpenNebula.

INTRODUCTION

In 1969, Leonard Kleinrock, one of the chief scientists of the original Advanced Research Projects Agency Network (ARPANET) project, which seeded the Internet, said: "As of now, computer networks are still in their infancy, but as they grow up and become sophisticated, we will probably see the spread of computer utilities which, like present electric and telephone utilities, will service individual homes and offices across the country." Kleinrock's vision of providing computing as a utility service is now a reality thanks to cloud computing.
Users of cloud computing services need not invest in complex infrastructure. Consumers access services based on their requirements, without regard to where the services are hosted, and pay providers only for the resources they use, which considerably brings down information technology costs. This model is referred to as utility computing or cloud computing. Cloud computing is a contemporary technology developed after years of research in distributed computing, virtualization, utility computing, and software services. Cloud computing is the key for businesses to create
efficient and scalable automated applications, while reducing technology overheads for users and providing cost-effective solutions. In general, cloud computing customers do not own the physical infrastructure. The customer avoids capital expenditure on hardware, software, and services by renting usage from a cloud computing provider. Cloud computing offers three service models: IaaS, PaaS and SaaS.

Cloud application services, or "Software as a Service (SaaS)", deliver software over the Internet to be used by customers on demand, eliminating the need to install and run the application on the customer's own computers and thus simplifying maintenance and support. SaaS providers enable customers to access applications remotely through the web, with activities managed from central locations rather than at each customer's site.

Cloud platform services, or "Platform as a Service (PaaS)", provide developers with a platform on which applications are developed and deployed as a service over the web, without the complexity of buying and managing the underlying infrastructure. Cloud computing platforms possess characteristics of both clusters and grids, with support for virtualization, dynamic provisioning of web services, and strong support for creating third-party value-added services. A platform consists of infrastructure software and typically includes a database, middleware and development tools.

Cloud infrastructure services, or "Infrastructure as a Service (IaaS)", deliver computer hardware (servers, storage, network) and the associated software environment (operating systems, file systems, virtualization) as a service. Clients buy these resources as fully outsourced services instead of purchasing servers, software, data centre space or network equipment.

In terms of deployment, cloud computing platforms are of three kinds: public cloud, private cloud and hybrid cloud.
In a public cloud, or external cloud, computing resources are dynamically provisioned on a service basis over the Internet, via web applications or web services, from an off-site third-party provider that shares resources and bills on a utility computing basis. A private cloud, or internal cloud, is controlled and managed completely by the enterprise. To obtain the benefits of both the public and the private approach, newer execution models combine public and private clouds into a hybrid cloud, in which each cloud remains independent but the clouds are joined by standard techniques so that data and applications are portable between them.

OBJECTIVE OF STUDY

With the increase in the number of cloud platforms available, both commercial and open source, enterprises have a wide choice. Open source projects provide an important alternative for organizations that do not wish to use a commercially provided cloud. However, since there are many open cloud platforms, each with its own characteristics and advantages, it is not easy to choose between them. The objective is to study and compare some of the most popular open cloud platforms. Since the most popular open source platforms are EUCALYPTUS, OpenStack, Nimbus and OpenNebula, an analysis and comparison of these platforms will help users to
choose an open cloud platform depending on their requirements. The study also aims to identify challenges common to the platforms and areas for improvement.

RESEARCH METHODOLOGY

To address the concerns enumerated in the study objectives, a deductive research approach relying on secondary data was used. This paper is based on data from:
- Open source cloud computing websites and their online documentation
- Research articles
- Reference books
- Published theses

OPEN CLOUD PLATFORMS

EUCALYPTUS

EUCALYPTUS (Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems) has been developed as an open-source software infrastructure for implementing cloud computing on clusters. It originated as a research project in the Computer Science Department at the University of California, Santa Barbara, where its authors were studying the use of open source to create new, highly scalable, high-performance distributed computing environments. EUCALYPTUS 2.0 is a Linux-based software architecture that implements scalable, efficiency-enhancing private and hybrid clouds within an organization's IT infrastructure. EUCALYPTUS provides computational and storage infrastructure for academic research groups and a platform that is modular and open to experimentation. The system allows users to start, control, access, and terminate entire virtual machines using an emulation of Amazon EC2's SOAP and Query interfaces. One striking feature of EUCALYPTUS is its choice of the Amazon AWS APIs as the API it supports. The current interface to EUCALYPTUS is compatible with Amazon's EC2 interface, works with the EC2 tools directly, and replicates the Simple Storage Service (S3). EUCALYPTUS implements a distributed storage system called Walrus, which is designed to imitate Amazon's S3 distributed storage. The infrastructure is designed to support multiple client-side interfaces.
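The EC2 Query emulation described above can be illustrated with a short sketch. The Python code below builds a signed DescribeInstances request URL in the style of Amazon's Signature Version 2 query signing. The host name, credentials, port 8773, the /services/Eucalyptus path and the API version string are assumptions chosen for illustration and are not taken from this study; the values for a real installation should be read from its own documentation. The sketch only constructs the URL and sends no request.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_ec2_query(host, path, params, secret_key, method="GET"):
    """Return the canonical query string with a Signature Version 2 signature appended."""
    # Sort parameters by name and percent-encode names and values (RFC 3986 style).
    canonical = "&".join(
        f"{quote(name, safe='-_.~')}={quote(str(value), safe='-_.~')}"
        for name, value in sorted(params.items())
    )
    # The string to sign is: method, lowercase host, path and canonical query.
    string_to_sign = "\n".join([method, host.lower(), path, canonical])
    digest = hmac.new(
        secret_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"{canonical}&Signature={quote(signature, safe='-_.~')}"


def describe_instances_url(host, access_key, secret_key, timestamp,
                           path="/services/Eucalyptus", port=8773):
    """Build a signed DescribeInstances URL for an EC2-compatible endpoint.

    The default path, port and API version are illustrative assumptions.
    """
    params = {
        "Action": "DescribeInstances",
        "AWSAccessKeyId": access_key,
        "SignatureMethod": "HmacSHA256",
        "SignatureVersion": "2",
        "Timestamp": timestamp,
        "Version": "2010-08-31",  # assumed API version string
    }
    query = sign_ec2_query(f"{host}:{port}", path, params, secret_key)
    return f"http://{host}:{port}{path}?{query}"


if __name__ == "__main__":
    # Hypothetical host and credentials, for illustration only.
    print(describe_instances_url(
        "cloud.example.org", "EXAMPLEKEY", "example-secret", "2011-01-01T00:00:00Z"
    ))
```

Because the request format matches the one used by the commercial service, the same client code can in principle be pointed at either cloud by changing only the endpoint and credentials, which is the practical benefit of the API compatibility discussed in this section.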
EUCALYPTUS is implemented using commonly available Linux tools and basic web-service technologies, making it easy to install and maintain.

NIMBUS

Nimbus is an open-source toolkit focused on providing an Infrastructure-as-a-Service (IaaS) cloud to its clients via WSRF-based or Amazon EC2 WSDL web service APIs. The Nimbus project explicitly advertises itself as a science cloud solution; however, Nimbus has also supported many applications outside scientific research. Nimbus v2.9 is highly customizable. Nimbus supports the Xen hypervisor and the virtual machine schedulers PBS and SGE. It allows deployment of self-configured virtual clusters via contextualization. It is configurable with respect to scheduling, networking leases, and usage accounting. Nimbus provides a complementary tool, Cumulus, an implementation of a quota-based storage cloud designed for scalability, which allows providers to configure multiple storage cloud implementations. Nimbus offers scaling tools that allow users to scale automatically across multiple distributed providers. These "sky computing" tools operate in a multi-cloud environment combining private and public cloud capabilities. Nimbus allows developers to extend and customize IaaS by providing an open source implementation: the Workspace Service can be configured to support different virtualization implementations, resource management options and interfaces. Nimbus provides most of its customization to the administrator rather than to the user, and has several components. These components include the image storage, previously GridFTP and now Cumulus. Nimbus Platform tools include cloudinit.d and the Context Broker. cloudinit.d is a tool for launching, controlling, and monitoring cloud applications; it automates the creation of virtual machines, their contextualization, and the messaging between VMs. The Context Broker is a service that allows clients to coordinate large virtual cluster launches automatically and repeatably.

OPENSTACK

OpenStack, launched in July 2010, is an initiative of Rackspace Hosting and NASA.
OpenStack is designed to create freely available code, standards, and common ground for the benefit of both cloud providers and cloud customers. The goal of OpenStack 2.0 is to allow organizations to create and offer cloud computing capabilities using open source software running on standard hardware. The project comprises compute, storage and image service components. OpenStack Compute is open source software designed to provision and manage large networks of virtual machines, creating a redundant and scalable cloud computing platform. It provides the software, control panels, and APIs required to orchestrate a cloud, including running instances, managing networks, and controlling access through users and projects. OpenStack Storage is software for creating redundant, scalable object storage using clusters of commodity servers to store terabytes or even petabytes of data. OpenStack Image Service (code-named Glance) provides discovery, registration, and delivery services for virtual disk images. A multi-format image registry, the OpenStack Image Service allows uploads of private and public images in a variety of formats, including VHD, VDI and QEMU. Service providers, companies that run private clouds, and institutions with physical hardware can use OpenStack for large-scale cloud deployments. All of the code for OpenStack is freely available under the Apache 2.0 license. OpenStack aims at virtualization portability: users will be able to move between virtualization technologies, including those hosted in the cloud, and migrate seamlessly; this includes VMs running in VMware, Xen, Hyper-V and KVM. Once in the cloud, they will be able to move unencumbered across clouds, public and private: Amazon, Rackspace, Eucalyptus,
Ubuntu Enterprise Cloud and others. Adoption of a widespread virtualization standard like the Open Virtualization Format (OVF) has helped OpenStack.

OPENNEBULA

OpenNebula is a fully open-source toolkit for building any type of infrastructure-based cloud (private, public or hybrid). OpenNebula is platform agnostic with broad hypervisor support, allowing organizations to leverage their existing IT infrastructure. The cloud provides infrastructure users with an elastic platform for fast delivery and scalability of services to meet the dynamic demands of end users. It allows the user to host services dynamically in VMs, and enables monitoring and control through interfaces such as a command-line interface, an XML-RPC API and the libvirt virtualization API. OpenNebula manages the data center of a private cloud and the infrastructure of clusters running Xen, KVM or VMware, and also supports hybrid clouds that connect local and public infrastructure, which is very useful for building highly scalable cloud computing environments. OpenNebula supports heterogeneous execution environments with multiple, even conflicting, software requirements on the same shared infrastructure, with full control over the lifecycle of virtualized services.

ANALYSIS & FINDINGS

The analysis is based on a comparison of EUCALYPTUS, OpenStack, Nimbus and OpenNebula. A comparative study of the architecture of these open platforms was carried out.

TABLE 1: FEATURES OF PLATFORM

| Feature | EUCALYPTUS | Nimbus | OpenStack | OpenNebula |
|---|---|---|---|---|
| Focus | Infrastructure | Infrastructure | Infrastructure | Infrastructure |
| Cloud Implementation | Private & Hybrid | Public | Public & Hybrid | Private, Hybrid & Public |
| Form of cloud | IaaS | IaaS | IaaS | IaaS |
| User access interface | Web Service, Command-line | EC2 WSDL, WSRF | Web interface | libvirt, EC2, OCCI API |
| Scalability | Scalable | Scalable | Scalable | Dynamic, scalable |
| Service Type | Compute, Storage | Compute, Storage | Compute (Nova), Storage (Swift) | Compute, Storage |
| Compatibility | Supports EC2, S3 | Supports EC2 | Supports multiple platforms | Open, multiplatform |
| Web APIs | Yes | Yes | Yes | Yes |
| Deployment | Dynamic | Dynamic | Dynamic | Dynamic |
| Virtualization Hypervisor Support | Xen (versions 3.*), KVM | Xen | Xen and KVM | VMware, Xen and KVM |
| OS support | Linux | Linux | Linux, Ubuntu | Linux |
| Programming Framework | Linux-based, Java | Java, Python | Python, using the Tornado and Twisted frameworks | Java |

TABLE 2: COMPARISON OF OPEN CLOUD PLATFORM CHARACTERISTICS

| Characteristic | Eucalyptus | Nimbus | OpenStack | OpenNebula |
|---|---|---|---|---|
| Disk Image Options | Options set by admin | Depends on configuration | Glance has RESTful API | In private cloud, most libvirt options left open |
| Disk Image Storage | Walrus, which imitates Amazon's S3 | Cumulus (recent update from GridFTP) | Nova | A shared file system, by default NFS, or SCP |
| Hypervisors | Xen, KVM (VMware in non-open source) | Xen, KVM | Open Virtualization Format (OVF) | Xen, KVM, VMware |
| Unique Features | User management web interface | Nimbus Context Broker | Unified Authentication System | VM migration supported |
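As an illustration of how the comparison can support platform selection, the following Python sketch encodes two rows of Table 1 (cloud implementation and hypervisor support) as data and filters the platforms against a user's requirements. The class name, field names and the shortlist function are hypothetical and introduced here only for illustration; the cell values are those reported in Table 1 and would need updating as the platforms evolve.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    """One column of Table 1, reduced to two of its rows."""
    name: str
    cloud_types: frozenset   # "Cloud Implementation" row
    hypervisors: frozenset   # "Virtualization Hypervisor Support" row


# Values transcribed from Table 1 of this study.
PLATFORMS = [
    Platform("EUCALYPTUS", frozenset({"private", "hybrid"}),
             frozenset({"Xen", "KVM"})),
    Platform("Nimbus", frozenset({"public"}),
             frozenset({"Xen"})),
    Platform("OpenStack", frozenset({"public", "hybrid"}),
             frozenset({"Xen", "KVM"})),
    Platform("OpenNebula", frozenset({"private", "hybrid", "public"}),
             frozenset({"VMware", "Xen", "KVM"})),
]


def shortlist(cloud_type, hypervisor):
    """Return the names of platforms matching both requirements, in table order."""
    return [
        p.name
        for p in PLATFORMS
        if cloud_type in p.cloud_types and hypervisor in p.hypervisors
    ]


if __name__ == "__main__":
    # A user who needs a private cloud on an existing KVM installation.
    print(shortlist("private", "KVM"))
```

A filter of this kind narrows the candidates only on the tabulated features; qualitative factors such as security model and degree of customization still have to be weighed separately.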
TABLE 3: COMPARISON OF OPEN CLOUD PLATFORM CHARACTERISTICS

| Characteristic | Eucalyptus | Nimbus | OpenStack | OpenNebula |
|---|---|---|---|---|
| Concept | Mimic Amazon EC2 | Cloud resources tailored to scientific researchers | Virtualization portability | Private, highly customizable cloud |
| Customizable | Some for admin, less for user | Many parts except for image storage and Globus credentials | Basically everything | Basically everything |
| Platform Security | Tight. Root required for many things. | Fairly tight, unless deploying a fully private cloud. | Loose | Looser, but can be made tighter if needed. |
| User Security | Users are given custom credentials via a web interface | User's x509 credential is registered with cloud | Role Based Access Control | User logs into head (unless optional front-end used) |
| DHCP | On cluster controller | On individual compute node | On network node | Variable |
| An Ideal Setting | Large group of machines for a bunch of semi-trusted users | Deploy for less to semi-trusted users familiar with x509 | Large scale deployment | Smaller group of machines for highly trusted users |
| Network Issues | dhcpd on cluster controller | dhcpd on every node and Nimbus assigns MAC | nova.conf, or VLAN-based networking with DHCP if no network manager is defined in nova.conf | Admin must set manually but has many options |

Compared with the other platforms, Nimbus pays the most attention to capacity allocation and capacity overflow. Nimbus sits between Eucalyptus and OpenNebula on the customization spectrum. Nimbus provides a large number of options for users and administrators in deploying the cloud. Its security level is slightly higher than that of OpenNebula, due to the required integration of Globus certificate credentials. Nimbus is better suited to the scientific community, which may be less interested in the technical internals of the system but has broad customization requirements. A community that uses the Globus Toolkit would also be more conducive to sharing excess cloud time.

OpenNebula is more open than Nimbus and exposes large amounts of the underlying software in the default private cloud configuration. OpenNebula permits maximum customizability and provides a greater level of centralization, especially to end users (for a private cloud). In terms of reliability, OpenNebula is the most reliable open platform, since its cloud implementation includes rollback and fault tolerance mechanisms. It offers a high level of customization to administrators and users; hence this system works best for users who are technically proficient and can therefore take advantage of the additional features. Alternatively, the administrator can use an optional front-end such as EC2 to shield the users. OpenNebula is geared toward persons interested in cloud or VM technology at their own end. OpenNebula is also ideal for an organization that wants a few cloud machines quickly.
The front-end provided by Eucalyptus, euca2ools, is very similar to and compatible with Amazon's EC2 front-end programs, allowing easy integration with the commercial cloud, and it shields users from as many of the complexities of the underlying systems as possible. Eucalyptus implements a distributed storage system called Walrus, which is designed to imitate Amazon's S3 distributed storage. Eucalyptus is designed for a corporate enterprise computing setting; hence there is a clear separation between user space and admin space. Root access is required for everything done by the administrator on the physical machines. Users are only allowed to access the system via a web interface or some type of front-end tool. OpenNebula and Eucalyptus, in their default configurations, do not do any real form of scheduling, in the sense of negotiating priority for processors. Eucalyptus does not give a
cap for space in the Walrus distributed storage. Nimbus allows a user to be given a cap on the number and size of VMs they are allowed to create.

CONCLUSION

Cloud computing has the potential to transform the IT industry, making software even more attractive as a service and shaping the way IT hardware is designed and purchased. Open cloud platforms offer real alternatives to end users for improved flexibility and on-demand services. The open cloud platforms allow a great amount of customization. This paper explained the characteristics and applications of the most commonly used open cloud platforms and provided a comparison of them. The analysis and summary should help users understand these characteristics and choose services that better fit their requirements. Users should be able to make a more informed decision on which open cloud platform to use according to cloud types, interfaces, compatibility, implementation, deployment requirements, and development support. Since cloud computing is an evolving technology, many features are still being added. The comparison is based on the current features and technology available in these open source platforms; however, more features need to be incorporated to improve these frameworks.