Mobile Cloud Computing T-110.5121
Open Source IaaS
Tommi Mäkelä, Otaniemi
Evolution
Mainframe
  Centralized computation and storage, thin clients
  Dedicated hardware, software, experienced staff
  High capital and operational expenses
  Strove for efficiency because of the costs
Client-server
  Distributed computation and storage, thick clients
  Less dedicated hardware, software licenses, staff
  Lower capital expenses, high operational expenses
  Strives for agility because of the lower costs
Evolution
In the long run, servers have become less expensive, so companies have been able to provide more IT services for their employees and customers
Servers are typically dedicated to certain services, such as email
Some servers are underutilized, because exact resource planning is difficult
  Unnecessary hardware investments
  Wasteful electricity consumption
Insufficient resources are also problematic, because companies will lose some customers permanently
Evolution
Server virtualization
  Virtualization makes it possible to run multiple operating systems simultaneously on the same physical server
  Utilization increases from 10-15% to 70-80%
  A common solution is what these slides call paravirtualization, where one OS hosts the others (strictly a hosted hypervisor; paravirtualization proper means that guests are modified to cooperate with the hypervisor)
  Access to allocated resources is offered by a hypervisor, which also enables manipulation of instances, such as replication
  In what these slides call full virtualization, a hypervisor runs directly on the hardware without a host OS (a bare-metal hypervisor)
  Widely used hypervisors are ESXi, KVM and Xen
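The utilization figures above allow a rough consolidation estimate. The following is a minimal Python sketch, assuming identical servers and load that adds up linearly (both are simplifications); the function name `hosts_needed` and the fleet of 100 servers are hypothetical and not from the slides.

```python
def hosts_needed(servers: int, avg_util_pct: int, target_util_pct: int) -> int:
    """Estimate how many virtualized hosts can carry the load of `servers`
    dedicated machines. Percentages are integers to keep the arithmetic exact."""
    total_load = servers * avg_util_pct           # load in "server-percent" units
    # Integer ceiling division: round up, since a partial host is still a host.
    return -(-total_load // target_util_pct)


# Pessimistic case: 15% average utilization, hosts filled to 70%.
print(hosts_needed(100, 15, 70))   # 22
# Optimistic case: 10% average utilization, hosts filled to 80%.
print(hosts_needed(100, 10, 80))   # 13
```

Under these assumptions, 100 dedicated servers would fit on roughly 13 to 22 virtualized hosts.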
Evolution
Consolidating several underutilized servers into a few devices reduces the need for investments and electricity
Virtualized servers, however, still need administration
Private cloud services
  Advanced middleware with virtualized servers enables automation, which reduces the need for administration
  Lower capital and operational expenses
  Employees and other end users are able to allocate resources by themselves without the help of IT staff
  Economies of scale: up to 1000 servers per admin
Evolution
Commercial players that initially offered virtualization solutions, such as VMware and Citrix, have started to provide private clouds, through evolution or purchases
Nowadays many open source middleware solutions are available, mainly offering IaaS-type services
Over time, open source solutions have become more reliable and versatile, with mature architectures
However, SLAs require rapid responses, which are not guaranteed with open source solutions
  There is still a market for commercial products with extensive support
Evolution
Open source solutions typically consist of many existing, well-tried software components, including databases, hypervisors, network and web services (MySQL, KVM, Apache)
Some of the projects are as old as the concept of cloud computing or the first commercial solutions
There are around three open source forerunners: Eucalyptus, OpenNebula, OpenStack
Recently CloudStack, formerly Cloud.com, has raised its profile and become a worthy alternative. Cloud.com was acquired by Citrix and then released as open source
Cloud middleware Eucalyptus
Eucalyptus
Initial release in May 2008, written in C and Java
Unlike the other examples, it includes some proprietary parts, while the rest of the code is licensed under GPL v3
The founders of the project created an open core, which consists of open source code and a commercial product with proprietary code
Bad approach
  Because it is not entirely open source, some players have switched to other IaaS solutions, for example NASA, which wanted to modify the open core for its needs
Some companies rely on it, such as NSN, EA, Fujitsu
Eucalyptus
It uses full virtualization (ESXi) or paravirtualization, in which case it requires a host OS for its software
Currently suitable hosts are Red Hat Enterprise Linux and CentOS
  So do not try to install it on Windows servers
The list of compatible Windows guest OSs is also very short: Windows 7
  However, all modern Linux distributions are supported
Even the basic configuration requires several physical servers to function (min. 4). It has upper limits as well
Its development life cycle has shortened in recent years: several releases per year. The latest version is 3.3
Eucalyptus
Since its release, Eucalyptus has focused on compatibility with Amazon Web Services
Currently they are technology partners. Apparently Amazon has promised not to modify its APIs unexpectedly
The infrastructure consists of five main components, which use these APIs for internal communication
  Cloud Controller: Front-end with client tools, user interface
  Walrus Controller: Simple storage for images, snapshots
  Cluster Controller: Manages nodes, instances, networks
  Node Controller: Interacts with cluster and hypervisor
  Storage Controller: Block storage, volume snapshots
Eucalyptus
[Figure: Simplified infrastructure; identical services]
Eucalyptus
Large scale installations
For the whole system
  1x Cloud Controller
  1x Walrus storage
For each cluster (max. 4)
  1x Cluster Controller
  1x Storage Controller
  32x Node Controllers
For each Node Controller
  16 virtual machines
Large scale Eucalyptus has up to 2048 instances
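The limit of 2048 instances follows directly from the figures above: 4 clusters x 32 Node Controllers x 16 virtual machines. A minimal Python sketch of that arithmetic; the class and field names are hypothetical, and only the numbers come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EucalyptusLayout:
    """Large scale layout as listed on the slide (names are illustrative)."""
    clusters: int = 4              # maximum number of clusters
    nodes_per_cluster: int = 32    # Node Controllers per cluster
    vms_per_node: int = 16         # virtual machines per Node Controller

    def max_instances(self) -> int:
        return self.clusters * self.nodes_per_cluster * self.vms_per_node

    def component_counts(self) -> dict:
        # One Cloud Controller and one Walrus for the whole system,
        # one Cluster Controller and one Storage Controller per cluster.
        return {
            "cloud_controller": 1,
            "walrus": 1,
            "cluster_controller": self.clusters,
            "storage_controller": self.clusters,
            "node_controller": self.clusters * self.nodes_per_cluster,
        }


layout = EucalyptusLayout()
print(layout.max_instances())                          # 2048
print(layout.component_counts()["node_controller"])    # 128
```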
Eucalyptus
Emphasizes a hybrid cloud approach for developers
  Test applications and assure their quality within a private cloud, then release them on Amazon
Benefits of the approach: separation of testing and production environments. No need to buy instances from the public cloud
  Lower development costs?
Furthermore, more control over the IaaS infrastructure
Disadvantages: could lead to severe vendor lock-in
Even though Eucalyptus has profiled itself as an extension of Amazon, other IaaS solutions also support Amazon
Cloud middleware OpenNebula
OpenNebula
Released in March 2008, written in several languages
Apache licensed
  Free software license, which allows distribution, modification and usage
Promoted as fully open source and fully featured
  Not a limited edition of an enterprise version or an open core release
Some major players have appreciated OpenNebula's features and license, including IT vendors IBM and Dell and telecommunications company Telefonica
It seems to be quite a popular solution among hosting and cloud providers
Tried out by science institutes, such as CSC
OpenNebula
Can use full virtualization or paravirtualization, where the hosts run Linux
OpenNebula points out that there is no vendor lock-in, thanks to platform-independent hypervisors
  Common hypervisors provide some compatibility
Wide support for different networking and storage solutions, which also improves portability
  Allows selecting the most suitable or familiar technologies
Guest OS support depends only on the hypervisor; the middleware imposes no specific limitations
Many development life cycles, current release 4.2
OpenNebula
Emphasizes interoperability and versatility by supporting several APIs, both open source solutions and de facto standards
  However, only two are available at the moment
Supposedly highly customizable for users' needs, thanks to different interfaces and extensions
The main services of an OpenNebula installation
  Management and scheduler (oned, mm_sched)
  Monitoring and accounting (oneacctd)
  Graphical user interface (Sunstone)
  Cloud API (ec2-query or occi)
OpenNebula
[Figure: Simplified infrastructure]
OpenNebula
[Figure: Services to choose from]
OpenNebula
Basic installation involves a front-end with most of the OpenNebula services, network-attached storage, and up to 500 hosts with instances
  The hosts only require hypervisors, SSH connections and privileges
Minimum installation is light (~10 MB) and the architecture seems simple and scalable
  Just add more hosts and storage
OpenNebula
Hybrid clouds can be created by using Amazon's public APIs. For internal communication, however, it relies on the XML-RPC protocol (XML-encoded calls + HTTP requests)
  More open approach than Eucalyptus's
Also provides an abstraction layer for XML-RPC, known as the OpenNebula Cloud API, which allows creating calls with Java or Ruby
  Simpler for developers
For users, it offers predefined catalogues from which to choose images, network setups, templates and monitoring
Promoted as feature-rich and lightweight. Just like Linux
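To illustrate what "XML-encoded calls + HTTP requests" means, the sketch below uses Python's standard `xmlrpc.client` module to build the XML body of a call without sending it. The method name `one.vm.info`, the session string and the VM id are assumptions made for illustration; they should be checked against the OpenNebula XML-RPC reference before use.

```python
import xmlrpc.client

# Placeholder credentials and VM id, purely illustrative.
session = "user:password"
vm_id = 42

# Encode the call as the XML document that would form the HTTP POST body.
# The method name is an assumption, not taken from the slides.
payload = xmlrpc.client.dumps((session, vm_id), methodname="one.vm.info")
print(payload)

# Decoding the same document recovers the parameters and the method name,
# which is what the receiving server does with the request body.
params, method = xmlrpc.client.loads(payload)
print(method, params)
```

In a real client, `xmlrpc.client.ServerProxy` would wrap both steps and send the request over HTTP; an abstraction layer such as the OpenNebula Cloud API hides this encoding from the developer.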
Cloud middleware OpenStack
OpenStack
Initial release in October 2010, written in Python. It was a combination of two separate projects
  Nova, which was built from scratch by NASA, who needed highly scalable compute resources and VM management
  Swift, which was created by Rackspace to store objects
Apache-licensed open source software, which is nowadays managed by a non-profit foundation
Long list of commercial users: Cisco, Intel, HP, AT&T, Deutsche Telekom, PayPal, DreamHost and Rackspace
  Moreover, CSC, NASA and IBM have switched to it
OpenStack
Full virtualization or paravirtualization (ESXi, KVM, Xen)
Initially most of the tasks were performed by Nova
In addition to new features in each release, it has become highly modular, which has reduced its complexity
  Easier to develop and expand
No apparent restrictions on host OS, though the guides seem to favor Ubuntu. It is actually the default cloud middleware for Ubuntu Server
  Boosts OpenStack's popularity?
Several development life cycles; the current release is the seventh, called Grizzly, with less than a year between cycles
OpenStack
Each of the major services has its own API, which can be used by applications. Amazon APIs are also supported
  The APIs provide abstraction thanks to the Python language
The main services of the latest OpenStack installation
  Nova: Virtual machine management and execution
  Quantum: Networking with attachable back-ends
  Keystone: Authentication service for cloud users
  Cinder: Block storage volumes through iSCSI
  Swift: Object storage to store individual files
  Glance: Image storage for virtual machines
  Horizon: User interface via web browser
OpenStack
[Figure: Basic idea]
OpenStack
Simplified infrastructure
Minimum 3 nodes
3 interfaces or 2 with bridges
[Figure labels: VM traffic; exposed to applications]
OpenStack
Fully standalone private and public cloud solution, whose modularity allows future development and expansion
For users, Horizon offers instance flavors, which are predefined by admins (number of CPUs, memory)
  Nowadays the GUI supports the majority of CLI functions
Definitely feature-rich, but also heavy (a few hundred MB)
  Easy to install, but requires plenty of configuration
Recent releases have enhanced networking
  In addition to server virtualization, network virtualization
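The idea of admin-defined flavors can be sketched as a small catalogue from which a user picks the smallest entry that satisfies a request. This is an illustrative Python model only, not the OpenStack API; the flavor names and sizes below are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Flavor:
    """A predefined instance size: number of CPUs and memory."""
    name: str
    vcpus: int
    ram_mb: int


# Hypothetical catalogue, as an admin might define it.
FLAVORS: List[Flavor] = [
    Flavor("small", 1, 2048),
    Flavor("medium", 2, 4096),
    Flavor("large", 4, 8192),
]


def pick_flavor(vcpus: int, ram_mb: int,
                flavors: List[Flavor] = FLAVORS) -> Optional[Flavor]:
    """Return the smallest flavor meeting both requirements, or None."""
    candidates = [f for f in flavors if f.vcpus >= vcpus and f.ram_mb >= ram_mb]
    return min(candidates, key=lambda f: (f.vcpus, f.ram_mb), default=None)


print(pick_flavor(2, 3000))   # the "medium" flavor
print(pick_flavor(8, 1024))   # None, no flavor is large enough
```

Users choose among fixed sizes instead of specifying arbitrary resources, which keeps allocation predictable for the admins.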
Notes
Google App Engine and Microsoft Azure are mainly PaaS, and none of the cloud middleware covered here supports them
Number of active users in 2012: OpenStack > Eucalyptus > OpenNebula
Community activity in 2012: OpenStack >> Eucalyptus > OpenNebula
Community population in 2012: Eucalyptus >>> OpenStack > OpenNebula
CloudStack had almost as many active users and as much community activity in 2012 as OpenStack
http://www.revistacloudcomputing.com/2013/01/comparativa-de-las-plataformas-cloud-abiertas-openstack-opennebula-eucalyptus-y-cloudstack/
Questions?