Large Construction of a Cloud IaaS with Dynamic Resource Allocation Method Using OpenStack

Chao-Tung Yang and Yu-Tso Liu
Department of Computer Science, Tunghai University, Taichung City, Taiwan ROC

Abstract

In this thesis, we focus on free open-source software, so that end users do not need to pay large software license fees. Virtualization brings numerous benefits to the cloud and is also the foundation on which a cloud environment is built. With virtualization, enterprises can maximize working efficiency without installing more equipment in the computer room. We implemented a virtualization environment and performed experiments on it. The main subject of this study is how to use the OpenStack open-source software to build a cloud infrastructure with high availability and a dynamic resource allocation mechanism. The result is a private cloud solution for businesses and organizations, and it belongs to Infrastructure as a Service (IaaS), one of the three cloud service models. On the user side, a web interface reduces the complexity of accessing cloud resources. We measured the live migration performance of virtual machines with different specifications and analyzed the data. Based on the live migration modes, we also wrote an algorithm that addresses a limitation of traditional migration, in which an administrator must manually determine whether a machine is overloaded; with the algorithm, the load level of each virtual machine is detected automatically, and virtual machines are migrated automatically to balance resources across servers.

Keywords-Live Migration; Cloud Computing; Dynamic Resource Allocation; KVM; OpenStack

I. MOTIVATION

Cloud computing has become increasingly popular in recent years; as a result, deploying cloud infrastructure has become a basic skill for IT staff. Virtualization is the basis of the cloud architecture.
Virtualization allows us to maximize the utilization of a server by running multiple virtual machines on one physical machine. Each virtual machine can be configured according to its user's needs, and different virtual machines can run different operating systems. There is now also a need for High Availability (HA). Because we do not want users to be disconnected from their virtual machines, we can deploy multiple servers; when a server needs repair or its load is too high, the administrator can manually live-migrate its VMs to other physical machines. We want to know the live migration performance of virtual machines with different specifications, and we also want to implement a dynamic resource allocation algorithm to achieve automatic live migration.

This thesis focuses on cloud computing infrastructure, especially virtualization and live migration. The goal is to build a private cloud system, so the management and deployment of VMs is a major task. The system monitors CPU utilization, disk usage, virtual machine space, and memory usage. It can also perform live migration: when a problem occurs, the administrator can move the user's virtual machine to another physical machine so that the user does not notice any abnormality. This is an important and useful function of virtualization. Furthermore, we include a dynamic resource allocation algorithm to achieve automatic live migration. We also run live migration experiments and system performance tests, and then analyze the characteristics of live migration from the experimental results.

Section II briefly summarizes trends in cloud computing and virtualization technology. Section III describes how the system is designed and developed. Section IV presents and examines the experimental results. Finally, Section V provides the conclusion and future work.

II. BACKGROUND REVIEW AND RELATED WORK

A. Cloud Computing

Google CEO Eric Schmidt first used the term cloud computing at SES San Jose 2006. The basic concept of cloud computing is that the client, given sufficient network bandwidth, connects to and uses the huge storage and computing resources offered by the cloud provider. In this way, users do not have to spend time managing computer hardware and software and do not need to buy their own computing infrastructure, which reduces total resource cost and energy consumption. According to the definition by the United States National Institute of Standards and Technology (NIST), cloud computing is a model that lets users access resources over the Internet, such as networks, servers, storage, applications, and services. Cloud computing consists of five basic characteristics, three service models, and four deployment models.
1) Basic Characteristics of Cloud Computing:

On-demand self-service: The user obtains computing resources (such as servers or storage space) on demand, and the entire process is automatic and unilateral, without interaction with the resource provider.

Broad network access: The service is provided over the Internet through standard mechanisms that support different client platforms (such as smartphones and laptops).

Resource pooling: The provider's computing resources, such as storage space, network bandwidth, computing power, and virtual machines, can be thought of as one large pool that is re-allocated on demand, at any time, to multiple users on different platforms. Because the resources are abstracted, users do not need to know their physical location, such as the country or data center in which they reside.

Rapid elasticity: Computing resources can be provisioned and released quickly and flexibly; to the customer, resources appear unlimited and can be purchased in any quantity at any time.

Measured service: Computing resources are automatically controlled and optimized according to the characteristics of the provided services. Both providers and users can transparently monitor resource usage.

2) Cloud Computing Service Model: The cloud computing service model is divided into SaaS, PaaS, and IaaS, as shown in Figure 1.

Figure 1. Cloud computing service model

Infrastructure as a Service: A collection of virtualized hardware resources and related management functions. Computing, storage, and network resources are abstracted through virtualization technology, with internal process automation and optimized resource management, to provide a dynamic and flexible infrastructure to external users.
Consumers of this layer use basic computing resources such as processing power, storage space, network components, or middleware. They also control the operating system, storage, deployed applications, and components such as firewalls and load balancers, but they do not control the underlying cloud architecture.

Platform as a Service: An environment for developing, operating, managing, and monitoring cloud applications. It can be regarded as optimized cloud middleware; a well-designed platform layer meets the cloud's scalability, availability, and security requirements. Consumers of this layer use the development tools supplied by the platform provider to build their own applications on top of the cloud architecture. They can control the environment in which the application runs (and have partial control over the host), but they do not control the operating system, hardware, or network infrastructure.

Software as a Service: A collection of software applications built on the resources of the infrastructure layer and the environment of the platform layer and delivered to users over the network. Users access the services from multiple networked devices simply by opening a browser or network interface, and no longer need to worry about software installation and upgrades; customers do not need to buy a software license and are instead charged based on actual usage. Application developers can easily deploy and upgrade software and do not need to manage or control the underlying cloud architecture, such as the network, servers, operating systems, and storage.

3) Cloud Computing Deployment Model: The cloud computing deployment model is divided into public cloud, private cloud, community cloud, and hybrid cloud.
Public Cloud: The cloud infrastructure is available to the general public or a large industry group and is owned by an organization selling cloud services; it is flexible and cost-effective. The word public does not necessarily mean free, although the service may be free or very inexpensive. In addition, public does not mean that user data is available for anyone to view; the cloud provider usually implements access control mechanisms for its users.

Private Cloud: The cloud infrastructure is designed for and operated by a single organization. It may be deployed locally (on premise) or remotely (off premise), and managed by the organization itself or by a third party. A private cloud offers the flexibility of a public cloud environment. Network access and users are restricted, both data and programs are managed inside the organization, and the service is less affected by network bandwidth limits, security concerns, and regulatory restrictions. Cloud providers and users have more control over the cloud infrastructure, which improves security and flexibility.

Community Cloud: The cloud infrastructure is shared by many members of a community with similar interests, who control and use the cloud data and applications. The members have common concerns, such as specific tasks, security requirements, and policy and compliance considerations. It may be managed by the organizations themselves or by a third party, and deployed on site or remotely.

Hybrid Cloud: The cloud infrastructure is composed of two or more clouds (private, community, or public) that remain unique entities but are bound together by standardized or proprietary technology that enables data and application portability. In this model, users typically outsource non-business-critical information and process it in a public cloud, while keeping control of confidential internal services and information.

B. Virtualization

Virtualization is a broad term that refers to running computing elements on a virtual basis rather than a real one, in order to simplify management and optimize resource use. Users can build a more suitable environment at the same cost, which saves money and maximizes the utilization of available capacity. In the IT field, this idea of re-planning limited, fixed resources according to different needs so as to achieve maximum utilization is called virtualization technology. Virtualization technology can expand the capacity of the hardware and simplify software reconfiguration. CPU virtualization can make a single CPU appear as multiple CPUs in parallel, allowing one platform to run multiple operating systems, with applications running independently of one another in separate spaces, thus significantly improving the efficiency of the computer. Virtualization technology is entirely different from multitasking and from Hyper-Threading technology.
Multitasking runs more than one program in parallel within a single operating system. With virtualization technology, multiple operating systems run at the same time, each operating system runs multiple programs of its own, and each operating system runs on a virtual CPU or virtual host. Hyper-Threading technology merely makes a single CPU appear as two CPUs to balance program performance; the two simulated CPUs cannot be separated and can only work together.

The main purpose of virtualization technology is to make full use of expensive mainframe resources by allowing one host to run multiple operating systems. Today, large server infrastructures cause the energy consumed by computing and cooling systems to keep growing, and virtualization has become a way to address the rising cost of purchasing and maintaining equipment. Virtualization is also a key technology for cloud computing. Cloud computing is a large pool of virtualized resources that users can easily access over the Internet; the resources are turned into services and delivered over the network, and users consume those services according to their needs.

Virtualization is divided into two main techniques: full virtualization and para-virtualization.

Full virtualization: In a full virtualization environment, the hypervisor simulates and provides the same environment as the physical hardware, so the operating system can run without modification. In the architecture of Figure 2, the hypervisor runs at the Ring 0 privilege level, the guest OS runs in Ring 1, and applications run in Ring 3. Ring 0 privileged instructions issued by the guest OS are executed after being translated by the hypervisor; VMware, for example, uses binary translation of machine code so that virtualized instructions can be executed in Ring 0.
However, this technique consumes more hardware resources and degrades virtual machine performance.

Figure 2. Full virtualization architecture

Para-virtualization: Under full virtualization, the guest OS cannot directly execute Ring 0 privileged instructions; they must go through the hypervisor, which degrades virtual machine performance. In Figure 3, the guest OS kernel is modified directly: the Ring 0 privileged instructions that would be executed in the full virtualization architecture of Figure 2 are replaced with hypercalls, which then access the hardware through the hypervisor. Consequently, the choice of operating systems on a para-virtualization platform is limited, because the guest operating system must be modified in order to run.
Figure 3. Para virtualization architecture

C. Open Source for Virtualization

1) KVM: Kernel-based Virtual Machine (KVM) is open-source virtualization software for Linux. KVM has been part of the mainline Linux kernel since version 2.6.20 (released February 2007). In 2008, the Linux vendor Red Hat acquired Qumranet, and KVM virtualization is built directly into RHEL 5.4. Because KVM is integrated with the Linux kernel, it is easier to maintain and to integrate with the system than other virtualization software. KVM can run multiple guest operating systems, each with its own private virtualized hardware, and it relies on CPUs with hardware virtualization support such as Intel VT and AMD-V. With processor support, memory is virtualized through KVM, and I/O is virtualized through QEMU. KVM contains two important components. One is the kernel module (kvm.ko), which manages virtual machines and the virtualized hardware. The other is the user-space process (qemu), a PC hardware emulator that can be made to run faster through kqemu. The KVM architecture is shown in Figure 4.

Figure 4. KVM architecture

2) Xen: Xen uses a software layer to control access to the physical hardware, so that multiple guest operating systems run on a single computer, isolated from one another. The hypervisor acts like a traffic cop, directing hardware access and coordinating requests from the guest operating systems. A Xen environment has two kinds of components. The first is the virtual machine monitor (VMM), also called the hypervisor. The hypervisor layer sits between the hardware and the virtual machines and is the first thing loaded on the hardware. Once the hypervisor is loaded, virtual machines can be deployed. The second is the virtual machines themselves, which Xen calls domains. Among these, Domain0 has high privileges and is responsible for some specialized work. The hypervisor does not contain any drivers to communicate with the hardware, nor any interface to communicate with the administrator; these are provided by Domain0. Through Domain0, administrators can use the Xen tools to create other virtual machines (DomainU), which are unprivileged domains. Domain0 runs the xend process, which manages all virtual machines and provides access to their consoles. When a virtual machine is created, the administrator communicates with Domain0 directly through a configuration program. The Xen architecture is shown in Figure 5.

Figure 5. Xen architecture
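Since KVM depends on CPU support such as Intel VT or AMD-V, it can be useful to check a candidate host before installing it. The following is a minimal sketch, assuming a Linux host whose `/proc/cpuinfo` lists CPU feature flags, where `vmx` indicates Intel VT-x and `svm` indicates AMD-V. The function name `hardware_virt_support` and the sample text are hypothetical and only illustrate the check.

```python
def hardware_virt_support(cpuinfo_text):
    """Return the hardware virtualization extension found in
    /proc/cpuinfo-style text, or None if no such flag is present."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            if "vmx" in flags:   # Intel VT-x
                return "Intel VT-x"
            if "svm" in flags:   # AMD-V
                return "AMD-V"
    return None


# Illustrative input only; on a real host the text would be read
# from /proc/cpuinfo.
sample = "processor\t: 0\nflags\t\t: fpu vme de pse vmx sse sse2\n"
print(hardware_virt_support(sample))
```

A host for which this returns None can still run QEMU in pure emulation mode, but not KVM-accelerated guests.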
D. Virtualization Management

1) OpenNebula: OpenNebula is an open-source project developing an industry-standard solution for building and managing virtualized enterprise data centers and enterprise private clouds. IaaS cloud computing is the next step in the evolution of the data center. Because no two data centers are the same, the project does not assume a one-size-fits-all cloud and does not try to provide a turnkey solution that imposes requirements on the data center infrastructure. OpenNebula's interoperability makes the cloud an evolution of the data center by leveraging existing IT infrastructure, protecting existing investments, and avoiding vendor lock-in. In contrast to other open-source management tools that only provide a special-purpose implementation of popular cloud interfaces on pre-defined environments, OpenNebula aims to provide an open, flexible, extensible, and comprehensive management layer that automates and orchestrates the operation of virtualized data centers by leveraging and integrating existing solutions for networking, storage, virtualization, monitoring, and user management.

Figure 6. The diagram of OpenNebula

2) OpenStack: OpenStack is an open-source cloud management system developed jointly by NASA and Rackspace Hosting and released under the Apache license. Many vendors have expressed support for OpenStack, such as AMD, Citrix, Intel, and Dell. In October 2010, Microsoft announced support for integrating OpenStack with Windows Server 2008 R2. Ubuntu has also added OpenStack support to its releases. Dell offers an OpenStack server solution for quickly setting up a private cloud. Cisco joined OpenStack in February 2011, focusing on network services within the project. Canonical, Extreme Networks, and Grid Dynamics have joined as well. The OpenStack alliance has more than 50 members, and the support of these vendors is a driving force behind OpenStack's growing popularity.

Figure 7. OpenStack version flowchart
The first version, Austin, was released in October 2010, followed by Bexar in February 2011, Cactus in April 2011, Diablo in September 2011, Essex in April 2012, Folsom in September 2012, and the latest version, Grizzly, in April 2013; this release pace shows how active OpenStack development is. The OpenStack core consists of three projects: Compute (codename Nova), Object Storage (codename Swift), and Image Service (codename Glance).

Nova: Nova is the computing controller of the OpenStack cloud environment. It manages the life cycle of virtual machine instances in the cloud, including creating, deleting, and rebooting them. Nova also manages all computing resources, networks, and authentication and authorization within the OpenStack environment. Nova does not provide any virtualization capability itself; instead, it defines a driver mechanism for interacting with the underlying virtualization layer and exposes its functions through a web API, so that it can work with a variety of hypervisors. The hypervisors supported by OpenStack include Xen/XenServer, KVM, Hyper-V, VMware/ESX, Linux Containers (LXC), QEMU, and UML.

Glance: Glance is the OpenStack delivery service for virtual machine disk images. Through the standard REST interface that Glance provides, clients in the cloud environment can query virtual machine disk images stored on a variety of back-end devices, upload and register new image files, or query publicly available image information. OpenStack supports a variety of image formats, such as RAW, AMI, VHD, VDI, qcow2, VMDK, and OVF.

Swift: Swift is an object storage system developed by Rackspace that can be used to build scalable storage for huge amounts of data. It is not a file system or a real-time data storage system, but a long-term storage system for items such as multimedia (photos, music, and video), backup files, or operating system image files. Unlike a traditional file system, an object storage system treats each data file as an object and accesses these objects through a Web Services REST API, in the same manner as data access in Amazon S3.

E. Live Migration

Live migration moves a running virtual machine from one physical host to another while keeping its services operating normally. The process has no significant impact on the end user, so the administrator can take a physical server offline for repair or upgrade without affecting normal usage. Unlike static migration, live migration keeps the virtual machine's services available during the process and involves only a very short downtime. In the early stages of migration, the services keep running in the virtual machine on the source host. Once the migration reaches a certain stage, the destination host has the resources necessary to run the virtual machine; after a very short switchover, control passes from the source host to the destination host, and the virtual machine continues to run on the destination host (Figure 8). Because the switchover is so short, users barely notice any interruption of service, and the migration process is transparent to them.
By adjusting resources through virtualization technology, the provided services can be brought closer to the actual needs of different users. Live migration of virtual machines is an important technology for this: it transfers a VM to another physical server without shutting it down, achieving high availability with non-stop services (Figure 9). One of the most significant advantages of live migration is that it facilitates proactive maintenance. If an imminent failure is suspected, the potential problem can be resolved before service is disrupted. Live migration can also be used for load balancing, in which work is shared among computers to optimize the usage of available CPU resources.

Figure 8. The successful process of Live Migration

Figure 9. The concept of Live Migration

F. Related Work

In recent years, cloud computing has been emerging as the next big revolution in both computer networks and Web provisioning. Because of raised expectations, several vendors, such as Amazon and IBM, started designing, developing, and deploying cloud solutions to optimize the usage of their own data centers, and some open-source solutions are also underway, such as Eucalyptus and OpenStack. Cloud architectures exploit virtualization techniques to provision multiple Virtual Machines (VMs) on the same physical host so as to use available resources efficiently, for instance, by consolidating VMs on the minimal number of physical servers to reduce runtime power consumption. VM consolidation has to carefully consider the aggregated resource consumption of co-located VMs in order to avoid performance reductions and Service Level Agreement (SLA) violations. While various works have already treated the VM consolidation problem from a theoretical perspective, we focus on it from a more practical viewpoint, with specific attention to the consolidation aspects related to power, CPU, and networking resource sharing. Moreover, a cloud management platform has been proposed to optimize VM consolidation along three main dimensions, namely power consumption, host resources, and networking. The reported experimental results point out that interference between co-located VMs has to be carefully considered to avoid placement solutions that, although feasible from a theoretical viewpoint, cannot ensure VM provisioning with SLA guarantees.

System virtualization is becoming pervasive, and it is enabling important new computing paradigms such as cloud computing. Live virtual machine (VM) migration is a unique capability of system virtualization that allows applications to be transparently moved across physical machines with a consistent state captured by their VMs. Although live VM migration is generally fast, it is a resource-intensive operation and can affect the application performance and resource usage of the migrating VM as well as of other concurrent VMs. However, existing studies on live migration performance often assume that there are sufficient resources on the source and destination hosts, which is often not the case for highly consolidated systems. As the scale of virtualized systems such as clouds continues to grow, live migration becomes increasingly important for managing performance and reliability in such systems. It is therefore key to understand the performance of live VM migration under different levels of resource availability. This need has been addressed by creating performance models for live migration that can be used to predict a VM's migration time given its application's behavior and the resources available to the migration. A series of experiments were conducted on Xen to profile the time for migrating a DomU VM running different resource-intensive applications while Dom0 is allocated different CPU shares for processing the migration.
Regression methods are then used to create the performance model from the profiling data. The results show that a VM's migration time is indeed substantially affected by Dom0's CPU allocation, and that the performance model captures this relationship accurately, with a coefficient of determination generally higher than 90%.

III. SYSTEM DESIGN AND IMPLEMENTATION

This section describes the implementation of the OpenStack cloud IaaS system. Section III-A gives a brief overview of the entire system, Section III-B describes the mechanism of virtual machine (VM) live migration, and Section III-C describes the formulas and flow chart of the dynamic resource allocation algorithm.

A. System Overview

Figure 10 shows the components of the three cloud service models and points out the emphasis of this thesis. The system provides a web-based interface for managing VMs, and it shows CPU utilization, host load, memory utilization, and VM information. Besides managing the life cycles of individual VMs, this study also designs the core to support service deployment. Such services typically include a set of interrelated components (for example, a Web server and a database back end) that require several VMs. Thus, a group of related VMs becomes a first-class entity in OpenStack. Besides managing the VMs as a unit, the core also handles the delivery of context information (such as the Web server's IP address, digital certificates, and software licenses) to the VMs.

Figure 10. The domain of IaaS

B. Live Migration Mechanism of VM

Live migration keeps a VM running while the VM and the services running on it are moved, as one migration unit, from the source physical machine to the destination machine. The services running on the VM remain able to respond to the user throughout. When the migration is complete, the VM and its services resume on the destination physical machine. The service interruption time is very short.
In order to ensure that a VM can keep running on the target physical machine after migration, adequate information must be sent, such as the state of the disk, memory, CPU, and I/O devices. Among these, the memory state is the most complex and the most essential for migration. Figure 11 shows the procedure of live migration step by step.

Figure 11. Live migration time line

C. Dynamic Resource Allocation Algorithm

Dynamic Resource Allocation is an efficient approach to increasing the availability of host machines. However, current open-source VM management software merely provides a web interface for users to manage VMs; Eucalyptus, for example, cannot perform load balancing. When the load of some VMs increases, it affects all VMs on the same host machine. Our Dynamic Resource Allocation (DRA) algorithm overcomes this obstacle and improves host machine performance. It works by continuously monitoring the resource usage of all VMs to determine which VMs have to migrate to another host machine. The goal is to make the CPU and memory load of all host machines equal.

The Dynamic Resource Allocation process is as follows. Assume there are $j$ host machines in the pool, so the ideal load rate of every host machine is $\alpha = 1/j$. Assume also that $i$ VMs run on these host machines without load balancing. The resource usage of each VM, $VM_{ji}^{rate}$, is defined as:

\[
VM_{ji}^{rate} = \frac{VM_{ji}^{CPUuse} \times VM_{ji}^{RAMallocate}}{\sum_{i=1}^{n} \left( VM_{ji}^{CPUuse} \times VM_{ji}^{RAMallocate} \right)} \tag{1}
\]

where $VM_{ji}^{rate}$ denotes the VM's share of all allocated physical CPU and memory resources. When $VM_{ji}^{CPUuse}$ increases, $VM_{ji}^{rate}$ increases as well. In the next step, the VM resource usage rates are summed for each host machine. The current resource usage of each host machine, $HOST_{j}^{rate}$, is defined as:

\[
HOST_{j}^{rate} = \sum_{i=1}^{n} VM_{ji}^{rate} \tag{2}
\]

$HOST_{j}^{rate}$ is then compared with the ideal rate $\alpha$. When $HOST_{j}^{rate}$ is larger than $\alpha$, the host's load is too high and a VM must be migrated to another host machine; this comparison also determines the migration source host. First, the host machine with $\max(HOST_{j}^{rate} - \alpha)$ is chosen as the migration source host, and the host machine with $\min(HOST_{j}^{rate} - \alpha)$ is chosen as the migration destination host. Finally, the VM to be migrated, $VM_{k}^{migrate}$, is defined as:

\[
VM_{k}^{migrate} = \min \left| VM_{ji}^{rate} - \max\left( HOST_{j}^{rate} - \alpha \right) \right| \tag{3}
\]

that is, the VM on the source host whose rate is nearest to the amount by which the source host exceeds $\alpha$.

When the difference between the resource weight occupied by a physical machine (PM) and the average resource weight occupied by the physical machines is greater than a default migration value, the following steps are executed:

while ($HOST_{j}^{rate} - \alpha$) != 0 do
    elect the PM with the maximum $HOST_{j}^{rate}$ as $PM_{max}$;
    elect the PM with the minimum $HOST_{j}^{rate}$ as $PM_{min}$;
    calculate ($HOST_{j}^{rate} - \alpha$);
    if there is a VM in $PM_{max}$ with min $|VM_{ji}^{rate} - \max(HOST_{j}^{rate} - \alpha)|$ then
        set the VM as $VM_{migration}$;
        migrate $VM_{migration}$ from $PM_{max}$ to $PM_{min}$;
    end if
end while

The DRA flow chart in Figure 12 further explains how the loads of the running physical machines are evenly allocated by the dynamic resource allocation process. First, Equations (1) and (2) are used to calculate the resource weight occupied by each VM (vmxx), the resource weight occupied by each physical machine, and the average resource weight over all physical machines. In Equations (1) and (2), the load rate $VM_{ji}^{CPUuse}$ and the memory allocation $VM_{ji}^{RAMallocate}$ of each VM on each physical machine are expressed as percentages to obtain the VM resource rate $VM_{ji}^{rate}$, the physical machine resource rate $HOST_{j}^{rate}$, and the average physical machine resource rate $\alpha$. Those familiar with the art will recognize that other resources of the physical machines, such as storage devices, can also be considered, or that the weights can be calculated with different formulas. After these weights are found, the algorithm decides whether to live-migrate VMs: when the difference between the resource weight occupied by any one physical machine and the average is greater than the default migration value set by the user, live migration is performed.
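One resource allocation cycle built on Equations (1) to (3) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes that the weights in Equation (1) are normalized over every VM in the pool (so that the host rates sum to 1 and can be compared with $\alpha = 1/j$), and the names `VM`, `vm_rates`, `host_rates`, and `plan_migration` are hypothetical. The sketch only selects the migration VM, source, and target; it does not call any OpenStack API.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    host: str
    cpu_use: float       # CPU load of the VM as a fraction (VM^CPUuse)
    ram_allocate: float  # allocated memory as a fraction (VM^RAMallocate)


def vm_rates(vms):
    """Equation (1): each VM's share of the pool-wide CPU x RAM weight."""
    total = sum(vm.cpu_use * vm.ram_allocate for vm in vms)
    if total == 0:
        return {vm.name: 0.0 for vm in vms}
    return {vm.name: vm.cpu_use * vm.ram_allocate / total for vm in vms}


def host_rates(vms, hosts, rates):
    """Equation (2): sum of the VM rates on each host."""
    loads = {h: 0.0 for h in hosts}
    for vm in vms:
        loads[vm.host] += rates[vm.name]
    return loads


def plan_migration(vms, hosts, threshold):
    """Return (vm_name, source_host, target_host), or None when no host
    exceeds the ideal rate alpha by more than the migration threshold."""
    alpha = 1.0 / len(hosts)
    rates = vm_rates(vms)
    loads = host_rates(vms, hosts, rates)
    source = max(hosts, key=lambda h: loads[h] - alpha)
    target = min(hosts, key=lambda h: loads[h] - alpha)
    excess = loads[source] - alpha
    candidates = [vm for vm in vms if vm.host == source]
    if excess <= threshold or source == target or not candidates:
        return None
    # Equation (3): the VM whose rate is nearest to the source's excess.
    chosen = min(candidates, key=lambda vm: abs(rates[vm.name] - excess))
    return chosen.name, source, target


# Illustrative pool: three hosts, with pm1 carrying two VMs.
hosts = ["pm1", "pm2", "pm3"]
vms = [
    VM("a", "pm1", 0.8, 0.5),
    VM("b", "pm1", 0.4, 0.5),
    VM("c", "pm2", 0.4, 0.5),
    VM("d", "pm3", 0.6, 0.5),
]
print(plan_migration(vms, hosts, threshold=0.05))
```

Repeating `plan_migration` after each completed migration, until it returns None, corresponds to entering further allocation cycles. The threshold plays the role of the default migration value and keeps the loop from migrating VMs when the imbalance is negligible.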
In the next step, the physical machine with the maximum occupied resource weight is elected as the migration source machine, and the physical machine with the minimum occupied resource weight is chosen as the migration target machine. The migration difference between the resource weight of the migration source machine and the average physical machine resource weight is then computed, and the VM whose occupied resource weight is nearest to the migration difference is chosen as the migration VM. Finally, the migration VM is moved to the migration target machine, which completes one resource allocation cycle; another cycle is entered if necessary.

IV. EXPERIMENTAL RESULTS

A. Experimental Environment

1) Hardware Environment: In this study, we built a cloud environment with a hardware architecture composed of three physical servers. A computer as the cloud controller, the