Study of virtual data centers for cost savings and management

María Virtudes López López
School of Industrial Engineering and Information Technology
Master's Degree in Cybernetics Research
León, Spain
Email: mlopel01@estudiantes.unileon.es

Abstract: Data centers are leaving behind the model of a physical, isolated entity that provides physical services to third parties. Nowadays they are moving into the era of virtualization and geographically distributed virtual infrastructures. Given the current financial cutbacks, public administrations are forced to find new solutions in the field of information and communication technologies that reduce their costs. Owing to the nature of the emerging trends in virtual data centers, the infrastructure of the Foundation Center for Supercomputing of Castilla and Leon (FCSCL) could cover certain needs of the public administration; in particular, it could carry out the deployment and installation of the electronic administration, framed in four spaces that correspond to four virtual data centers.

Index Terms: data center, virtualization, networking, cost reduction, cloud computing

I. Introduction

A. Data centers

Data centers continue to grow in size and complexity. To understand the current trends in their design, we should first know the considerations that have been taken into account until now. Data centers are the core of the physical support for the services offered over the Internet, including web hosting, electronic commerce, social networking and the more general concepts of the new cloud computing model:

SaaS, software as a service: a model for distributing software to users over the Internet; users pay per use and do not have to worry about maintenance operations.

PaaS, platform as a service: a service that allows users to host and develop their own applications on a platform that provides development tools, so that the user can build a complete solution.
The user controls only their own applications, not the platform or the underlying infrastructure.

IaaS, infrastructure as a service: the user has access to technological infrastructure (processing capacity, storage and network) on which applications and platforms can be hosted.

In short, virtualization is the technique that enables us to optimize resources and to redistribute them according to the needs of the services that require them.

Current needs in terms of Internet services are becoming more complex and critical. We are moving the services associated with our line of business to the cloud, without worrying about the physical infrastructure, and leaving its management to entities that have the technology to provide the same service at lower cost. It would be a mistake to try to meet all this demand only by building larger and more powerful data centers, without taking into consideration other aspects that help to address these new challenges.

To better understand the trend of today's data centers, we present a snapshot of how these entities are organized and which basic components we have to be aware of.

1) Physical organization: A data center is organized into rows of racks; each rack holds modular servers, switches, storage or special-purpose appliances. A standard rack is 78 in. high, 23-25 in. wide and 26-30 in. deep. Modules of different purposes are inserted into each rack; the smallest module unit is the "U", which is approximately 45 mm (1.75 in.). Thus, an enclosure may hold a total of 42 1U modules, which could be, for example, 42 servers. For higher densities, chassis can be placed in the racks. Every chassis has its own power supplies, fans and backplane interconnect, and provides slots into which servers are inserted.

2) Storage: Storage in the data center can be supplied in various ways.
In most cases, for efficiency, storage is housed in networked storage systems (SANs), which connect servers to storage devices such as disk arrays through high-speed networks.

3) Networking: The network types found in a data center are associated with the different purposes of interconnection between the devices or entities that must interact to carry out their function. We can therefore group the networks by type of communication:

Server to server: low-latency communications. InfiniBand, 10 Gigabit Ethernet.

Client to server: external access to the data
center. Gigabit Ethernet, 10 Gigabit Ethernet.

Storage: low-latency communications over technologies such as Fibre Channel, iSCSI and Ethernet.

Management: access for maintenance and monitoring of computers or network devices. Ethernet.

4) Electrical systems: Data centers can reach peak power consumption of a few megawatts or more. For such large loads, electrical energy must be supplied through high-voltage lines. If the electrical power supply is not redundant, a standby generator is necessary. To ensure an uninterrupted power supply, uninterruptible power supply (UPS) units are used; they provide power for a finite period of time that depends on the energy stored in their batteries.

5) Cooling systems: Cooling infrastructure can be the most expensive and elaborate in a data center. The units called water chillers can occupy considerable space. This technology cools water to about 10 °C, and the water is channeled through a closed circuit of pipes to the server room.

B. Virtualized and geographically distributed data centers

Cloud computing adds flexibility and dynamism to the provision of resources, because resources are supplied for the periods of time in which they are used. The user is unaware of the underlying lower layers of the cloud platform. On the other hand, potential customers of virtual data centers generally need to use the acquired resources for long periods of time and also require control over the infrastructure provided.

A possible conceptual model, presented in Figure 1, shows four layers or levels that describe the data centers of the future.

Physical infrastructure layer (PIL): the bottom layer, which manages the physical infrastructure. This layer also provides scalability and the isolation of groups of servers for subsequent allocation to different customers. It therefore adds manageability and access under established security conditions.

Virtual infrastructure layer (VIL): this layer offers virtualization of servers, networks and storage, and provides what we understand as a virtual cluster. This layer and its services would be under the control of the infrastructure provider, who presents the virtual servers to customers in an abstract way. In addition, it should provide sufficient visibility to the administration of the higher levels, where a variety of services and applications are running.

Virtual infrastructure coordination layer (VICL): its purpose is to join virtual servers across multiple physical servers, creating a geographically distributed virtual data center. This layer must coordinate and manage the virtual channels between the different data centers. It is responsible for deploying applications at the geographical level and for performing efficient migration when required.

Service provider layer (SPL): responsible for the administration and execution of applications in the virtualized distributed data center. Configuration, performance, availability and latency aspects must be matched to the lower layers for effective control of the applications. In this way, the service implementation belongs to the customer.

Figure 1. Logical organization of future data centers

II. Implementation

The purpose of this study is to show how to virtualize four data centers so that they can host public services such as electronic administration. The study can be applied to other scenarios and taken as an example for other functionality. The proposal is to translate the physical environment of each of the four entities into a fully virtual environment, centralized in a single physical data center.
It also adds the possibility of interconnecting each virtual center with its original physical environment. The architecture is illustrated in Figure 2.

Figure 2. Virtualization of the four data centers (CPDs)

A. Physical environment for all virtualization services

A data center that offers virtualization services has to be provisioned with a cluster of physical servers, a storage system, interconnection networks and virtualization tools. We present each of the components used for the complete virtualization of the four data centers (see Figure 3).

Figure 3. Physical interconnection scheme

1) Servers: The servers that form the cluster of physical hosts have the following specifications:

Model: HP ProLiant DL580
Processor: Intel Xeon MP X7350, four per node (16 cores per node)
Memory: 256 GB or 128 GB RAM, depending on the node
Network: 2 Gigabit Ethernet interfaces, 4 Fibre Channel interfaces at 4 Gb/s and 2 InfiniBand DDR interfaces

2) Storage: The storage system is from NetApp, a leading supplier of storage systems for large data processing centers. In particular, the integrated system belongs to the FAS3100 series and is composed of multiple SATA and Fibre Channel disks with a total capacity of 110 TB. The system has two controllers that manage the disk volumes configured for the different services.

3) Networking: The physical Ethernet interconnect is based on two HP ProCurve switches, each with a capacity of 12 modules of 24 ports at 10/100/1000 Mbps or a maximum of 48 10 Gigabit Ethernet ports, which we can combine according to our needs. For the Fibre Channel network, which carries communications with the storage system, we have two Brocade 300 switches, each with 24 ports at 4 Gbps. This network lets the devices that need storage resources reach them through the dual controller of the SAN itself, maintaining redundant paths to the data.

4) Virtualization software: The virtualization platform used is VMware vSphere. This software allows us to virtualize the hardware resources of x86-based computers, including CPU, memory, disk and network adapters.

B. Logical architecture for each virtual data center

Two environments have to be virtualized:

Preproduction environment: a test environment used before the production phase. The system has four environments of this type, corresponding to the four entities.

Production environment: the environment that provides the services to the different users. The system has one environment for every entity.

The items to virtualize are services, application servers, proxies, balancers, firewalls, VPN concentrators, switches and network adapters. The virtual environment therefore corresponds to infrastructure as a service for multiple public entities (see Figure 4).

Figure 4. Full virtualized scenario diagram

C. Virtualized environment design

The resource pool of the physical cluster is composed of three HP DL580 servers, which provide resources such as CPU and memory. These resources are shared by the virtual machines. This allows continuous resource balancing, so virtual machines can migrate automatically from one physical host to another based on the workload.
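As a rough illustration of the shared resource pool and workload-based migration described above, the following Python sketch totals the CPU and memory of a three-host cluster and picks a migration source and target. It is a minimal sketch under stated assumptions and does not reproduce how VMware vSphere balances load: the host names, the split between 256 GB and 128 GB nodes, the memory usage figures and the 80% threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str            # hypothetical host label
    cores: int           # 16 cores per node, as in the server specification
    memory_gb: int       # 256 or 128 GB depending on the node
    used_memory_gb: int  # hypothetical current load

    @property
    def memory_utilization(self) -> float:
        """Fraction of the host memory currently in use."""
        return self.used_memory_gb / self.memory_gb


def cluster_capacity(hosts):
    """Return (total cores, total memory in GB) of the resource pool."""
    return sum(h.cores for h in hosts), sum(h.memory_gb for h in hosts)


def pick_migration(hosts, threshold=0.8):
    """Suggest a (source, target) pair of host names for one migration.

    The source is the most loaded host and the target the least loaded one.
    Returns None when no host exceeds the (hypothetical) threshold.
    """
    source = max(hosts, key=lambda h: h.memory_utilization)
    target = min(hosts, key=lambda h: h.memory_utilization)
    if source.memory_utilization <= threshold or source is target:
        return None
    return source.name, target.name


# Assumed layout: two 256 GB nodes and one 128 GB node.
hosts = [
    Host("host-a", 16, 256, 230),
    Host("host-b", 16, 256, 100),
    Host("host-c", 16, 128, 64),
]

print(cluster_capacity(hosts))  # (48, 640)
print(pick_migration(hosts))    # ('host-a', 'host-b')
```

A real scheduler would also weigh CPU load, the size of each virtual machine and affinity rules; the sketch only shows the idea of treating the three hosts as one pool.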
1) Virtual storage: This logical data container stores the virtual machine files, the associated disks and the other files needed to operate the virtual machines. In our case, the data stores use a SAN with access to Fibre Channel or SATA disks, depending on the required access speed; Fibre Channel disks give better performance. The volumes exported from the SAN are about 500 GB each. Every entity has two data stores of this kind.

2) Virtual network: Connectivity between machines is provided by virtual switches (vSwitches), which are similar to physical Ethernet switches. A vSwitch detects which virtual machines are logically connected to it and uses this information to pass data from one machine to another. A vSwitch can be connected to physical switches through physical network adapters, which gives us connectivity to the real environment, as shown in Figure 5. The distributed switches provided by VMware vSphere, called vNetwork Distributed Switches, work as a single switch across the physical hosts that form the cluster. They allow virtual machines to keep consistent network settings when they migrate from one physical host to another.

Figure 5. Virtual network diagram

III. Results

In this section we discuss the cost savings obtained from the full virtualization of a data center, which are even greater if we unify virtualized data centers with a common purpose.

The first step was the deployment of one virtual data center, with its production and preproduction environments:

Virtual machines: 28
Storage: 2 data stores (SATA and Fibre Channel), 500 GB capacity each
Network: 2 vSwitches, 2 virtual routers/firewalls and 9 VLANs

Once the entire virtual infrastructure was up, the next step was the installation and configuration of the virtual machines. This consisted of installing the operating system with the necessary applications and configuring the local network. Then the routers/firewalls were configured to filter traffic across the different networks and to allow access from outside through VPNs.

The next phase was the full deployment of the three remaining virtual data centers, to complete the deployment for the four entities. The network elements, mainly the vSwitches and the VLANs of each virtual data center, were configured so that the virtual machines and routers could be cloned. Finally, each machine had to be reconfigured: although the machines are clones from the first complete virtualized environment, the configuration of the applications they contain differs depending on the entity.

Once the virtual infrastructure was finished, it was possible to study the savings and effectiveness in several areas that result from the joint design and implementation of the four virtual data centers.

Time reduction in the deployment of further data centers. The deployment of the last three data centers took less time in terms of creation and configuration. If another complete infrastructure is created for another entity in the future, the necessary machine templates and their network environment will already be fully defined. Cloning the machines would therefore be almost immediate, which gives us efficient scalability and fast deployments.

Resource reduction. A comparison of the disk and memory resources of the physical and virtual environment designs shows significant savings (Figure 6).

Figure 6. Physical versus virtual resource savings

Time reduction in administration and monitoring tasks. For the administration of the virtual platform we used the vCenter
software, which provides unified management of all the physical hosts that are part of the cluster and of the virtual machines. It gives us detailed information about the state of the clusters, the physical hosts, the virtual machines, the hosted operating systems, the storage and the virtual network.

Technical staff reduction. The reduction in maintenance, management and monitoring tasks allows a reduction in the qualified technical personnel needed to carry them out. Based on the work currently being carried out on the project, we estimate that one technician can take charge of an average of 100 virtual machines.

IV. Future work

The next possible line of research is the ability to distribute the global platform of the virtual data centers geographically. In situations of physical system overload or network failure, the global platform would be migrated automatically to another physical infrastructure. This type of operation should be transparent to users. In this way, the local high availability already provided would be extended to geographically distributed availability.