Setting up a private cloud for academic environment with open source software Cloud Computing Course ITU of Copenhagen February 27th, 2012
Who am I? Zoran Pantić Infrastructure Architect & Systems Specialist Corporate IT @ University of Copenhagen zopa@itu.dk & zoran@pantic.dk http://zoranpantic.wordpress.com http://dk.linkedin.com/in/zoranpantic
Agenda Non-technical part: A bit about the project Why OSS? Focusing on academic environments Technical part: UEC/Eucalyptus Reflections on hardware, software, network & redundancy Scaling out UEC Conclusion Questions? Video demo
Project: Implementing private cloud solution in academic environments Based on Open Source Software (OSS) Focus on the logistical and technical challenges, and strategies of setting up a private cloud for academic environment Goal - providing guidelines and tutorials for implementing private cloud solution in academic environments: Design of the server- and network infrastructure
Why OSS? In general: Lowering the costs (no licensing headaches!); Interchangeability & portability (general); Socio-organizational reasons. UEC/Eucalyptus: Amazon AWS-like on-premise private cloud; Using Amazon's API; Big community supporting it
Academic environments - why private cloud? Usually, the budget is low, and the project should start as soon as possible. Growing strongly: the need for processing large data volumes; the need to conserve power by optimizing server utilization
Academic environments - why private cloud? (continued) Private clouds: Higher ROI than traditional infrastructure; More customizable; Quick responses to changes in demand; Rapid deployment; Increased security; Focus on an organization's core business; The effort required for running a private cloud is trending downward
Academic environments - private cloud challenges Challenges: Sociological, Technological
Academic environments - private cloud sociological challenges Sociological challenges, mostly political and economic: Existing structures oppose implementation of private cloud, Weak transparency of who is in charge of systems and economy, Research cannot be cost-effective in market terms, Administrators de facto in charge, instead of scientific groups, Tendency of IT departments to implement things because they are interesting and fun, while there may be no need for those systems.
Academic environments - private cloud technological challenges Technological challenges: Private cloud maturity, Problems porting program code, IT departments should be big enough, with enough expertise, OSS: the community cannot fix all your problems
Suggestions for implementing cloud solutions in academic environments To determine the needs and their nature, consult the professors that are in charge of the project (and its funding), Once started, implementation should be top-down steered, A test case should be designed and implemented, Researchers should be allowed to thoroughly test the solution - free of charge, Make sure that implementation succeeds first time! In general - get a very clear picture of what services are to be offered, who will use them, what they will use them for, and how!
Focus on academic environments Difference in implementing for infantry and supply troops. Infantry - to support research, scientific computing and High Performance Computing (HPC). Supply - to support daily operational systems and tasks, i.e. joint administration: bookkeeping, administration, communications (telephony, e-mail, messaging). Infantry: stateless instances vs. Supply: stateful instances
Academic environments - Infantry Uses non-standard & advanced research instruments. Applicable in research, scientific computing and HPC, i.e.: Generally, if users need VMs that they administer themselves (root access), it is more appropriate to supply them with machines from a private cloud than to give them access to virtual hosts behind the firewall. Organizations like ITU: for numerous different projects. Organizations like DCSC: 1/3 of the jobs would be runnable on a private cloud; in HPC: only in the low end, for low-memory and low-core-count jobs
Academic environments - Infantry (continued) Summarized suggestions: Have social psychology in mind as an important factor; Consult the professor in charge of money for the project; Implement an open source solution (UEC based on Eucalyptus, OpenStack, Joyent SmartOS (with both HW-level and OS-level virtualization!), OpenNebula, ...)
Academic environments - Infantry - UEC WebGUI
Academic environments - Infantry - HybridFox
Academic environments - Supply Needs a stable and supported solution. Summarized suggestions: Have social psychology in mind as an important factor; Consult the system owner in charge of money for the project; Implement a proprietary solution from a reputable provider (Microsoft Hyper-V, VMware Virtual Infrastructure, ...); Sign a support agreement & agree on a good SLA
Academic environments - Supply - VMware vSphere
UEC/Eucalyptus components UEC/Eucalyptus is an on-premise private cloud platform, designed as a distributed system - a modular set of 5 simple elements: Cloud Controller (CLC) Walrus Storage Controller (WS3) Cluster Controller (CC) Storage Controller (SC) Node Controller (NC)
UEC/Eucalyptus levels Three levels: Cloud level Cloud Controller (CLC) Walrus Storage Controller (WS3) Cluster level Cluster Controller (CC) Storage Controller (SC) Computing level Node Controller (NC)
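The three levels above can be pictured as a small nested data structure. The following is only an illustrative sketch, not part of UEC/Eucalyptus: the class names `Cloud` and `Cluster`, the example names, and the assumption of exactly one SC alongside each CC are all hypothetical choices made for the illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cluster:
    """Cluster level: one CC and (assumed here) one SC, plus the NCs they manage."""
    name: str
    node_controllers: List[str] = field(default_factory=list)


@dataclass
class Cloud:
    """Cloud level: one CLC and one WS3 in front of one or more clusters."""
    name: str
    clusters: List[Cluster] = field(default_factory=list)

    def component_counts(self) -> Dict[str, int]:
        """Count the components implied by this topology."""
        return {
            "CLC": 1,
            "WS3": 1,
            "CC": len(self.clusters),
            "SC": len(self.clusters),
            "NC": sum(len(c.node_controllers) for c in self.clusters),
        }


# Hypothetical example: two clusters with three compute nodes in total.
cloud = Cloud("demo", [
    Cluster("cluster1", ["nc1", "nc2"]),
    Cluster("cluster2", ["nc3"]),
])
counts = cloud.component_counts()
```

Modeling the cloud as "clusters containing nodes" mirrors the slide: the cloud-level components exist once, the cluster-level components once per cluster, and the computing level grows with the number of nodes.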
Cloud Controller (CLC) Entry point to the Eucalyptus cloud: web interfaces for administering the infrastructure; web services interface (EC2/S3 compliant) for end users/client tools. Frontend for managing the entire UEC infrastructure. Gathers info on usage and availability of the resources in the cloud. Arbitrates the available resources, dispatching the load to the clusters. Only one per cloud (no redundancy)
Walrus Storage Controller (WS3) Equivalent to Amazon's S3. Bucket-based storage system with put/get storage model. WS3 stores the machine images and snapshots. Persistent simple storage service, storing and serving files
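To make "arbitrates the available resources, dispatching the load to the clusters" more concrete, here is a toy dispatch rule. It is not the scheduling policy Eucalyptus actually uses; the function name `pick_cluster`, the notion of "free slots", and the most-free-capacity rule are assumptions chosen only for illustration.

```python
from typing import Dict, Optional


def pick_cluster(free_slots: Dict[str, int], requested: int) -> Optional[str]:
    """Return the cluster with the most free slots that can fit the request.

    Returns None when no cluster has enough free capacity.
    """
    candidates = {name: free for name, free in free_slots.items() if free >= requested}
    if not candidates:
        return None
    # Iterate names in sorted order so ties resolve deterministically.
    return max(sorted(candidates), key=lambda name: candidates[name])
```

For example, with the hypothetical availability `{"cluster1": 2, "cluster2": 5}` a request for three slots can only go to `cluster2`.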
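The bucket-based put/get model can be sketched with an in-memory stand-in. This is not the Walrus or S3 API; the class `BucketStore`, its method names, and the bucket and key names are hypothetical and only show the shape of the model (named buckets holding objects addressed by key).

```python
from typing import Dict


class BucketStore:
    """In-memory stand-in for a bucket-based put/get object store."""

    def __init__(self) -> None:
        self._buckets: Dict[str, Dict[str, bytes]] = {}

    def create_bucket(self, bucket: str) -> None:
        # Creating an existing bucket is a no-op in this sketch.
        self._buckets.setdefault(bucket, {})

    def put(self, bucket: str, key: str, data: bytes) -> None:
        if bucket not in self._buckets:
            raise KeyError(f"no such bucket: {bucket}")
        self._buckets[bucket][key] = data

    def get(self, bucket: str, key: str) -> bytes:
        return self._buckets[bucket][key]


# Hypothetical usage: store a machine image under a key, then read it back.
store = BucketStore()
store.create_bucket("images")
store.put("images", "ubuntu.img", b"fake image bytes")
```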
Cluster Controller (CC) Entry point to a cluster Manages NCs and instances running on them Controls the virtual network available to the instances Collects information on NCs, reporting it to CLC One or several per cloud Only one per cluster (no redundancy)
Storage Controller (SC) Allows creation of block storage similar to Amazon's Elastic Block Store (EBS). Provides the persistent storage for instances on the cluster level, in the form of block-level storage volumes. Supports creating, attaching and detaching storage volumes, and creating snapshots
Node Controller (NC) Compute node ("workhorse"). Controls the instances. Supported hypervisors: KVM (preferred) and Xen in the open source version, and VMware (ESX/ESXi) in the Enterprise Edition. Communicates with the OS and the hypervisor running on the node, and with the Cluster Controller. Gathers data about physical resource availability and utilization on the node, and about the instances running on it, reporting it to the CC. One or several per cluster
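The volume operations listed above (create, attach, detach, snapshot) form a small lifecycle. The sketch below models it in memory. It is not the SC or EBS interface; the class `Volume`, its fields, and the rule that a volume is attached to at most one instance at a time are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Volume:
    """Toy block volume that tracks attachment state and snapshots."""
    size_gb: int
    attached_to: Optional[str] = None
    snapshots: List[str] = field(default_factory=list)

    def attach(self, instance_id: str) -> None:
        if self.attached_to is not None:
            raise RuntimeError("volume is already attached")
        self.attached_to = instance_id

    def detach(self) -> None:
        if self.attached_to is None:
            raise RuntimeError("volume is not attached")
        self.attached_to = None

    def snapshot(self, label: str) -> str:
        # A snapshot is recorded by label only; no data is copied here.
        self.snapshots.append(label)
        return label


# Hypothetical usage: the volume outlives its attachment to an instance.
vol = Volume(size_gb=10)
vol.attach("i-0001")
vol.snapshot("before-upgrade")
vol.detach()
```

The point of the model is persistence: the volume and its snapshots remain after the instance is detached, which is what distinguishes it from the storage inside a stateless instance.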
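The reporting chain (NC gathers data, reports to CC) amounts to aggregating per-node figures into a per-cluster summary. A minimal sketch, assuming each node reports only total and used core counts; the function name `summarize_nodes` and the field names are hypothetical, not the real Eucalyptus message format.

```python
from typing import Dict, List


def summarize_nodes(reports: List[Dict[str, int]]) -> Dict[str, int]:
    """Combine per-node reports into one cluster-level summary."""
    total = sum(r["cores_total"] for r in reports)
    used = sum(r["cores_used"] for r in reports)
    return {"nodes": len(reports), "cores_total": total, "cores_free": total - used}


# Hypothetical reports from two Node Controllers.
summary = summarize_nodes([
    {"cores_total": 8, "cores_used": 3},
    {"cores_total": 16, "cores_used": 16},
])
```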
Reflections on hardware Processor architecture: Definitely 64-bit for performance reasons; Multiprocessor, multicore, Hyper-Threading; Hardware virtualization extensions (Intel VT-x or AMD-V) enabled on Node Controllers are a must. Disk configuration: Local disks: RAID 10 (storage limits soon reached); Preferably SAN (iSCSI), open source, see Nexenta/napp-it
Reflections on software Ubuntu versions: Newest - new features, but less stability (more bugs); LTS (Long Term Support) - for more stability or larger deployments
Reflections on network 2 or 3 networks: WAN, Cloud public & Cloud private. Firewall: open source based pfSense - to make the whole environment independent of the network infrastructure / environment where it will be plugged in
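On a Linux host, one way to check for the virtualization extensions is to look for the `vmx` (Intel VT-x) or `svm` (AMD-V) CPU flags in `/proc/cpuinfo`. The helper below is a sketch under that assumption; the function name `virtualization_flags` is hypothetical, and a flag being present does not by itself show that the feature is enabled in the BIOS/firmware.

```python
from typing import Set


def virtualization_flags(cpuinfo_text: str) -> Set[str]:
    """Return the hardware virtualization flags found in /proc/cpuinfo text."""
    found: Set[str] = set()
    for line in cpuinfo_text.splitlines():
        # Flag lines look like "flags : fpu vme ... vmx ...".
        if line.startswith("flags") and ":" in line:
            flags = line.split(":", 1)[1].split()
            found.update(f for f in flags if f in ("vmx", "svm"))
    return found
```

On a candidate Node Controller this could be called as `virtualization_flags(open("/proc/cpuinfo").read())`; an empty result suggests the machine is not suitable for KVM without further investigation.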
Reflections on redundancy No redundancy available in UEC by design In case of software or hardware error on a component: no failover solution is available; Solution: adding a new server, and then restoring the data
Scaling out the environment [Diagram: one CLOUD containing CLUSTER 1, CLUSTER 2 and CLUSTER 3, with Node Controllers (NC) under the clusters]
Suggested scaling out possibilities 2 physical servers: Server 1: CLC/WS3/CC/SC; Server 2: NC. 3 physical servers: Server 1: CLC/WS3; Server 2: CC/SC; Server 3: NC
Suggested scaling out possibilities 4 physical servers: Server 1: CLC; Server 2: WS3; Server 3: CC/SC; Server 4: NC. 5 physical servers: Server 1: CLC/WS3; Server 2: CC1/SC1; Server 3: NC1; Server 4: CC2/SC2; Server 5: NC2
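The suggested layouts can be written down as data and checked for completeness, i.e. that every one of the five components is hosted somewhere. This is only a sketch: the names `LAYOUTS`, `REQUIRED` and `missing_components` are hypothetical, and the numeric suffixes of the five-server layout (CC1, SC2, ...) are dropped so that components can be compared by type.

```python
from typing import Dict, List, Set

# Number of physical servers -> list of servers, each a list of hosted roles.
LAYOUTS: Dict[int, List[List[str]]] = {
    2: [["CLC", "WS3", "CC", "SC"], ["NC"]],
    3: [["CLC", "WS3"], ["CC", "SC"], ["NC"]],
    4: [["CLC"], ["WS3"], ["CC", "SC"], ["NC"]],
    5: [["CLC", "WS3"], ["CC", "SC"], ["NC"], ["CC", "SC"], ["NC"]],
}

REQUIRED: Set[str] = {"CLC", "WS3", "CC", "SC", "NC"}


def missing_components(layout: List[List[str]]) -> Set[str]:
    """Return the required components that no server in the layout hosts."""
    hosted = {role for server in layout for role in server}
    return REQUIRED - hosted
```

A check like `missing_components(LAYOUTS[3])` returning an empty set confirms that a layout is complete; the five-server layout is the first one that adds a second cluster rather than just spreading the same components over more machines.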
Conclusion & recommendations for private clouds based on open source Although they are still at an early stage, hard to install, manage and maintain for a regular admin, and have a steep learning curve (admins & users), implementation is suggested, at an affordable, smaller scale. Implement on current/modern hardware. Keep the knowledge updated. Keep the software platform and hardware updated if possible. Monitor & analyze costs, available features and complexity, compared to budget, needs and available internal resources. Assess the implementation possibilities based on the analyses
Alternative - public clouds More mature; Well documented; Rich with features; Easy to use. Examples: Amazon's initiatives for academic use: Amazon Education program with grants for research applications; having a project, an academic organization applies for a recurring grant, gets the approval within two weeks' time, and starts using it immediately after. Locally in Denmark, CABO was willing to supply the project with resources.
Questions?
Demo Demonstration of UEC environment and WebGUI
Thank you! Thank you for your attention! Still having questions? zopa@itu.dk zoran@pantic.dk