Enabling Large-Scale Testing of IaaS Cloud Platforms on the Grid'5000 Testbed




Enabling Large-Scale Testing of IaaS Cloud Platforms on the Grid'5000 Testbed
Sébastien Badia, Alexandra Carpen-Amarie, Adrien Lèbre, Lucas Nussbaum
S. Badia, A. Carpen-Amarie, A. Lèbre, L. Nussbaum, Testing IaaS Clouds on Grid'5000, 1 / 24

Testing IaaS cloud stacks
- IaaS cloud stacks are complex software
- They need to be tested in realistic setups
- But testing is often limited to:
  - single-machine installations
  - static deployments
- This talk: enabling large-scale testing of IaaS cloud stacks on a shared, reconfigurable testbed

Outline
1. Quick overview of the Grid'5000 testbed
2. Support for virtualization and cloud on Grid'5000
3. Deploying IaaS clouds on Grid'5000

Grid'5000
- Testbed for research on distributed systems
  - High Performance Computing
  - Grids
  - Peer-to-peer systems
  - Cloud computing
- History:
  - 2003: project started (ACI GRID)
  - 2005: opened to users
- Funding: Inria, CNRS and many local entities (regions, universities)
- [Figure: software stack layers: application; programming environment; application runtime; Grid, Cloud or P2P middleware; operating system; networking]
- Only for research on distributed systems: no production usage
  - Litmus test: are you interested in the result of the computation?
  - Free nodes during daytime to prepare experiments
  - Large-scale experiments during nights and weekends
- Also a scientific object: how does one design such a testbed?

Leading to results in several fields
- Cloud: sky computing on FutureGrid and Grid'5000
  - Nimbus cloud deployed on 450+ nodes
  - Grid'5000 and FutureGrid connected using ViNe
- HPC: factorization of RSA-768
  - Feasibility study: prove that it can be done
  - Different hardware: understand the performance characteristics of the algorithms
- Grid: evaluation of the gLite grid middleware
  - Fully automated deployment and configuration on 1000 nodes (9 sites, 17 clusters)

Current status
- 11 sites (1 outside France)
- 26 clusters
- 1700 nodes
- 7400 cores
- Diverse technologies:
  - Intel (60%), AMD (40%)
  - CPUs from one to 12 cores
  - Myrinet, InfiniBand {S,D,Q}DR
  - Two GPU clusters
- 500+ users per year
- [Map of sites: Lille, Rennes, Bordeaux, Toulouse, Orsay, Luxembourg, Reims, Nancy, Lyon, Grenoble, Sophia]

Backbone network
- Dedicated 10 Gbps backbone provided by RENATER (the French NREN)
- Work in progress:
  - packet-level and flow-level monitoring
  - bandwidth reservation and limitation

Using Grid'5000: the user's point of view
- [Figure: access architecture. At each site shown (Nancy, Lyon, Grenoble, Orsay, Sophia), the user connects over SSH to a site access machine (e.g. access.nancy.grid5000.fr), then to the site frontend (e.g. frontend.lyon, aka lyon), where OARSUB and KADEPLOY are run, and reaches the site's cluster nodes (e.g. grelon-32.nancy, capricorne-12.lyon, genepi-21.grenoble, gdx-102.orsay, azur-42.sophia) with OARSUB, OARSH or SSH. The sites are linked by the Grid'5000 dedicated backbone.]
- Key tool: SSH
- Private network: connect through access machines
- Data storage: NFS (one server per Grid'5000 site)

Resource management with OAR
- Batch scheduler with specific features:
  - interactive jobs
  - advance reservations
  - powerful resource matching
- Resource hierarchy: cluster / switch / node / cpu / core
- Properties: memory size, disk type & size, hardware capabilities, network interfaces, ...
- Other kinds of resources: VLANs, IP ranges for virtualization
- Example: "I want 1 core on 2 nodes of the same cluster with 4096 MB of memory and InfiniBand 10G + 1 cpu on 2 nodes of the same switch with dual-core processors for a walltime of 4 hours..."
  oarsub -I -l "{memnode=4096 and ib10g='YES'}/cluster=1/nodes=2/core=1+{cpucore=2}/switch=1/nodes=2/cpu=1,walltime=4:0:0"
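As an illustration of how the resource request above is structured, here is a minimal Python sketch that assembles the same string from its parts. The helper names `oar_group` and `oar_request` are hypothetical and not part of OAR; the sketch only builds the command line and does not submit anything.

```python
import shlex


def oar_group(hierarchy, properties=None):
    """Render one resource group: optional {properties}, then /level=count parts."""
    prefix = "{" + properties + "}" if properties else ""
    levels = "".join(f"/{level}={count}" for level, count in hierarchy)
    return prefix + levels


def oar_request(groups, walltime):
    """Join resource groups with '+' and append the walltime."""
    return "+".join(groups) + f",walltime={walltime}"


# First group: 1 core on 2 nodes of the same cluster, filtered by properties.
first = oar_group([("cluster", 1), ("nodes", 2), ("core", 1)],
                  "memnode=4096 and ib10g='YES'")
# Second group: 1 cpu on 2 nodes of the same switch, dual-core processors.
second = oar_group([("switch", 1), ("nodes", 2), ("cpu", 1)], "cpucore=2")

request = oar_request([first, second], "4:0:0")
command = ["oarsub", "-I", "-l", request]

# shlex.join quotes the request so it can be pasted into a shell.
print(shlex.join(command))
```

Keeping each group as a list of (level, count) pairs makes the hierarchy explicit: levels are written from the outermost (cluster or switch) to the innermost (core or cpu).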

Resource management with OAR - visualization
- [Figure: resources status]
- [Figure: Gantt chart]

Description, selection, verification of resources
- Describing resources, to understand results
  - Detailed description on the Grid'5000 wiki
  - Machine-parsable format (JSON)
- Selecting resources
  - OAR database filled from JSON
  - oarsub -p "wattmeter='YES' and gpu='YES'"
- Verifying resources
  - G5K-checks: validates resources against their description (detects hardware failures and misconfigurations at each boot)
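The idea of validating a node against its machine-parsable description can be sketched in a few lines of Python. This is only an illustration of the principle, not the G5K-checks implementation: the JSON field names (`cpu_cores`, `memory_mb`, `gpu`) and the `detected` values are invented for the example.

```python
import json

# Hypothetical node description; the real Grid'5000 schema is not shown here.
DESCRIPTION_JSON = """
{"node": "node-1", "cpu_cores": 8, "memory_mb": 16384, "gpu": true}
"""


def check_node(described, detected):
    """Return (key, expected, actual) for every property that does not match."""
    mismatches = []
    for key, expected in described.items():
        actual = detected.get(key)
        if actual != expected:
            mismatches.append((key, expected, actual))
    return mismatches


described = json.loads(DESCRIPTION_JSON)
# Hypothetical values detected at boot: one memory module is missing.
detected = {"node": "node-1", "cpu_cores": 8, "memory_mb": 12288, "gpu": True}

problems = check_node(described, detected)
for key, expected, actual in problems:
    print(f"{key}: expected {expected}, found {actual}")
```

A node with a non-empty mismatch list would be flagged rather than handed to users, so that experiments run on hardware that matches its published description.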

Reconfiguring the testbed with Kadeploy
- Provides a Hardware-as-a-Service cloud infrastructure
  - Enables users to deploy their own software stack & get root access
  - Standard environments provided to users
  - Customizations automated using Kameleon
- Scalable, efficient, reliable and flexible:
  - Chain-based and BitTorrent environment broadcast
  - 255 nodes deployed in 3 minutes
- Command-line interface & REST API for scripting
- http://kadeploy3.gforge.inria.fr/

Customizing the experimental environment
- Reconfigure experimental conditions with Distem
  - Introduce heterogeneity in a homogeneous cluster
  - Emulate complex network topologies
- [Figure: CPU performance per core (cores 0 to 7) across virtual nodes VN 1 to VN 3 and virtual node 4]
- [Figure: emulated network topology linking nodes n1 to n5, each interface labelled with its bandwidth and latency (e.g. 5 Mbps, 10 ms)]
- http://distem.gforge.inria.fr/

Virtualization & cloud experiments: requirements
- Efficient provisioning of machines: Kadeploy
- IP addresses for virtual machines: two different solutions on Grid'5000
  - G5K-subnets
  - KaVLAN

Network reservation with G5K-subnets
- Grid'5000 enables different users to run experiments concurrently
  - Need for a mechanism to provide IP ranges for virtual machines
- G5K-subnets adds IP range reservation to OAR
  - oarsub -l slash_22=2+nodes=8 -I
  - IP ranges are routable inside Grid'5000
  - But no isolation: one can steal IP addresses
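To give a sense of what a `slash_22=2` reservation provides, the following Python sketch enumerates the host addresses of two /22 ranges with the standard `ipaddress` module. The two prefixes are placeholders chosen for the example; the actual ranges are whatever G5K-subnets assigns to the job.

```python
import ipaddress

# Hypothetical reserved ranges, standing in for the ones assigned to the job.
reserved = [
    ipaddress.ip_network("10.144.0.0/22"),
    ipaddress.ip_network("10.144.4.0/22"),
]


def vm_addresses(networks):
    """Yield every usable host address (network and broadcast excluded)."""
    for net in networks:
        yield from net.hosts()


addresses = list(vm_addresses(reserved))
print(len(addresses))                # 2 x 1022 usable addresses
print(addresses[0], addresses[-1])   # first and last assignable address
```

Each /22 holds 1024 addresses, of which 1022 are usable for hosts, so two ranges give 2044 addresses to hand out to virtual machines.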

Network isolation with KaVLAN
- Reconfigures switches for the duration of a user experiment to achieve complete level 2 isolation:
  - Avoid network pollution (broadcast, unsolicited connections)
  - Enable users to start their own DHCP servers
  - Experiment on Ethernet-based protocols
  - Interconnect nodes with another testbed without compromising the security of Grid'5000
- Relies on 802.1Q (VLANs)
- Compatible with many kinds of network equipment
  - Can use SNMP, SSH or telnet to connect to switches
  - Supports Cisco, HP, 3Com, Extreme Networks and Brocade
- Controlled with a command-line client or a REST API
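For readers unfamiliar with 802.1Q, the sketch below shows the 4-byte VLAN tag that switches use to keep traffic from different VLANs apart. It illustrates the standard's tag layout only and says nothing about how KaVLAN itself configures switches; the function names are made up for the example.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q-tagged frame


def build_vlan_tag(vlan_id, priority=0, dei=0):
    """Pack TPID plus TCI (3-bit priority, 1-bit DEI, 12-bit VLAN ID)."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be in 1..4094 (0 and 4095 are reserved)")
    if not 0 <= priority <= 7 or dei not in (0, 1):
        raise ValueError("priority must be in 0..7 and dei must be 0 or 1")
    tci = (priority << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)  # network byte order


def parse_vlan_tag(tag):
    """Unpack a 4-byte tag back into its fields."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return {"priority": tci >> 13, "dei": (tci >> 12) & 1, "vlan_id": tci & 0x0FFF}


tag = build_vlan_tag(vlan_id=42, priority=3)
print(tag.hex())
print(parse_vlan_tag(tag))
```

The 12-bit VLAN ID field is what bounds the number of VLANs a switch fabric can distinguish to 4094.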

KaVLAN - different VLAN types
- Default VLAN: routing between Grid'5000 sites
- Local, isolated VLAN: only accessible through an SSH gateway connected to both networks
- Routed VLAN: separate level 2 network, reachable through routing
- Global VLANs: all nodes connected at level 2, no routing
- [Figure: the four VLAN types shown across two sites, site A and site B, with the SSH gateway of the local VLAN]

Delivering IaaS clouds to users
- Kadeploy, G5K-subnets and KaVLAN are low-level mechanisms
- While it is possible to use them to deploy virtually any IaaS cloud stack, not everybody wants to do that
- Need for higher-level tools that ease deployment
- We will present two such tools

Deploying IaaS clouds with G5K-campaign
- G5K-campaign: framework for coordinating experiments
  - Relies on the Grid'5000 REST API
  - Extensible with engines
- Specific engines written for cloud installation
  - Use Chef cookbooks to describe the installation process
  - Rely on G5K-subnets for IP range allocation

Cloud engine
- [Figure: sequence diagram. Participants: Cloud engine, Grid'5000 API, OAR, Kadeploy, G5K-subnets, Cloud frontend, Cloud nodes. Steps shown: run; reserve; deploy; reserve subnets; parallel deploy; get subnets; send configuration; parallel install; parallel configure; installation results.]
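The overall shape of such an engine (reserve, deploy in parallel, then install the frontend and the nodes) can be sketched as follows. This is a toy model with stub functions, not G5K-campaign code: `reserve`, `deploy`, `install` and `run_campaign` are hypothetical names, and the real engine talks to the Grid'5000 REST API and runs Chef instead.

```python
from concurrent.futures import ThreadPoolExecutor


def reserve(node_count):
    """Stand-in for a reservation; returns hypothetical node names."""
    return [f"node-{i}" for i in range(1, node_count + 1)]


def deploy(node):
    """Stand-in for deploying an environment on one node."""
    return f"{node}: deployed"


def install(node, role):
    """Stand-in for installing and configuring one cloud role on a node."""
    return f"{node}: {role} installed"


def run_campaign(node_count):
    """Run the steps in order; per-node steps are executed in parallel."""
    log = []
    nodes = reserve(node_count)
    log.append(f"reserved {len(nodes)} nodes")
    frontend, workers = nodes[0], nodes[1:]
    with ThreadPoolExecutor(max_workers=8) as pool:
        # pool.map preserves input order, so the log stays deterministic.
        log.extend(pool.map(deploy, nodes))
        log.append(install(frontend, "frontend"))
        log.extend(pool.map(lambda n: install(n, "worker"), workers))
    return log


for line in run_campaign(3):
    print(line)
```

Running the per-node steps through a pool mirrors the "parallel deploy" and "parallel install" steps of the diagram, while the frontend is set up before the workers that depend on it.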

Results
- Generic cloud deployment engine supporting OpenNebula, CloudStack and Nimbus
- Can create a cloud with hundreds of nodes
- Example deployment: OpenNebula cloud
  - 80 nodes from 3 Grid'5000 sites
  - 350 virtual machines
  - used to run Hadoop
  - less than 20 minutes to deploy, including 6 minutes for the initial Kadeploy run

OpenStack on Grid'5000
- "Default" mode: FlatDHCP
  - The OpenStack-provided DHCP server cannot co-exist with the Grid'5000 DHCP server
  - Requires isolation: KaVLAN
  - Connection to the rest of Grid'5000 through KaVLAN gateways or dual-connected nodes
- Automated using Puppet recipes from PuppetLabs/StackForge
- Example deployment: 30 physical machines in 20 minutes
- Used as a staging area to port a bioinformatics workflow to AWS

Future work
- Enlarge the scale of deployments
  - Requires improvements to the orchestration of deployments
- Extend the testbed to support:
  - Network virtualization (OpenFlow)
  - Big Data experiments

Conclusions
- Grid'5000: a versatile, reconfigurable testbed
  - Reconfigure the software stack using Kadeploy
  - Reserve IP ranges with G5K-subnets
  - Network isolation with KaVLAN
- Supports OpenNebula, CloudStack, Nimbus, OpenStack
- You can get an account. Mail me: lucas.nussbaum@loria.fr