Virtualization (and cloud computing) at CERN
Ulrich Schwickerath
Special thanks: Sebastien Goasguen, Belmiro Moreira, Ewan Roche, Romain Wartel
See also related presentations:
- CloudViews2010 conference, Porto
- HEPiX spring and autumn meetings, 2009 and 2010
- Virtualization vision, Grid Deployment Board (GDB), 9/9/2009
- Batch virtualization at CERN, EGEE09 conference, Barcelona
- 1st and 2nd multi-threading and virtualization workshops, CERN, 2009
The organizers' wish list for this presentation
Questions to be addressed in this presentation:
- What is the motivation for virtualization, and why do we need it?
- What does the technical implementation of an IaaS architecture at CERN look like?
- How do we manage systems for large-scale virtualization?
- How do we handle tasks like image distribution?
- Which experiences and lessons have we gathered so far?
Some facts about CERN
- European Organization for Nuclear Research, the world's largest particle physics laboratory
- Located on the Swiss/French border
- Founded in 1954; funded and staffed by 20 member states, with many contributors in the USA
- Birthplace of the World Wide Web
- Made popular by the movie Angels & Demons
- Flagship accelerator: the LHC
http://www.cern.ch
Facts about the LHC
- Experiments: ATLAS, CMS, ALICE, LHCb, TOTEM, LHCf
- Circumference of the LHC: 26 659 m
- Magnets: 9300
- Temperature: -271.3 °C (1.9 K)
- Cooling: ~60 t of liquid He
- Max. beam energy: 7 TeV
- Current beam energy: 3.5 TeV
Facts about LCG computing
- Data: signal/noise ratio of 10⁻⁹; high rate × large number of channels × 4 experiments → 15 PetaBytes of new data each year
- Compute power: event complexity × number of events × thousands of users → ~100k CPUs (cores), ~20% of this at CERN
- Worldwide analysis & funding: computing funded locally in major regions and countries; efficient analysis everywhere → GRID technology
Facts about the CERN Computer Center
- Disk and tape: 1500 disk servers, 5 PB disk space, 16 PB tape storage
- Computing facilities: >20,000 CPU cores (batch only), up to ~10,000 concurrent jobs, job throughput ~200,000/day
http://it-dep.web.cern.ch/it-dep/communications/it_facts figures.htm
The organizers' wish list for this presentation
Questions to be addressed in this presentation:
- What is the motivation for virtualization, and why do we need it?
- What does the technical implementation of an IaaS architecture at CERN look like?
- How do we manage systems for large-scale virtualization?
- How do we handle tasks like image distribution?
- Which experiences and lessons have we gathered so far?
Virtualization: why? An example from CERN: CPU utilization of dedicated resources for special tasks, e.g. web servers (155 machines).
Virtualization: why? Another reason: the transistor count doubles every two years, giving CPUs with more and more cores.
Virtualization: why? Service consolidation:
- Improve resource usage by squeezing underutilized machines onto single hypervisors
- Ease management by supporting live migration
- Decouple the hardware life cycle from the applications running on the box
What does that mean?
Virtualization: why? The hardware life cycle:
- Hardware is replaced every 3 years: it is bought with a 3-year warranty, and new hardware is more power efficient
- The dilemma: new hardware often requires new operating systems/drivers, but old software often does not work on new operating systems...
- Virtualization can solve this problem!
Note: operating systems also have a limited lifetime!
Service consolidation at CERN is just now being put into production.
Virtualization for batch processing? The situation for the batch farm at CERN. Issue: worker node transition to new OS versions. [Plot: average SLC4 usage vs average SLC5 usage.] With the SLC4/SLC5 mixture, new resources didn't work with SLC4, while not all applications were ready for SLC5.
Virtualization for batch processing? The batch farm at CERN, some facts. Issue: increasing complexity complicates the scheduling of intrusive updates:
- ~80 different queues
- ~50 different node functions and sub-clusters
- ~40 different hardware types
Virtualization of lxbatch: motivation
Facts: the CERN batch farm lxbatch has ~3500 physical hosts, ~20,000 CPU cores, and >80 queues.
Challenges:
- How to resolve conflicts between new hardware and old OS?
- How to provide the right mix of environments matching needs?
- How to manage intrusive interventions in a fragmented setup?
Can virtualization help here? YES!
Virtualizing batch resources
Challenges:
- How to resolve conflicts between new hardware and old OS?
- How to provide the right mix of environments matching needs?
- How to manage intrusive interventions in a fragmented setup?
Idea: use virtual batch worker nodes!
You will need:
- a scalable and robust infrastructure which can support many thousands of virtual machines
- a way to manage these virtual machines
Virtualizing batch resources
Challenges:
- How to resolve conflicts between new hardware and old OS?
- How to provide the right mix of environments matching needs?
- How to manage intrusive interventions in a fragmented setup?
You will need:
- support for different virtual machine images of all required flavors
- an efficient way to transport these to your physical nodes
- a decision-making system which chooses the hardware, starts the virtual machine, and monitors it during its lifetime
Virtualizing batch resources
Challenges:
- How to provide the right mix of environments matching needs?
- How to resolve conflicts between new hardware and old OS?
- How to manage intrusive interventions in a fragmented setup?
Idea: if you limit the lifetime of your virtual batch worker nodes and always restart from the latest image version, you get your intrusive software upgrades for free! For a hardware intervention, just take the physical node out of production and wait until all VMs are gone. That's equivalent to draining physical batch nodes!
The organizers' wish list for this presentation
Questions to be addressed in this presentation:
- What is the motivation for virtualization, and why do we need it?
- What does the technical implementation of an IaaS architecture at CERN look like?
- How do we manage systems for large-scale virtualization?
- How do we handle tasks like image distribution?
- Which experiences and lessons have we gathered so far?
Virtualizing batch resources... cloud computing
- Layers: SaaS (e.g. Google Docs), PaaS (e.g. Google App Engine), IaaS (e.g. EC2)
- The IaaS layer offers elasticity, on-demand provisioning, and utility pricing
- IaaS is enabled by virtualization
Glossary: SaaS = Software as a Service; PaaS = Platform as a Service; IaaS = Infrastructure as a Service
See the presentation from Tony Cass!
Virtual batch: principles
Virtual batch worker nodes:
- Clones of real worker nodes, but not directly centrally managed
- Can coexist with physical worker nodes in lxbatch
- Instantiated proactively depending on demand
- Dynamically join the batch farm as worker nodes
- Limited lifetime: stop accepting jobs after 24 h, and then get destroyed when empty
- Only one user job per VM at a time
Note: the limited lifetime allows for a fully automated system which dynamically adapts to the current needs, and facilitates the deployment of intrusive updates.
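The lifetime policy above can be sketched as a small state machine (a toy model for illustration only; the class and method names are invented, and only the 24 h threshold and the one-job-per-VM rule come from the slide):

```python
DRAIN_AFTER = 24 * 3600  # stop accepting jobs after 24 h (policy from the slide)

class VirtualWorkerNode:
    """Toy model of a virtual batch worker node with a limited lifetime."""

    def __init__(self, now):
        self.started = now
        self.jobs = 0
        self.destroyed = False

    def accepts_jobs(self, now):
        # One user job at a time; after 24 h the node closes and only drains.
        return (not self.destroyed and self.jobs == 0
                and (now - self.started) < DRAIN_AFTER)

    def start_job(self, now):
        if not self.accepts_jobs(now):
            return False
        self.jobs += 1
        return True

    def finish_job(self, now):
        self.jobs -= 1
        # Destroyed once it is both closed to new jobs and empty.
        if self.jobs == 0 and not self.accepts_jobs(now):
            self.destroyed = True
```

The point of the design is visible in the model: no central scheduler ever has to decommission a node explicitly; the node retires itself once its deadline has passed and its last job has drained.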
Virtual batch: principles
Image creation:
- Derived from a centrally managed golden node, which ensures that images are always up to date
- Stored in an image repository
- Pushed to the hypervisors in case of an update
Images:
- Staged on hypervisors to speed up instance creation
- Master images; instances use LVM snapshots
- Start with a few different flavours only (e.g. SLC5, SLC4)
Virtual batch: principles
Image distribution:
- Currently the only shared file system available is AFS
- Scalability for O(1000) hypervisors is an issue
- Resolved by using peer-to-peer methods: SCP wave and rtorrent
- Close collaboration with the HEPiX Virtualization Working Group
SCP wave
Principles:
- Binary-tree-based distribution
- Each node which finishes the transfer becomes a new parent node
- Slow at the beginning, but with a logarithmic speed-up
Available at: http://www.opennebula.org/software:ecosystem:scp-wave
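The logarithmic speed-up is easy to see in a toy model (a sketch of the idea only, not the actual scp-wave code; the function name is invented): every host that has finished receiving the image starts serving one new host per round, so the number of holders doubles each round.

```python
def scp_wave_rounds(n_targets: int) -> int:
    """Rounds of parallel scp transfers needed to reach n_targets hosts,
    starting from a single seeder, if every holder serves one new host
    per round (idealized: all transfers take equally long)."""
    holders, rounds = 1, 0            # the seeder itself holds a copy
    while holders < n_targets + 1:
        holders *= 2                  # every current holder serves one new host
        rounds += 1
    return rounds
```

In this idealized model, 462 targets are covered in 9 rounds, and doubling the farm size costs only one extra round: that is the logarithmic behaviour the slide refers to.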
BitTorrent setup
- Several implementations were tested; rtorrent was finally selected
- Careful tuning is required: the number of connected peers is restricted to at most 5, and the memory usage for the transfers needs to be tuned
- We use a central tracker based on OpenTracker, complemented with DHT
- The tracker runs an rtorrent client and acts as the initial sender
If you are interested, get in touch with us.
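For illustration, a configuration fragment in the spirit of the tuning above. This is a sketch, not CERN's production configuration: the values are examples, and the option names follow the classic `.rtorrent.rc` syntax.

```ini
# Illustrative ~/.rtorrent.rc fragment (example values only)
max_peers = 5              # restrict the number of connected peers
max_uploads = 5
max_memory_usage = 512M    # bound the memory used for transfers
dht = auto                 # complement the central tracker with DHT
peer_exchange = yes
```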
Virtual batch: principles
Image distribution, other options:
- NFS: discarded because of scalability concerns; it would require a dedicated infrastructure to make it scale to O(1000) physical machines
- AFS: performance and scalability concerns, even with replication
- CERNVMFS: may be an option, but requires several layers of squid proxies which we currently don't have (yet?)
- Multicast: possibly the most efficient way, but currently not (yet?) supported by the CERN networking infrastructure
Virtual batch: principles
The virtual machine management system must perform the following tasks:
- Receive and process requests for virtual machines
- Provide a queue for pending requests
- Choose the next request to be processed
- Choose a matching physical machine with enough free resources
- Talk to the selected machine, start the virtual machine, and monitor it
- Terminate virtual machines when requested
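The task list above can be sketched as a toy request queue with first-fit placement (all class and method names here are invented for illustration; the real systems discussed next, ONE and ISF, are of course far more elaborate):

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Hypervisor:
    """A physical machine with a number of free VM slots."""
    name: str
    free_slots: int
    vms: list = field(default_factory=list)

class VMManager:
    """Toy VM management loop: queue requests, place them first-fit,
    and release resources on termination."""

    def __init__(self, hypervisors):
        self.hosts = hypervisors
        self.pending = deque()        # queue for pending VM requests

    def request(self, vm_name):
        self.pending.append(vm_name)

    def schedule(self):
        """Place as many pending requests as possible; return (vm, host) pairs."""
        started, still_pending = [], deque()
        while self.pending:
            vm = self.pending.popleft()
            host = next((h for h in self.hosts if h.free_slots > 0), None)
            if host is None:
                still_pending.append(vm)   # no capacity: stay queued
                continue
            host.free_slots -= 1
            host.vms.append(vm)
            started.append((vm, host.name))
        self.pending = still_pending
        return started

    def terminate(self, vm_name):
        """Destroy a running VM and free its slot."""
        for h in self.hosts:
            if vm_name in h.vms:
                h.vms.remove(vm_name)
                h.free_slots += 1
                return True
        return False
```

A request that finds no host with free resources simply stays in the queue until a termination frees a slot, which is exactly the queue-plus-basic-scheduler behaviour both ONE and ISF provide.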
Virtual batch: principles
VM placement and management system: we count on existing solutions, and tested both an open-source and a proprietary one:
- OpenNebula (ONE)
- Platform's Infrastructure Sharing Facility (ISF), for high-level VM management
Both implement a request queue and a basic scheduler to place VMs.
Virtual batch: architecture. Putting it all together.
Status of building blocks (test system):
- Initial deployment: OK / OK
- Central management: OK / (OK)
- Monitoring and alarming: OK / switched off for tests
- Batch application: OK / missing
- Hypervisor cluster: OK / missing
- VM kiosk and image distribution: OK
- VM management system: mostly implemented (ISF OK, ONE missing)
The organizers' wish list for this presentation
Questions to be addressed in this presentation:
- What is the motivation for virtualization, and why do we need it?
- What does the technical implementation of an IaaS architecture at CERN look like?
- How do we manage systems for large-scale virtualization?
- How do we handle tasks like image distribution?
- Which experiences and lessons have we gathered (already)?
Lessons learned
Most important: scalability and reliability at all levels!
- Hypervisor cluster: manage it centrally like a big farm of identical machines; never do manual interventions on the machines
- Image distribution system: must be designed to scale to several thousand clients, with a reasonable distribution time
- Virtual machine provisioning system: must be able to deal with several thousand hypervisors, manage several tens of thousands of machines, and allow for a redundant setup (e.g. fail-over)
- Batch system software: must be able to manage many thousands of clients
All of these need careful testing.
Scalability tests: can we actually do it?
Temporary access to about 500 new physical worker nodes:
- 2 × XEON L5520 @ 2.27 GHz CPUs (2 × 4 cores), 24 GB RAM, ~1 TB local disk space
- HS06 rating ~97
- SLC5, XEN (1 rack with KVM for testing)
- Up to 44 VMs per hypervisor (on private IPs)
Temporary batch master machine(s):
- 2 × XEON L5520 @ 2.27 GHz CPUs (8 cores), 48 GB RAM
- 3 × 2 TB disks, software RAID 5 setup
- 10 GbE
Scalability tests: the hypervisor cluster
All machines in the hypervisor cluster:
- Belong to the same cluster, called lxcloud
- Share the same software setup
- Are centrally managed with the Quattor toolkit
- Are monitored with LEMON
CERN has several years of operational experience with such a setup; we know that it will scale up to several thousand machines. New nodes automatically download images and sign up to ONE.
http://elfms.web.cern.ch/elfms
Scalability tests: image distribution system
Distribution of an 8 GB test file to 462 hypervisors:
- SCP wave: 38 min (throttling possible)
- rtorrent: 17 min
[Plot: number of target machines vs time in minutes.]
Scalability tests: provisioning systems
One-shot test with OpenNebula: inject virtual machine requests and let them live and die (reduced lifetime of 4 h); record the number of alive machines seen by LSF every 30 s. [Plot; time axis units: 1 h 5 min.]
Scalability tests: provisioning systems
Experiences:
- We started with a few hundred virtual machines
- Scaling up the systems required close collaboration with the developers
- Both systems are now capable of managing several thousand machines
Scalability tests: batch system software
Questions to address:
- Up to which number of worker nodes does LSF scale?
- What size is required for the LSF master nodes?
- What is the response time as a function of the number of worker nodes?
- At which point do we have to redesign the batch service?
Note: these questions are independent of virtualization! The new infrastructure allowed us to probe the software way beyond what can be done with physical resources. This is of high interest for planning the layout of computing resources.
Remark: LSF = Load Sharing Facility, provided by Platform Computing.
Scalability tests: batch system software. [Plot: 5000 worker nodes.]
Scalability tests: batch system software
[Plots vs time: number of nodes (up to ~16,000), running jobs and total jobs (up to ~500,000), and response time (up to ~60 s).]
Future plans
- Put a small number of virtual worker nodes into production in lxbatch
- Monitor performance, usage, and the overall processes
- Set up production and development machines for ONE and ISF, and run both side by side
Future plans and technical issues:
- VM CPU accounting and CPU factors
- Identify and solve possible storage issues (scratch, swap, AFS cache)
- Could have ~100 VMs in production lxbatch by ~autumn 2010, assuming no further surprises
Towards cloud computing: merging this effort with the work of the HEPiX working group has great potential for CERN and WLCG.
Long-term ideas for CERN (SLC = Scientific Linux CERN)
- Today: batch uses physical SLC4 and SLC5 worker nodes
- Near future: batch mixes physical SLC4/SLC5 worker nodes with SLC4/SLC5 VM worker nodes running on a hypervisor cluster
- (Far) future? An internal cloud serving batch, T0 development, and other/cloud applications
Conclusions
We have developed the mechanisms to offer an IaaS service on a large scale:
- We have a scalable hypervisor farm setup, supporting XEN and KVM
- We have an efficient image distribution system
- This IaaS setup is currently moving into production for virtual worker nodes
We are technically able to provide IaaS at large scale: for IT now, and maybe later also for users?
Thank you! Any questions?
CPU performance: HS06 benchmark results (preliminary)
Test case: 1 VM per hypervisor, raising the number of CPUs per VM (1 to 8), under XEN. Physical box: 97 HS06. Use case: VMs for multithreaded applications.
[Plot: HS06 rating vs number of CPUs per VM, compared with the theoretical expectation.]
CPU performance: HS06 benchmark results (preliminary)
Test case: raise the number of 1-core VMs on a single hypervisor (XEN).
[Plot: average HS06 rating per VM vs number of VMs per hypervisor (1 to 9), with the working point marked.]
Disk performance: iozone tests
Test case: raise the number of VMs running iozone on a single hypervisor, and compare with running N iozone processes in parallel on a single physical server.
- Write performance: OK
- Read performance: requires tuning, e.g. changing the read-ahead size in /sys/block/xvdX/queue/read_ahead_kb
Command line parameters: iozone -Mce -r 256k -s 8g -f /root/iozone.dat -i0 -i1 -i
Disk performance: iozone tests
Read performance, compared with a physical node (~100 MB/s read, ~82 MB/s write): roughly a 20% performance penalty, and the measured numbers are too optimistic... needs further investigation!
[Plot: read rate in kB/s vs number of VMs (1 to 9).]
Command line parameters: iozone -Mce -r 256k -s 8g -f /root/iozone.dat -i0 -i1 -i
Network performance: still to do... no big worries here, though.
Batch virtualization: architecture
- Job submission via CE/interactive nodes to the batch system management
- VM worker nodes with limited lifetime run on the hypervisors / hardware resources
- Golden nodes are centrally managed; the VM kiosk distributes their images
- The VM management system drives the hypervisors
Legend: grey = physical resources; colored = different VMs.
Cloud standards
- No standards yet: OGF OCCI, CCIF, the Cloud Manifesto, the Cloud Security Alliance... lots of bodies at work (according to a NASA speaker, ~20)
- deltacloud.org (driven by Red Hat): a unifying API with adapters
- vCloud (from VMware) and the EC2 API seem key for an IaaS interface
- OpenNebula speaks Deltacloud, and Deltacloud speaks OpenNebula