Themis Athanassiadou, HPC Project Manager, ClusterVision. Engineer, Innovate, Integrate




Themis Athanassiadou HPC Project Manager

About
- 12 years as Europe's dedicated specialist for High-Performance Computing
- End-to-end hardware/software/services solution provider
- HPC engineering and innovation is at the heart of what we do
- Active in Europe, Middle East & Africa, Asia-Pacific
- Amsterdam based: well connected to major European business locations
- Quality: ISO9001:2008 & ISO14001 certified
- More than 400 projects and 250 customers

Customers
- Industry
- Government
- Education

HPC in Industry
Systems share, Top 500:
- Industry 51%
- Research 24%
- Academic 18%
- Government 4%
- Vendor 3%

Today, to Out-Compute is to Out-Compete
HPC:
- Enables development of new products and services
- Reduces time to market
- Reduces R&D costs
- Increases quality
- Reduces personnel costs

HPC powers industry giants

and empowers a broad spectrum of other businesses

Adopting HPC can be challenging for many businesses
- Lack of infrastructure
- Cost of equipment
- Cost of operation
- Lack of expertise
- Lack of experience

CV Innovations that ease HPC adoption

OpenStack Cloud Compute
- Convergence of big data, cloud and HPC
- Merge with the general IT infrastructure
- Open industry standard, open source
- Securely host CPU cycles and storage

HPC mineral oil based cooling solution
- Saving ~20% power: no air-cooling or fans
- Less current leakage, higher HPC performance
- Skinless servers
- Re-use of racks and same oil for > 15 years

Remote System Administration
- Outsourced infrastructure management
- Power of scaling: lower your cost
- Especially suitable for OpenStack
- You can focus on workflow instead of hardware

Trinity HPC enabled cloud environment
- OpenStack base. Support for containers
- Enhanced for High Performance Compute
- Full HPC Ecosystem

Why cloud?
Freedom of choice and flexibility

Main characteristics:
- On demand self-service
- Broad network access
- Resource pooling
- Rapid elasticity
- Measured service

Service models:
- Software as a Service
- Platform as a Service
- Infrastructure as a Service

Deployment models:
- Private Cloud
- Public Cloud
- Hybrid Cloud
- Community Cloud

Ideally
[Diagram: Engineering (HPC nodes), IT department and Finance on top of shared Physical Hardware (Servers, Database, Storage, GPU Pool)]

To get there, you need some form of virtualization
[Diagram: Engineering, IT department and Finance each run in a VM; the VMs sit on a Virtual Machine Monitor, which sits on the Physical Hardware (Servers, Database, Storage, GPU Pool)]

Virtualization technologies

HYPERVISORS (e.g. VMware, Xen, KVM)
- Full virtualization
- Great workload isolation
- Slower

VS

CONTAINERS (e.g. LXC, Docker)
- Lightweight virtualization
- Good workload isolation
- Faster

Image credit: CISCO

Which is best for HPC?
HPC applications:
- usually tuned to specific hardware
- strive to maximize performance (compute, I/O)
- some require a very fast network (InfiniBand)
Clouds only guarantee a "minimal" level of performance
[Figure: CPU test, Linpack performance on 2 sockets (16 cores). Source: IBM research report RC25482 (AUS1407-001), July 21, 2014]

Is performance using container virtualization good enough?
According to the IBM study:
- Docker equals or exceeds KVM performance in every case tested (CPU, memory, I/O)
- For I/O-intensive workloads, both forms of virtualization should be used carefully.
- Network, which is very important in many applications, needs to be tested.
[Figure: CPU test, Linpack performance on 2 sockets (16 cores)]
Container based virtualization is a great starting point for HPC in the Cloud.

Building a suitable HPC Cloud
An HPC Cloud should strive to find the balance between flexibility, convenience and acceptable performance.

Choosing a suitable Cloud for HPC
- Depends on the need (Public? Private? Level of abstraction? Specialized hardware? Performance?)
- A number of companies, including Penguin, R-HPC, Amazon, Univa, SGI, Sabalcore, UberCloud and Gompute, offer specialized HPC clouds. Evaluate and choose.
- For full freedom of choice/customization/security, build your cloud from an open source project (OpenStack, OpenNebula, Eucalyptus) and add HPC functionality using containers.

OpenStack: Cloud building toolkit of choice
- Open source set of software tools for building and managing cloud computing platforms for public and private clouds.
- Currently managed by the OpenStack Foundation.
- More than 200 companies have joined, including Dell, Intel, Red Hat and Oracle
- Rapidly becoming industry standard
- It is primarily deployed as an infrastructure as a service (IaaS) solution.

Challenges addressed by OpenStack

Problem (Manager): Virtual server and HPC environments running independently.
Solution: With OpenStack, you merge these into one single efficient environment.

Problem (Manager): Resources wasted when Virtual Desktop Infrastructure (VDI) is idle at night.
Solution: With OpenStack you can easily switch from the Virtual Desktop Infrastructure (VDI) to HPC.

Problem (Admin): Inability to collaborate with external parties due to lack of a security infrastructure for hosting CPUs/disks.
Solution: With OpenStack, you can securely host CPU/DISK for paying customers through the use of virtual instances / environments.

Problem (Admin): Inflexibility in building and maintaining similar environments on multiple physical platforms.
Solution: With OpenStack, you can easily share images which contain predefined application binaries and/or environments.

Problem (User): Finance department needs to run the payroll for tomorrow, and doesn't have the resources to do so in time!
Solution: With OpenStack this user would more easily access a larger or even external infrastructure with the required CPU / Storage environment.

Enhancing OpenStack for HPC

Full HPC Stack
- Monitoring
- Checking
- Module and Library Environment
- Scheduler (SLURM, PBS, SGE etc.)

Performance Optimisations (containers)

Typical HPC services and integration
- InfiniBand
- Parallel filesystems
- GPUs/Accelerators

Trinity: Linking Cloud and HPC
Trinity is a set of software tools for building and managing virtual HPC or OpenStack environments in a platform as a service (PaaS) solution, customized for HPC performance.
- Adds ease of management (Trinity dashboard) to HPC
- Scalable to tens of thousands of nodes
- Full hardware support (IPMI, InfiniBand, PXE)
- Provides full HPC stacks (schedulers, MPI, libraries)
- No performance loss (virtualization based on Docker)
- Allows customers to host their own private or public IaaS cloud (for general IT)
- Load balancing (HPC) partitions
- Environment customization

All of the standard HPC cluster manager requirements included
Features (columns: Bright CM, IBM Platform, Trinity)
- Node provisioning
- Health check and monitoring
- GUI and command line interfaces
- SLURM, SGE & PBS support
- Parallel shell
- Modules environment
- Compilers, debuggers & profilers
- MPI + Scientific libraries
- Containerized HPC building blocks
- Cloud Computing Ready

A traditional cluster
- Login node(s)
- Worker node(s)
- Storage node(s)

A Trinity managed cluster
Trinity Dashboard (Single management interface)
- Login node(s)
- Worker node(s)
- Storage node(s)
- Virtual Cluster A: runs the HPC stack for department A
- Virtual Cluster B: runs the HPC stack for department B
- Virtual Cluster C: runs VDI
- Virtual Cluster D: runs general IT infrastructure using OpenStack
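
The idea of carving one physical cluster into virtual clusters can be sketched in a few lines of Python. This is only an illustration of the concept, not Trinity's implementation: the `VirtualCluster` class, the `partition` function, the node names and the node counts are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualCluster:
    # Hypothetical record describing one virtual cluster.
    name: str
    purpose: str
    nodes: list = field(default_factory=list)


def partition(nodes, requests):
    """Assign physical worker nodes to virtual clusters, in request order.

    `requests` is a list of (name, purpose, node_count) tuples.
    Raises ValueError if more nodes are requested than exist.
    """
    needed = sum(count for _, _, count in requests)
    if needed > len(nodes):
        raise ValueError(f"requested {needed} nodes, only {len(nodes)} available")
    clusters, start = [], 0
    for name, purpose, count in requests:
        clusters.append(VirtualCluster(name, purpose, nodes[start:start + count]))
        start += count
    return clusters


# Eight hypothetical worker nodes split across the four virtual clusters
# named on the slide; the node counts are made up.
workers = [f"node{i:02d}" for i in range(1, 9)]
clusters = partition(workers, [
    ("A", "HPC stack for department A", 3),
    ("B", "HPC stack for department B", 2),
    ("C", "VDI", 2),
    ("D", "general IT infrastructure using OpenStack", 1),
])
for vc in clusters:
    print(vc.name, vc.purpose, vc.nodes)
```

Each node belongs to exactly one virtual cluster at a time, which is the isolation property the slide relies on.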

In summary:
- An HPC Cloud is a powerful tool for the arsenal of any industry, small or big.
- It gives both power and flexibility at many stages of product design and testing
- It can reduce cost by consolidating resources used for different purposes
- New software technologies are alleviating performance concerns
- Many vendor choices to suit every need
- Trinity is a great choice for a private cloud, providing a full cluster manager, HPC stack and cloud management for HPC, IT, Data.

Thank You!

HPC, Cloud and BigData are coming together
[Diagram: workloads (HPC, CRM, Database, VDI, Email) on top of shared services (Compute, Virtualization, Authentication, Object storage, Dashboard, Resource Management, Monitoring, Deployment) on top of Hardware Resources (nodes, network, disks etc.)]
HPC, Cloud and BigData have a lot of overlap:
- Centralised resources
- Same complex management, same complex environment
- Similar high performance storage and powerful server requirements
- Almost same physical networking
- Same controller <-> worker-node relationship

Neat tricks in Virtual Machines
[Diagram: job placement over time t on Node A, Node B, Node C and Node D, with the current time marked, shown twice]
- Reliable checkpoint restart of jobs
- Towards a 100% scheduling efficiency
- Fast track high priority users
- Move jobs within the cluster & outside the cluster
- The price: losing performance (5% -> 1%)
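
Checkpoint/restart is easiest to see at the application level. The sketch below is an assumption-laden illustration in Python, not how a hypervisor or container runtime snapshots a job: the `run_job` function, the pickle-based state file and the `stop_after` pre-emption switch are all invented for the example.

```python
import os
import pickle
import tempfile


def run_job(total_steps, checkpoint_path, stop_after=None):
    """Sum the integers 0..total_steps-1, checkpointing after every step.

    `stop_after` simulates a pre-emption: the job exits after that many
    steps in this run. Returns the final sum, or None if interrupted.
    """
    state = {"step": 0, "acc": 0}
    if os.path.exists(checkpoint_path):
        # Restart: pick up where the previous run left off.
        with open(checkpoint_path, "rb") as fh:
            state = pickle.load(fh)

    done_this_run = 0
    while state["step"] < total_steps:
        if stop_after is not None and done_this_run >= stop_after:
            return None  # pre-empted; the latest state is already on disk
        state["acc"] += state["step"]
        state["step"] += 1
        done_this_run += 1
        # Write to a temporary file first, then rename, so a crash never
        # leaves a half-written checkpoint behind.
        tmp = checkpoint_path + ".tmp"
        with open(tmp, "wb") as fh:
            pickle.dump(state, fh)
        os.replace(tmp, checkpoint_path)
    return state["acc"]


with tempfile.TemporaryDirectory() as workdir:
    ckpt = os.path.join(workdir, "job.ckpt")
    first = run_job(10, ckpt, stop_after=4)   # interrupted after 4 steps
    second = run_job(10, ckpt)                # resumes from step 4
    print(first, second)
```

The second call does not repeat the first four steps, which is what lets a scheduler move or pre-empt a job without losing the work already done.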

Why HPC in the Cloud?
What is OpenStack?
- A set of software tools for building and managing cloud computing platforms for public and private clouds.
- OpenStack is primarily deployed as an infrastructure as a service (IaaS) solution.
- OpenStack began in 2010 as a joint project of Rackspace Hosting and NASA.
- Currently, it is managed by the OpenStack Foundation.
- More than 200 companies have joined this project, including Dell, Intel and Oracle.

Challenges addressed with OpenStack

Problem (Manager): VMware and HPC environments running independently from each other.
Solution: With OpenStack, you merge these into one single efficient environment.

Problem (Manager): Resources wasted when the Virtual Desktop Infrastructure (VDI) is idle at night.
Solution: With OpenStack you can easily switch from the Virtual Desktop Infrastructure (VDI) to HPC.

Problem (Admin): Inability to collaborate with external parties due to lack of a security infrastructure for hosting CPUs/disks.
Solution: With OpenStack, you can securely host CPU/DISK for paying customers through the use of virtual instances / environments.

Problem (User Group): Inflexibility in building and maintaining similar environments on multiple physical platforms.
Solution: With OpenStack, you can easily share images which contain predefined application binaries and/or environments.

Problem (User): Has a deadline for a paper/conference tomorrow and requires vast amounts of immediate CPU cycles.
Solution: With OpenStack this user would more easily access a larger or even external infrastructure with the required CPU / Storage environment.

Using the cloud to address growing challenges in HPC

IT Manager
- Rising infrastructure & personnel costs
- Growing complexity / fragmented infrastructure
- Increasingly complicated personnel needs

System Administrator
- Growing complexity of hardware & software environment
- Managing tenants with different needs/workflows
- Managing secure access to resources
- Dealing with hardware changes

User
- Cluster environment different from workstation
- Software stack needed for workflow not pre-installed
- Non-availability of resource when needed most

Why HPC in the Cloud?
What can the Cloud bring to HPC?
HPC and Cloud also have significant differences

Cloud:
- Split bigger computing units into smaller
- Timesharing execution model
- Elastic
- Provides commodity (virtual) hardware
- Increases utilisation to 85%

HPC:
- Merge smaller computing units into a single whole
- Batch execution model
- Backfilling
- Needs specialised hardware (GPU, IB)
- Utilisation already above 85%
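
Backfilling is one reason batch systems already reach high utilisation: short jobs are slipped into idle nodes as long as they do not delay the highest-priority waiting job. The Python sketch below is a simplified illustration under stated assumptions, not the algorithm of any particular scheduler: the `backfill` function and the job tuples are hypothetical, every job is assumed to need no more nodes than the cluster has, and a job is only backfilled if it finishes before the head job's earliest possible start.

```python
def backfill(total_nodes, running, queue):
    """Return the names of queued jobs that can start right now.

    running: list of (nodes, remaining_hours) for jobs already executing.
    queue:   list of (name, nodes, walltime_hours) in priority order.
    """
    running = list(running)
    pending = list(queue)
    free = total_nodes - sum(n for n, _ in running)
    started = []

    # Start jobs in strict priority order while they fit.
    while pending and pending[0][1] <= free:
        name, nodes, walltime = pending.pop(0)
        free -= nodes
        running.append((nodes, walltime))
        started.append(name)
    if not pending:
        return started

    # The head job does not fit. Find the earliest time at which enough
    # nodes will have been released for it (its reserved start time).
    head_nodes = pending[0][1]
    available, reserved_start = free, None
    for nodes, remaining in sorted(running, key=lambda job: job[1]):
        available += nodes
        if available >= head_nodes:
            reserved_start = remaining
            break
    if reserved_start is None:
        return started  # head job can never fit; outside this sketch's scope

    # Backfill: lower-priority jobs may start now if they fit in the idle
    # nodes and are finished before the head job's reserved start.
    for name, nodes, walltime in pending[1:]:
        if nodes <= free and walltime <= reserved_start:
            free -= nodes
            started.append(name)
    return started


# Hypothetical 16-node cluster with two running jobs and four waiting jobs.
queue = [("big", 12, 10.0), ("small", 2, 1.5), ("long", 2, 6.0), ("tiny", 2, 2.0)]
print(backfill(16, [(8, 4.0), (4, 2.0)], queue))
```

In this example the 12-node job must wait four hours for nodes, so the two short 2-node jobs run in the meantime while the 6-hour job is held back because it would delay the large job.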

Recent References

Dolphin Geophysical
- Lasting Partnership
- Onshore / offshore HPC solutions & extended consulting services
- Bright, BeeGFS, Dell / Asus hardware

Volvo IT
- HPC services partner for Dell
- Main installs: Lyon (France), Gothenburg (Sweden)

Ecole Polytechnique Fédérale de Lausanne
- Framework contract
- 512+ compute nodes
- Intel Ivy Bridge / Haswell (Truescale & Servers)
- Close collaboration on application fine tuning
- xCAT2

National Supercomputer Center (Sweden)
- 640 compute nodes
- Asus 4 nodes in 2U systems
- First Haswell reference: 2640V3 (8 cores / 2.6GHz)

What is RSA and why does it fit so nicely?

What?
- End-to-end HPC management

Why?
- Avoid single point of failure
- Empower your highly qualified admins
- Use the power of scaling: lower your cost
- Reliability of service delivery

How?
- Central monitoring system
- Know instantly when something is wrong and react upon it
- Software updates
- Remote and onsite repair
- Notification and explanation for actions
- Management reporting

OpenStack Trinity
Based on open source components with custom OpenStack plugins
- xCAT: Deployment
- Docker: Virtualization and image management
- SLURM: Scheduling
- OpenMPI: Communication
- RSA/Nagios: Monitoring and Health-checks
Cookbook style recipes and easy customizations

Trinity: Why xCAT?
1. xCAT is stable, full featured, well tested and open source
2. xCAT is also usable without OpenStack
3. Supports node discovery, image management, stateless nodes, IPMI abstraction, PXE boot
4. OpenStack Ironic is still in beta
5. Bright is expensive
6. The main xCAT weakness (lousy UI) will be mitigated by standardization and adding a custom OpenStack dashboard to xCAT

Trinity: Why Docker?
Docker is an operating system level virtualization framework.
1. Used extensively by Google and other industry giants
2. Lightweight
3. Much faster than full machine virtualization
4. Good support for image management and versioning
5. Pluggable virtualization drivers (LXC, OpenVZ)
6. Pluggable storage virtualization (AUFS, devicemapper, VFS)

Current status of Trinity
Phase 1 of the project is completed.
We support the following core features:
1. Cluster partitioning (virtual clusters)
2. Multiple OS and application versions between clusters
3. Isolated (sandboxed) applications between clusters
4. Setup of OpenStack inside Trinity
Manual customization and improvisation will be required to support this in the field. An overall dashboard is absent.
POC Q3 2014: Bristol & Cambridge University.

A new approach: Container-based virtualization
Linux containers
- LXC is an operating system level virtualization method for running multiple isolated Linux systems (containers) on a single control host.
- Docker is an open-source project that automates the deployment of applications inside containers, by providing an additional layer of abstraction and automation.

Properties:
- Lightweight
- Fast provisioning
- Workload isolation
- Near bare metal performance

Adopting cloud computing is easier
- Lack of infrastructure
- Cost of equipment
- Cost of operation
- Lack of expertise
- Lack of experience

Current status of HARP

Phase 2 (Q3 2014):
1. Allow remote management (RSA inside OpenStack dashboard)
2. Allow self-service partitioning by customers (use case 5)

Phase 3 (Q4 2014):
1. Allow automated elasticity (meta-scheduling, repartition resources based on a calendar, use case 2)
2. Allow sharing of resources between HPC clusters
3. Allow self-service of jobs by customers of our customers (SaaS model)
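
Calendar-based repartitioning, as planned for Phase 3, could look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the `CALENDAR` table, the office hours, the 50% VDI share and the `repartition` function are hypothetical and are not taken from the project.

```python
from datetime import datetime

# Hypothetical calendar: (first_hour, last_hour_exclusive, share of nodes for VDI).
CALENDAR = [
    (8, 18, 0.5),  # office hours: half of the nodes serve virtual desktops
]
NIGHT_VDI_SHARE = 0.0  # outside office hours every node goes to HPC


def vdi_share(when):
    """Return the fraction of nodes reserved for VDI at the given time."""
    for start, end, share in CALENDAR:
        if start <= when.hour < end:
            return share
    return NIGHT_VDI_SHARE


def repartition(nodes, when):
    """Split the node list between VDI and HPC according to the calendar."""
    n_vdi = int(len(nodes) * vdi_share(when))
    return {"VDI": nodes[:n_vdi], "HPC": nodes[n_vdi:]}


nodes = [f"node{i:02d}" for i in range(1, 9)]
day = repartition(nodes, datetime(2014, 9, 1, 10, 0))
night = repartition(nodes, datetime(2014, 9, 1, 23, 0))
print(len(day["VDI"]), len(day["HPC"]))
print(len(night["VDI"]), len(night["HPC"]))
```

A meta-scheduler would call something like `repartition` on a timer and then drain and reassign only the nodes whose role changed; that part is left out here.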