CERN Cloud Architecture
Belmiro Moreira
belmiro.moreira@cern.ch
@belmiromoreira
Geneva, July 16th, 2015
CERN Cloud Architecture - Overview
HAProxy, Nova, Horizon, Glance, Keystone, Heat, Cinder, Ceilometer
CERN Cloud - Deployment
Configuration infrastructure based on Puppet
Community Puppet modules for OpenStack
Puppet OpenStack composition layer developed by CERN
Operating systems: Scientific Linux CERN 6, CERN CentOS 7, Windows 2012 R2
RDO community packages for Linux
CERN Cloud - Deployment
A Foreman hostgroup defines a node's role:
cloud_compute/level0/api/cell00/nova
cloud_compute/level0/api/cell00/ec2
cloud_compute/level1/gva_shared_001/rabbitmq
cloud_compute/level2/kvm/gva_shared_001
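A hostgroup path like the ones above can be read as a structured role. A minimal sketch, assuming a simple `tree/level/.../service` layout; the function and field names are hypothetical, not part of Foreman or CERN's actual tooling:

```python
# Hypothetical sketch: splitting a Foreman hostgroup path into role components.
# Field names are illustrative, not part of Foreman or CERN's tooling.

def parse_hostgroup(hostgroup: str) -> dict:
    """Break 'cloud_compute/level1/gva_shared_001/rabbitmq' into parts."""
    parts = hostgroup.split("/")
    return {
        "tree": parts[0],       # e.g. 'cloud_compute'
        "level": parts[1],      # e.g. 'level1'
        "groups": parts[2:-1],  # intermediate grouping, e.g. cell name
        "service": parts[-1],   # the role the node plays, e.g. 'rabbitmq'
    }

role = parse_hostgroup("cloud_compute/level1/gva_shared_001/rabbitmq")
print(role["service"])  # rabbitmq
```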
CERN Cloud Architecture (over time)
Controller: Keystone, Horizon, Glance, Nova
Compute node: Nova-compute
CERN Cloud Architecture (over time)
Top Cell controller: Keystone, Glance, Horizon, Nova
Compute Cell controllers (Nova), each with compute nodes running Nova-compute
CERN Cloud Architecture
Top Cell controller: Keystone, Glance, Horizon, Nova, Cinder, Ceilometer, Heat
Compute Cell controllers (Nova), each with compute nodes running Nova-compute
CERN Cloud Architecture - numbers
20 cells
~4800 compute nodes
Hyper-V: ~150 compute nodes
>12000 VMs
Nova Controllers
Nodes that run the OpenStack services that manage the cloud (scheduling, managing volumes, managing images, ...)
Two types of controllers:
Top cell controllers
Child cell controllers
Top Cell - Controllers
OpenStack services don't share the same host
Small and medium-sized VMs
Freed some previously idle resources
Scaled out based on each service's needs
Services updated independently
A subset of services still runs on physical machines
Child Cell - Controllers
One physical node per cell runs the Nova services and the RabbitMQ message broker
Moving to a distributed cell architecture
Availability zones span different cells
If a cell goes down, new VMs are spawned in other cells
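The failover behaviour described above can be sketched as follows; this is an illustration of the idea, not Nova code, and all names are made up:

```python
# Sketch: with availability zones spanning cells, a spawn that fails in one
# cell is retried in another cell of the same AZ. Illustrative only.

def spawn_with_failover(cells, az, spawn):
    """Try every healthy cell mapped to this AZ until one accepts the VM."""
    for cell in cells:
        if az not in cell["azs"] or not cell["up"]:
            continue  # wrong AZ, or the cell is down
        try:
            return spawn(cell["name"])
        except RuntimeError:
            continue  # spawn failed in this cell; try the next one
    raise RuntimeError(f"no cell in AZ {az!r} could spawn the VM")

cells = [
    {"name": "gva_shared_001", "azs": ["gva"], "up": False},  # cell is down
    {"name": "gva_shared_002", "azs": ["gva"], "up": True},
]
vm = spawn_with_failover(cells, "gva", lambda cell: f"vm-on-{cell}")
print(vm)  # vm-on-gva_shared_002
```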
Nova Cells
What are cells?
Groups of hosts configured as a tree
The top-level cell runs the API services
Child cells run the compute nodes
Every cell has its own control plane
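The tree described above can be sketched with a small data structure; these are illustrative classes, not Nova's actual code:

```python
# Minimal sketch of the cells topology: a top-level cell running the API
# services, with child cells that own compute nodes and their own control
# plane. Class and attribute names are illustrative.

class Cell:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.compute_nodes = []   # only child cells have compute nodes
        if parent is not None:
            parent.children.append(self)

top = Cell("top")                        # runs the API services
c1 = Cell("gva_shared_001", parent=top)  # child cell, own control plane
c2 = Cell("gva_shared_002", parent=top)
c1.compute_nodes = ["hv001", "hv002"]
```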
Nova Cells
Why cells?
Scale compute capacity by adding new cells
Infrastructure resilience
Used by large deployments (Rackspace, eBay/PayPal, GoDaddy, Walmart)
Becoming the default configuration (CellsV2)
Nova Cells
Cells at CERN
No regions! Single entry point for users
Datacentre mapping based on availability zones (AZs)
Architecture hidden from the user
Easier to manage different hardware types
Nova Scheduler
Two schedulers:
Cell scheduler (top cell level): decides the cell
Node scheduler (compute cells): decides the hypervisor
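A rough sketch of that two-stage flow, assuming a toy capacity model based on free RAM only; real Nova uses filters and weighers, and all names here are illustrative:

```python
# Toy two-stage scheduler: the cell scheduler picks a cell, then that cell's
# node scheduler picks a hypervisor. This sketch just greedily picks the
# most free capacity at each stage.

def schedule(cells, ram_mb):
    # Stage 1 (top cell): pick the cell with the most free RAM overall.
    cell = max(cells, key=lambda c: sum(h["free_ram"] for h in c["hosts"]))
    # Stage 2 (inside the cell): pick the hypervisor with the most free RAM.
    host = max(cell["hosts"], key=lambda h: h["free_ram"])
    host["free_ram"] -= ram_mb
    return cell["name"], host["name"]

cells = [
    {"name": "cell01", "hosts": [{"name": "hv1", "free_ram": 8192}]},
    {"name": "cell02", "hosts": [{"name": "hv2", "free_ram": 16384}]},
]
print(schedule(cells, 2048))  # ('cell02', 'hv2')
```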
Nova Scheduler - Cells
Possible to associate projects with cells
Chooses the most appropriate cell based on free resources
By default not datacentre-aware; improved to make it datacentre-aware:
Cells mapped to a datacentre
Possible to specify a datacentre
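The datacentre-aware selection described above might look like this; the field names and the project whitelist are assumptions for illustration, not CERN's implementation:

```python
# Sketch of datacentre-aware cell selection: cells carry a datacentre tag and
# an optional project whitelist; eligible cells are ranked by free resources.
# All field names are illustrative.

def eligible_cells(cells, project=None, datacentre=None):
    chosen = []
    for cell in cells:
        if datacentre is not None and cell["datacentre"] != datacentre:
            continue  # the request pinned another datacentre
        if cell.get("projects") and project not in cell["projects"]:
            continue  # cell reserved for specific projects
        chosen.append(cell)
    # most free resources first, as the cell scheduler prefers
    return sorted(chosen, key=lambda c: c["free_ram"], reverse=True)

cells = [
    {"name": "gva01", "datacentre": "geneva", "free_ram": 512},
    {"name": "wig01", "datacentre": "wigner", "free_ram": 2048},
    {"name": "gva02", "datacentre": "geneva", "free_ram": 1024,
     "projects": {"atlas"}},
]
names = [c["name"] for c in eligible_cells(cells, datacentre="geneva")]
print(names)  # ['gva01']
```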
Nova Scheduler - Compute
Time to schedule VMs increases with the number of nodes
Performance benefits from small cells (~200 nodes)
Schedulers update the database only once the target hypervisor is decided
With many simultaneous requests and multiple schedulers, this causes failures due to race conditions
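The race can be illustrated with a toy claim that re-checks capacity at commit time; this is an illustration of the problem, not Nova's actual fix:

```python
# Sketch of the race: two schedulers both read enough free RAM, both pick the
# same host, and only re-check when committing the decision. The second claim
# fails and that request must be rescheduled. Illustrative, not Nova code.

class Host:
    def __init__(self, free_ram):
        self.free_ram = free_ram

def claim(host, ram_mb):
    """Conditional update: re-verify capacity when writing the decision."""
    if host.free_ram < ram_mb:
        return False  # another scheduler won the race; reschedule
    host.free_ram -= ram_mb
    return True

host = Host(free_ram=4096)
# Two schedulers made the same decision from the same stale view:
first = claim(host, 3072)   # first commit wins
second = claim(host, 3072)  # loser must retry on another host
```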
Nova Conductor
Proxy to access the DB
Not mandatory to deploy
DB credentials only on the Nova controllers
Small number of connections to the DB
Upgrades by stages
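The conductor pattern can be sketched as follows: compute nodes call the conductor instead of touching the database, so credentials stay on the controllers. These are illustrative classes, not Nova's actual API; in Nova the call goes over RPC via the message broker:

```python
# Sketch of nova-conductor as a DB proxy: the compute node holds a reference
# to the conductor (in Nova this would be an RPC call) and never a database
# handle. Names are illustrative.

class Conductor:
    def __init__(self, db):
        self._db = db  # only the conductor holds the DB "credentials"

    def instance_update(self, uuid, **values):
        self._db.setdefault(uuid, {}).update(values)
        return dict(self._db[uuid])

class ComputeNode:
    def __init__(self, conductor):
        self._conductor = conductor  # no direct DB access here

    def set_state(self, uuid, state):
        return self._conductor.instance_update(uuid, state=state)

db = {}
compute = ComputeNode(Conductor(db))
print(compute.set_state("vm-1", "ACTIVE"))  # {'state': 'ACTIVE'}
```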
Conclusions
OpenStack scales and meets our needs for LHC Run 2
Working with upstream on new features and bug fixes
The cells deployment lets us scale out easily by adding new groups of nodes
Open questions: How to refactor legacy cells? How to scale the control plane up/down?