JASMIN Cloud
ESGF and UV-CDAT Conference, 09-11 December 2014
STFC / Stephen Kill
Philip Kershaw (1, 2), Jonathan Churchill (5), Bryan Lawrence (1, 3, 4), Stephen Pascoe (1, 4) and Matt Pritchard (1)
(1) Centre for Environmental Data Archival, STFC Rutherford Appleton Laboratory, Didcot, UK
(2) National Centre for Earth Observation, NERC, UK
(3) Department of Meteorology, University of Reading, Reading, UK
(4) National Centre for Atmospheric Science, NERC, UK
(5) Scientific Computing Department, STFC Rutherford Appleton Laboratory, Didcot, UK
Introduction and Background
JASMIN: a Big Data processing and analysis facility
- Response to the Big Data challenge for climate science and EO, arising from increases in hi-res model output and production of observations
- Funded through UK government capital investment (NERC and UK Space Agency) in two phases
JASMIN I and CEMS (2012)
- Climate science and Earth System modelling focus
- Support for UK and European HPC facilities
- Successfully demonstrated hosted processing (typical factor x30 increase in performance for workloads)
- Panasas parallel file system + batch compute + VMware virtualisation
JASMIN II (completes Mar 2015)
- Expanded remit to the whole NERC environmental sciences community
- Significant expansion of compute: ~x10 to over 3000 cores
- Storage: ~x2 to 13 PB
- Network: >1000 ports @ 10Gb, 40GbE to outside world
- Cloud service: VMware vCloud, customised; 100 x 16-core servers licence
- 130-140 GB/s in IOR tests for the new storage
[Figure legend: light blue = total of all tape at STFC; green = Large Hadron Collider (LHC) Tier 1 data on tape; dark blue = data on disk in JASMIN]
JASMIN I Success: UPSCALE
UPSCALE: UK on PRACE - weather-resolving Simulations of Climate for global Environmental risk
- Ensembles of global atmospheric climate simulations at weather forecasting resolution
- Required more than 30 times the computing time available to our team on the UK supercomputer HECToR
- Successfully applied for 144 million core hours from PRACE, lasting for 1 year, on HERMIT in Germany
- Produced more than 400 TB of data over 10 months, which was shipped to JASMIN and the Met Office archives
- Deployment of VMs running custom scientific software, co-located with data
Image: P-L Vidale & R. Schiemann, NCAS
Mizielinski et al. (Geoscientific Model Development, submitted): High resolution global climate modelling; the UPSCALE project, a large simulation campaign
Does Cloud Computing fit?
"Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal management effort or service provider interaction." (NIST SP800-145)
5 essential characteristics
- On-demand self-service
- Broad network access
- Resource pooling
- Rapid elasticity
- Measured service
3 service models
- IaaS (Infrastructure as a Service)
- PaaS (Platform as a Service)
- SaaS (Software as a Service)
4 deployment models
- Private cloud
- Community cloud
- Public cloud
- Hybrid cloud
Lessons Learnt from JASMIN I and the Long Tail
JASMIN I: raw infrastructure power (data available all the time, next to the compute) but a more constrained service model
- Traditional *nix user management
- High performance compute + global file system
- Batch compute
- High level of expertise required for users
Cloud service model: rich and flexible service model but a sacrifice in performance
- Abstraction of physical resources
- Share of cloud resource: compute, network, storage
- Multi-tenancy, virtual application environments
- Tenants: Organisation A, Organisation B, Organisation C, internal services
- Long tail research
JASMIN Cloud Architecture
[Figure: architecture diagram. Recoverable labels:]
- JASMIN internal network and external network inside JASMIN, separated by firewalls; JASMIN Cloud Management Interfaces
- Managed Cloud (PaaS, SaaS): Project1-org tenancy behind Firewall + NAT; JASMIN Platform VMs; Appliance Catalogue; direct file system access to Panasas storage; direct access to the Lotus batch processing cluster
- Unmanaged Cloud (IaaS, PaaS, SaaS): optirad-org and eos-cloud-org tenancies, each behind Firewall + NAT with an Appliance Catalogue; IPython JupyterHub VM, File Server VM, IPython slave VMs, CloudBioLinux VM, CloudBioLinux fat node; standard remote access protocols (ftp, http)
- Example appliances: IPython Notebook VM with access to the cluster through IPython.parallel; JASMIN Platform VM; CloudBioLinux Desktop with dynamic RAM boost
Hierarchical Organisation and Governance Structure
[Figure: governance hierarchy. Recoverable labels:]
- JASMIN HPC Committee
- Consortia (Consortium 1, Consortium 2), split by research domains
- Projects (Project 1, Project 2, Project 3) within consortia
- JVOs (JVO 1, JVO 2) within projects
- JVO Admin: technical administration
JVO = JASMIN Virtual Organisation
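The hierarchy on this slide can be pictured as a small tree. The following is a minimal sketch only, assuming a consortium holds projects and a project holds JVOs; the class names, field names and example data are hypothetical and do not describe the actual JASMIN tooling.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class JVO:
    """A JASMIN Virtual Organisation, i.e. one cloud tenancy (illustrative)."""
    name: str
    admins: List[str] = field(default_factory=list)


@dataclass
class Project:
    name: str
    jvos: List[JVO] = field(default_factory=list)


@dataclass
class Consortium:
    """Groups the projects of one research domain."""
    name: str
    projects: List[Project] = field(default_factory=list)

    def all_jvos(self) -> List[JVO]:
        # Flatten the project level to list every tenancy in the consortium.
        return [jvo for project in self.projects for jvo in project.jvos]


# Hypothetical example data, not taken from the slides.
consortium = Consortium(
    name="consortium-1",
    projects=[
        Project("project-1", [JVO("jvo-1", ["alice"]), JVO("jvo-2", ["bob"])]),
        Project("project-2", [JVO("jvo-3", ["carol"])]),
    ],
)
jvo_names = [jvo.name for jvo in consortium.all_jvos()]
print(jvo_names)
```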
Manage Complexity with Documented Processes
Analysis: the meeting of requirements with the application of the available technology
- Defined a set of processes
- Each is documented
- A combination of manual and automated steps, with fallback to manual steps where needed
Project negotiation
- Match proposal with a JASMIN Consortium
- Proposal sponsor negotiates with consortium manager(s): amount and type of resources; number of Virtual Orgs; compute, network and storage quotas
JVO (JASMIN Virtual Organisation) onboarding
- Create a virtual organisation tenancy
- Register responsible administrator(s)
- Induction + training
Management of a catalogue of appliance templates
- Create new appliance
- Update appliance
- Propagation to organisations
- Control for Managed and Unmanaged environments, stem cells
Networking
- Manage quota of public IPs for tenancy
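The negotiation and onboarding steps above can be sketched as a simple quota ledger. This is an illustration under stated assumptions, not the JASMIN implementation: the `Quota` and `ConsortiumAllocation` names, the units, and the rule that a request is refused when it exceeds what remains are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Quota:
    """Resources negotiated for a tenancy (units are illustrative)."""
    cpu_cores: int
    storage_tb: int
    public_ips: int

    def fits_within(self, limit: "Quota") -> bool:
        return (self.cpu_cores <= limit.cpu_cores
                and self.storage_tb <= limit.storage_tb
                and self.public_ips <= limit.public_ips)

    def minus(self, other: "Quota") -> "Quota":
        return Quota(self.cpu_cores - other.cpu_cores,
                     self.storage_tb - other.storage_tb,
                     self.public_ips - other.public_ips)


class ConsortiumAllocation:
    """Tracks what a consortium has left to hand out to its JVOs."""

    def __init__(self, total: Quota) -> None:
        self.remaining = total
        self.tenancies: Dict[str, Quota] = {}
        self.admins: Dict[str, List[str]] = {}

    def onboard(self, jvo: str, request: Quota, admins: List[str]) -> bool:
        # Onboarding registers at least one responsible administrator.
        if not admins:
            raise ValueError("a JVO needs at least one responsible administrator")
        # Refuse duplicates and requests that exceed the remaining allocation.
        if jvo in self.tenancies or not request.fits_within(self.remaining):
            return False
        self.remaining = self.remaining.minus(request)
        self.tenancies[jvo] = request
        self.admins[jvo] = list(admins)
        return True


consortium = ConsortiumAllocation(Quota(cpu_cores=64, storage_tb=100, public_ips=4))
first = consortium.onboard("jvo-a", Quota(32, 40, 2), ["alice"])
second = consortium.onboard("jvo-b", Quota(48, 10, 1), ["bob"])  # asks for more cores than remain
print(first, second, consortium.remaining)
```

A refused request here simply returns False; in the documented process that would be the point where a manual step (renegotiation with the consortium manager) takes over.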
Wrapping vendor-specific Cloud APIs
[Figure: JASMIN Cloud portal and web service interface on top of the JASMIN Cloud management interface, which uses Apache libcloud drivers: vCloud driver (implements VMware's vCloud API), OpenStack driver, AWS driver, other drivers. The vCloud driver is missing vCloud-specific features; we are extending libcloud to support additional features.]
VMware vCloud Director was selected as the private cloud implementation:
- For its maturity vs. Open Source alternatives at the time of JASMIN I
- Capital funding favoured it over the increased recurrent costs of Open Source
Cloud broker technologies provide an insulation layer from cloud provider-specific APIs
- Apache libcloud (Python) selected
This allows
- Future use of an alternative private cloud implementation, e.g. OpenStack
- Implementation of cloudbursting: augment the private cloud with access to public cloud providers via APIs supported in libcloud
It's not a panacea: libcloud provides only a common subset of the features of each cloud provider supported
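The broker idea on this slide can be illustrated without any cloud account. The sketch below is not libcloud's API: `CloudDriver`, `FakeDriver` and `create_with_burst` are hypothetical names, and the in-memory drivers stand in for real provider drivers. It shows the two points made above: callers see only a common subset of operations, and cloudbursting falls back to a public provider when the private cloud is full.

```python
from abc import ABC, abstractmethod
from typing import List


class CapacityError(Exception):
    """Raised when a provider has no room for another node."""


class CloudDriver(ABC):
    """Common subset of operations every provider driver must offer."""

    @abstractmethod
    def create_node(self, name: str, image: str) -> str: ...

    @abstractmethod
    def list_nodes(self) -> List[str]: ...


class FakeDriver(CloudDriver):
    """In-memory stand-in for a provider-specific driver (assumption)."""

    def __init__(self, provider: str, capacity: int) -> None:
        self.provider = provider
        self.capacity = capacity
        self._nodes: List[str] = []

    def create_node(self, name: str, image: str) -> str:
        if len(self._nodes) >= self.capacity:
            raise CapacityError(self.provider)
        node_id = f"{self.provider}:{name}"
        self._nodes.append(node_id)
        return node_id

    def list_nodes(self) -> List[str]:
        return list(self._nodes)


def create_with_burst(private: CloudDriver, public: CloudDriver,
                      name: str, image: str) -> str:
    """Prefer the private cloud; burst to the public one when it is full."""
    try:
        return private.create_node(name, image)
    except CapacityError:
        return public.create_node(name, image)


private = FakeDriver("private", capacity=1)
public = FakeDriver("public", capacity=10)
placed = [create_with_burst(private, public, f"vm{i}", "analysis-image")
          for i in range(3)]
print(placed)
```

Anything a provider offers beyond `create_node` and `list_nodes` is invisible through this interface, which is the trade-off the slide describes as "not a panacea".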
JASMIN Analysis Platform (JAP)
- Multi-node infrastructure requires a way to install tools quickly and consistently
- The community needs a consistent platform wherever they need it
- Users need help migrating analysis to JASMIN
- JAP provides RPMs and pre-built images based on CentOS
http://proj.badc.rl.ac.uk/cedaservices/wiki/jasmin/analysisplatform
Community Intercomparison Suite (CIS)
A command line interface and Python APIs for
- Plots: time-series, line plots, scatter plots, global plots, overlay plots, histograms, curtain plots
- Co-location
Dataset                 Format
AERONET                 Text
MODIS                   HDF
CALIOP                  HDF
CloudSAT                HDF
AMSRE                   HDF
TRMM                    HDF
CCI aerosol & cloud     NetCDF
SEVIRI                  NetCDF
Flight campaign data    RAF
Models                  NetCDF
Co-location example
- Model gives global output every 3 hours for a full month
- Observations are day-time site measurements, every 15 min for a full month
[Figure labels: Source, Sampling, Collocation]
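The co-location example above pairs 15-minute observations with 3-hourly model output. The following is a minimal sketch of one possible approach, nearest neighbour in time, using only the standard library; it is not the CIS interface, and the function name, the synthetic values and the 09:00-17:00 observation window are assumptions made for illustration.

```python
from bisect import bisect_left
from typing import List, Tuple


def nearest_index(sorted_times: List[int], t: int) -> int:
    """Index of the entry in sorted_times closest to t (ties go to the earlier one)."""
    i = bisect_left(sorted_times, t)
    if i == 0:
        return 0
    if i == len(sorted_times):
        return len(sorted_times) - 1
    before, after = sorted_times[i - 1], sorted_times[i]
    return i if (after - t) < (t - before) else i - 1


# Times are minutes since midnight for a single day (assumption).
model_times = list(range(0, 24 * 60, 180))                   # every 3 hours
model_values = [float(k) for k in range(len(model_times))]   # synthetic model output
obs_times = list(range(9 * 60, 17 * 60, 15))                 # day-time, every 15 min

# For each observation time, sample the model at its nearest time step.
collocated: List[Tuple[int, float]] = [
    (t, model_values[nearest_index(model_times, t)]) for t in obs_times
]
print(len(collocated), collocated[0], collocated[-1])
```

Other choices, such as linear interpolation between the two bracketing model steps, would slot into the same loop in place of `nearest_index`.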
Early Results for JASMIN II
- Development and rollout a huge challenge
- Manage complexity with well documented processes
User community
- Model emerging of PaaS (Managed Cloud) + IaaS (Unmanaged Cloud)
- PaaS with access to HPC and global file system
- IaaS for application and dissemination of outputs to the wider community
- Fellow tenants helping each other out and developing common solutions: IPython Notebook with JupyterHub; NERC Environmental Workbench; Docker
JASMIN Analysis Platform
- Try out a JASMIN-like environment in VirtualBox
Further information
- JASMIN website: http://jasmin.ac.uk/
- Centre for Environmental Data Archival: http://www.ceda.ac.uk
- JASMIN paper (Sept 2013): http://home.badc.rl.ac.uk/lawrence/static/2013/10/14/LawEA13_Jasmin.pdf
- Follow-up paper on cloud work planned
- JASMIN Analysis Platform: http://proj.badc.rl.ac.uk/cedaservices/wiki/jasmin/analysisplatform
- Apache libcloud: https://libcloud.apache.org