Scientific Cloud Computing Infrastructure for Europe Strategic Plan Bob Jones, IT department, CERN
Origin of the initiative Conceived by ESA as a prospective for providing cloud services to space sector in Europe Presented to the IT working group of the EIROforum where other members (CERN, EMBL) joined Two workshops held during 2011 June: hosted by ESA in Frascati October: hosted by EMBL in Heidelberg EIROforum: CERN, EFDA JET, EMBL, ESA, ESO, ESRF, European XFEL, ILL December 2011 Bob Jones, CERN 2
Establish a sustainable multi tenant cloud computing infrastructure in Europe Initially based on the needs for the European Research Area & space agencies Based on commercial services from multiple IT industry providers Which adhere to internationally recognised policies and quality standards to be adopted by a governance structure involving all stakeholders http://cdsweb.cern.ch/record/1374172/files/cern OPEN 2011 036.pdf December 2011 Bob Jones, CERN 3
Objectives of the initiative Set up a cloud computing infrastructure for European Research Area Identify and adopt policies for trust, security and privacy on a European level Create a light weight governance structure involving all stakeholders Define a short and medium term funding scheme December 2011 Bob Jones, CERN 4
Supply side companies: Atos Origin, BT Services, Cap Gemini, CloudSigma, Interoute, Logica, Orange, SAP, Terradue, The ServerLabs, T Systems, etc. December 2011 Bob Jones, CERN 5
Timeline Set up (2011) Pilot phase (2012 2014) Full scale cloud service market (2014 ) Select flagships use cases, identify service providers, define governance model Deploy flagships, Analysis of functionality, performance & financial model More applications, More services, More users, More service providers December 2011 Bob Jones, CERN 6
Draft Governance Model for Pilot Phase Consortium Board 3 users organisations & 3 supplier companies Currently chaired by ESA Management Team Service Providers Board Users Board December 2011 Bob Jones, CERN 7
Flagship use cases Proposed by demand side user organisations addressing scientific challenges with societal impact High profile applications that catch the public imagination and encourage others to use the services Show need for significant scale of resources, federation/aggregation of data sets, long term archiving and on demand processing Must be sponsored by user organisations that have a long term objective of outsourcing a portion of their computing requirements Must be prepared to contribute their own resources during the pilot phase to port application and contribute to the cost of procuring required services from the supply side Must participate in a costing exercise where the total cost of deploying and operating the flagship application in house can be compared to the cost of procuring the services via the cloud computing initiative December 2011 Bob Jones, CERN 8
Initial flagships use cases Call for proposals Proposals received in format following template agreed by demand and supply side Eligibility review of collected proposals (user side) resulted in 3 recommended flagships CERN: ATLAS High Energy Physics Cloud Use EMBL: Genomic Assembly in the Cloud ESA / CNES / DLR: SuperSites Exchange Platform Flagships sent to supply side for analysis 9
CERN-ATLAS Flagship Use Case In collaboration with ATLAS largest user of disk and CPU runs most jobs highest data transfer rates ATLAS have an internal activity in collaboration with IT to investigate cloud use Goals: Evaluate cloud technologies for ATLAS use cases Design model to integrate cloud resources with ATLAS distributed computing Implementation in ATLAS software framework CERN IT Department CH-1211 Genève 23 Switzerland www.cern.ch/it Ian Bird, CERN Flagship
ATLAS Computing Model Tiers CERN IT Department CH-1211 Genève 23 Switzerland www.cern.ch/it ATLAS Data Management - I. Ueda - APRWS 2011.03.14. 3
ATLAS workflow manager CERN IT Department CH-1211 Genève 23 Switzerland www.cern.ch/it CERN Flagship
Aspects to investigate Simulations (~no input) with stage out to: Traditional grid storage vs Long term cloud storage Data processing (== Tier 1 ) This implies large scale data import and export to/from the cloud resource Distributed analysis (== Tier 2 ) Data accessed remotely (located at grid sites), or Data located at the cloud resource (or another?) Bursting for urgent tasks Centrally managed: urgent processing Regionally managed: urgent local analysis needs CERN IT Department CH-1211 Genève 23 Switzerland www.cern.ch/it CERN Flagship
Potential mechanism CERN IT Department CH-1211 Genève 23 Switzerland www.cern.ch/it Ian Bird, CERN Flagship
AAA WLCG today has a worldwide trust federation For authentication based on X509 For authorization based on X509 extensions How would this map onto cloud resources? We need to ensure data has enforceable (but straightforward) access controls We need accounting of resource usage, and potentially traceability If CERN buys a cloud resource how do we share that between different groups? Mechanism needed to allocate (fair-share) between user groups? CERN IT Department CH-1211 Genève 23 Switzerland www.cern.ch/it Ian Bird, CERN Flagship
Immediate and longer term goals Determine costs of commercial cloud resources from various sources Compute resources Network transfers into and out of cloud Short and long term data storage in the cloud Develop understanding of appropriate SLA s How can they be broadly applicable to LHC or HEP Understand policy and legal constraints; e.g. in moving scientific data to commercial resources Performance and reliability compared to WLCG baseline Use of standards (interfaces, etc.) & interoperability between providers Can CERN transparently offload work to a cloud resource Which type of work makes sense? Long term: Can we use commercial services as a significant fraction of overall resources available to CERN experiments? At which point is it economic/practical to rely on 3 rd party providers? CERN IT Department CH-1211 Genève 23 Switzerland www.cern.ch/it Ian Bird, CERN Flagship
December 2011 Bob Jones, CERN Henri Laur, ESA 17
December 2011 Bob Jones, CERN Henri Laur, ESA 18
December 2011 Bob Jones, Henri CERNLaur, ESA 19
December 2011 Bob Jones, CERN Henri Laur, ESA 20
December 2011 Bob Jones, CERN Henri Laur, ESA 21
December 2011 Bob Jones, CERN Henri Laur, ESA 22
December 2011 Bob Jones, CERN Henri Laur, ESA 23
December 2011 Bob Jones, CERN 24
December 2011 Bob Jones, CERN Jonathon Blake, EMBL 25
December 2011 Bob Jones, CERN Jonathon Blake, EMBL 26
December 2011 Bob Jones, CERN Jonathon Blake, EMBL 27
December 2011 Bob Jones, CERN Jonathon Blake, EMBL 28
December 2011 Bob Jones, CERN Jonathon Blake, EMBL 29
December 2011 Bob Jones, CERN Jonathon Blake, EMBL 30
December 2011 Bob Jones, CERN Jonathon Blake, EMBL 31
December 2011 Bob Jones, CERN Jonathon Blake, EMBL 32
Flagship use cases ATLAS H.E.P. Cloud Use (CERN) Genomic Assembly in the Cloud (EMBL) SuperSites Exchange Platform (ESA) Scientific goal/society impact/photogenic Scale of resources used Federation/Aggregation of datasets Long term archiving of data On demand processing Impact on community & benefits Potential increase of users Interoperability Data security Maturity Access to license controlled sw Each flagship sponsor provides ~1 2 FTE for porting to cloud infrastructure and ~ 100K as contribution to resources consumed December 2011 Bob Jones, CERN 33
Helix Nebula EC project proposal Coordination action submitted to INFRA 2012 3.3 in November 2011 Requested total EC funding 2M no. Organisation name Short name Country 1 (coord) European Organization for Nuclear Research CERN CH 2 STICHTING EUROPEAN GRID INITIATIVE EGI.eu NE 3 European Molecular Biology Laboratory EMBL DE 4 ATOS ORIGIN NEDERLAND Atos NE 5 T Systems International GMBH T Systems DE 6 CLOUDSIGMA AG CloudSigma CH 7 SAP AG SAP DE 8 Logica Deutschland GmbH & Co KG Logica DE 9 CONSIGLIO NAZIONALE DELLE RICERCHE CNR IT 10 Cloud Security Alliance EMEA CSA UK December 2011 Bob Jones, CERN 34
Helix Nebula proposal To interact with GEANT, EGI & PRACE WP5: Flagship deployment (Logica) WP3: Representation of requirements Resource (CloudSigma) requirements WP4: Cloud platform & provisioning (Atos) WP6: Inter-operability with e-infrastructures (EGI.eu) WP7: Business models (SAP) WP8: Governance models (T-Systems) WP9: Evaluation, roadmap & development plan (EMBL) WP1: Management (CERN) WP2: Dissemination/Outreach (CSA) December 2011 Bob Jones, CERN 35
Addressing actions of the Digital Agenda for Europe Supporting the single digital market: building such a European Infrastructure will help create a single digital market for cloud computing Enhancing interoperability and standards: the European Cloud Computing Infrastructure will permit geographically dispersed and separately managed devices, applications and services to interact seamlessly Stimulating research and innovation: implementing this strategic plan will generate more private investment for IT research to develop a new generation of applications and services as well as reinforce the coordination and pooling of resources Improving trust and security: it will also provide a coordinated European approach to security for cloud computing and adhere to rules on data protection December 2011 Bob Jones, CERN 36
Summary The objective of this initiative is to establish a sustainable cloud computing infrastructure for the European Research Area based on commercially provided services It is a collaborative initiative bringing together all the stakeholders to establish a public private partnership Interoperability with existing e infrastructures is a goal of the initiative It has commitments from the IT industry and user organisations to get underway in 2012 3 initial flagship use cases identified Framework collaboration EC proposal submitted in Nov 11 December 2011 Bob Jones, CERN 37