

Version   Date               Description of Revision
0.1       21 December 2011   Draft for internal review
0.2       4 January 2012     Draft for client review
0.3       24 January 2012    Draft incorporating clients' and Lee Gillam's comments for internal review
1.0       27 January 2012    Final version for issue to EPSRC and JISC for comment
1.1       14 February 2012   Final version incorporating minor editorial changes and feedback on the NeuroCloud case study
1.2       22 February 2012   Minor changes to e-Science Central and flood modelling case study

Executive summary

1 Curtis+Cartwright Consulting, with support from Dr Lee Gillam of the University of Surrey, undertook this Cost analysis of cloud computing for research on behalf of the Engineering and Physical Sciences Research Council (EPSRC) and JISC. This document is the final report.

2 It is broadly accepted that cloud computing is potentially a useful resource for researchers. In particular, there is the attraction of gaining access to significant resources on demand. In the future, access to cloud computing may come to be seen as part of the computational resources of any world-class research institution. However, this is a new area, and the costs and performance of cloud computing, and comparisons with existing institutional provision, are difficult to obtain. Useful advice and guidance specific to the research domain is therefore required.

3 The objective of this report is to help institutions, researchers and reviewers of grant applications get to grips with the use of the cloud for research by providing advice and guidance on the financial implications of cloud computing for research, drawing on real examples where possible. It focuses on public Infrastructure-as-a-Service clouds, where there is the most available evidence and which are likely to be of most interest for research computing at present.

Caveat

4 The dynamic and competitive nature of the cloud marketplace, and the effect of such competition on the design, price and performance of traditional research infrastructures, mean that the price information and comparisons made in this report are likely to be outdated in the medium term. Vendors are constantly innovating, and new charging models are occasionally introduced that significantly affect the economics.

Public cloud and institutional data centre comparison

5 Research computing in an institution is a complicated mix of funding, policy, process and academic needs and freedoms.
It can involve decisions about charging models, cross-subsidies and carbon-reduction policies, and the availability of academic discounts, all of which make it difficult to understand the costs of research computing, whether using the cloud or the institutional data centre.

6 Given the maturing and rapidly changing nature of cloud computing for research, and the wide range of demands placed upon a research infrastructure by different research tasks, there is no simple answer to what research computing might cost using the cloud, especially in comparison with institutional facilities. There are various scales of use of cloud systems that could be considered, from relatively ad hoc one-off tasks, through cloudbursting for a number of tasks when there is insufficient local capacity or to smooth the lumpiness of institutional investments, all the way through to remotely hosted (outsourced) data centres being used to replace institutional infrastructure. These scales of use will have different cost implications.

CC497D002-1.2

7 It is a common view that cloud computing will save money. This report suggests that this is far from clear for research computing. On a pure price comparison, the more powerful cloud computing instances, rented on an hourly basis, appear to be one-and-a-half to two times more expensive per core-hour than well-managed, locally-provided clusters in modern data centres operating at high utilisation levels. However, other purchasing models (such as Reserved Instances) can reduce the costs to parity or better (paragraph 9).

8 However, it is perhaps more accurate to say that the cost ranges of public cloud providers and institutions overlap, and that under some circumstances cloud will offer a better solution while for others a traditional data centre will remain more appropriate. Importantly, this comparison takes no account of the performance implications of the different infrastructures for specific research tasks. The degree of overlap will be different for each institution.

9 Longer rental periods and so-called reserved instances, where an up-front fee is paid to reduce the per-hour cost, provide a more economical approach to buying cloud capacity. With a 3-year rental/reservation, costs are at parity with the very cheapest local provision, assuming similar utilisation levels. Examples of research computing using reserved instances are currently sparse.

10 Moreover, the impression received from the cloud pilots and other researchers working in this area is that, far from merely tolerating a lower level of performance in exchange for reduced costs or greater flexibility, most of the researchers currently engaged in cloud computing are doing so in order to get better performance or a new capability. Therefore, it is possible that increased use of cloud computing will lead to more and better science but with an associated increase in cost.
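The arithmetic behind paragraphs 7 and 9 can be sketched as follows. This is an illustrative Python sketch only, not the cost model used in this report: the function names and every figure are hypothetical, and it assumes a reserved instance is charged as a one-off up-front fee plus a reduced hourly rate for the hours actually used.

```python
def on_demand_cost_per_core_hour(hourly_price: float, cores: int) -> float:
    """Cost per core-hour of an instance rented by the hour."""
    return hourly_price / cores


def reserved_cost_per_core_hour(upfront_fee: float, reserved_hourly_price: float,
                                cores: int, term_hours: float,
                                utilisation: float) -> float:
    """Effective cost per core-hour of a reserved instance.

    Assumption: the up-front fee is spread over the hours actually used
    during the reservation term; unused hours incur no hourly charge.
    """
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in the range (0, 1]")
    used_hours = term_hours * utilisation
    total_cost = upfront_fee + reserved_hourly_price * used_hours
    return total_cost / (used_hours * cores)


def break_even_utilisation(upfront_fee: float, on_demand_hourly_price: float,
                           reserved_hourly_price: float,
                           term_hours: float) -> float:
    """Utilisation above which the reservation is cheaper than on-demand.

    A result greater than 1 means the reservation never pays off.
    """
    saving_per_hour = on_demand_hourly_price - reserved_hourly_price
    if saving_per_hour <= 0:
        raise ValueError("reserved hourly price must be below on-demand price")
    return upfront_fee / (term_hours * saving_per_hour)


if __name__ == "__main__":
    # Hypothetical figures, chosen only to exercise the functions.
    print(on_demand_cost_per_core_hour(2.0, 8))
    print(reserved_cost_per_core_hour(1000.0, 0.5, 8, 1000.0, 0.5))
    print(break_even_utilisation(1000.0, 2.5, 0.5, 1000.0))
```

Because the up-front fee is fixed, the effective reserved price rises as utilisation falls, which is why the comparison in paragraph 9 is qualified by "assuming similar utilisation levels".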
11 If the procurement of new or additional research computing capability is being considered, especially at the research group scale, then cloud computing merits being considered as an option alongside traditional hardware procurement.

How to come to a decision

12 Researchers, research managers and institutions must perform their own price assessment of cloud services, tailored to the specifics of their research tasks and circumstances. It is necessary to benchmark specific codes and research tasks to establish how different cloud offerings will perform against any available or proposed institutional infrastructure, and therefore to assess the relative costs.

13 Assessments will need to be revisited regularly as cloud offerings and prices change, when opportunities arise to make local facilities more efficient, and as research computing requirements change. Sections 3, 4, and 5 of this report provide specific information that will help institutions make these assessments.

14 Any comparison must also take into account the potential advantages that cloud computing can offer in terms of the amount of computing that can be brought to bear on the task, and hence the time to completion for a specific piece of research. This is because the nature and scale of cloud computing can accommodate tasks with large but transient loads. These advantages will be most apparent when a large amount of computing power is required for a short time and/or when the local facilities lack capacity or availability.
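The point in paragraph 14 about time to completion can be illustrated with a short Python sketch. It is a simplification under a strong assumption, namely that the task scales perfectly across cores, which real codes rarely do (hence the need for the benchmarking described in paragraph 12). The function names and figures are hypothetical.

```python
def wall_clock_hours(core_hours_required: float, cores_available: int) -> float:
    """Elapsed time for a task, assuming perfect parallel scaling."""
    if cores_available <= 0:
        raise ValueError("cores_available must be positive")
    return core_hours_required / cores_available


def compute_cost(core_hours_required: float, price_per_core_hour: float) -> float:
    """Compute charge for a task; independent of how many cores run at once."""
    return core_hours_required * price_per_core_hour


if __name__ == "__main__":
    task = 10_000.0  # hypothetical task size in core-hours
    for cores in (50, 1000):
        # Same total core-hours and cost, very different elapsed time.
        print(cores, wall_clock_hours(task, cores), compute_cost(task, 0.25))
```

Under this assumption the compute charge is the same whether the work is spread over a few cores for a long time or many cores for a short time, so the advantage of a large transient allocation shows up in elapsed time rather than in price.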

Other points to bear in mind

15 Researchers and institutions need to be aware of the different practices used by the vendors to calculate charges (sub-section 4.3) and adopt approaches that obtain the best value for money.

16 Cost is not necessarily at the forefront of researchers' minds when using cloud computing for research (although it is more likely to be considered when compared to use of institutional or other resources that are free at the point of use). Exposing researchers to the cost of the resources consumed may increase understanding and encourage efficient use of computing infrastructure in general, and of cloud computing in particular.

17 Institutional processes may need to adapt before cloud computing becomes an integrated part of the research computing ecosystem (for example, to accommodate the billing models of cloud providers). The Research Councils should treat cloud computing as a strategic planning problem, and should examine the suitability of their policies and processes for a future in which cloud computing is an integrated part of the research computing ecosystem. Cloud computing providers could be encouraged to participate in this process.

Barriers to uptake

18 The difficulty in understanding the sheer range of cloud computing options is itself a barrier to uptake. A further barrier is that the use of tendering for cloud resources can reduce its perceived convenience.

Future models

19 Looking ahead a few years, the cost-effectiveness of an operating model in which research computing is able to shift dynamically between local and cloud-based resources, for the most cost-effective use of both, deserves further investigation, especially by institutions. This model will be supported by the JANET(UK) Data Centre and Cloud Service Brokerage.

20 Sub-section 7.3 provides detailed advice and guidance for institutions.
In summary, institutions should:

Take care when comparing costs

21 If institutions are thinking about cloud computing, and are comparing the cost with providing institutional infrastructure, it is important to be honest and comprehensive about how much local provision and cloud computing really cost (cf Section 3). A particular issue is the risk of underestimating the amount of technical support required for both cloud and local provision.

Respond appropriately to the dynamics of the cloud computing marketplace

22 Prices charged by the cloud providers frequently change; different providers often change their pricing to compete or to control demand. While this can be generally good news, it potentially makes guaranteeing value for money difficult over the life of a research project. Institutions will need to examine available advice and guidance on pricing, such as that likely to be available from, for example, the JANET(UK) Data Centre and Cloud Service Brokerage.
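One way to keep a local cost estimate "honest and comprehensive", as paragraph 21 asks, is to write every component down explicitly. The Python sketch below is a hypothetical illustration and not the model presented in Section 3: the cost components (capital, energy scaled by PUE, and staff), the parameter names and all figures are assumptions chosen for the example.

```python
HOURS_PER_YEAR = 24 * 365


def local_cost_per_core_hour(capital_cost: float, lifetime_years: float,
                             cores: int, avg_power_kw: float, pue: float,
                             electricity_price_per_kwh: float,
                             annual_staff_cost: float,
                             utilisation: float) -> float:
    """Rough cost per *used* core-hour of a locally hosted cluster.

    Assumptions (all hypothetical): capital is written off evenly over the
    hardware lifetime, energy use is the average IT load multiplied by the
    data centre's PUE, and staff cost covers all technical support.
    """
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in the range (0, 1]")
    annual_capital = capital_cost / lifetime_years
    annual_energy = (avg_power_kw * pue * HOURS_PER_YEAR
                     * electricity_price_per_kwh)
    annual_total = annual_capital + annual_energy + annual_staff_cost
    used_core_hours = cores * HOURS_PER_YEAR * utilisation
    return annual_total / used_core_hours


if __name__ == "__main__":
    # Hypothetical cluster: compare high and low utilisation.
    for utilisation in (0.8, 0.4):
        print(utilisation,
              local_cost_per_core_hour(100_000.0, 4, 100, 10.0, 1.5,
                                       0.10, 20_000.0, utilisation))
```

Leaving out the staff term, or assuming an optimistic utilisation, lowers the apparent local cost considerably, which is the kind of omission paragraph 21 warns against.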

Ensure appropriate support is available for researchers using cloud computing

23 A key message from the community of early adopters working with cloud computing services is that there is at least as great a need for research support services when using cloud as there is with local resources. Institutions should review the skills and capabilities of their research computing support services to ensure they are able to provide the support researchers need.

Put in place processes and mechanisms suitable for cloud computing

24 Institutions should review their processes to ensure they are able to accommodate the typical credit card billing model employed by public cloud providers. Particular issues include ensuring that processes and mechanisms are in place to allow delegation of expenditure to the actual researchers, to set appropriate expenditure limits, to provide accurate and current information, and to maintain accountability for the expenditure.

Consider low-cost ways to get researchers started

25 Researchers should be encouraged to take advantage of offers of free cloud computing, such as the research grants offered by Amazon. [1]

26 Institutions may also want to facilitate access to low-cost instances to serve as a training ground for interested researchers. This will have the added advantage of helping researchers to benchmark codes and make more informed cost estimates for grant applications.

Is there an opportunity to take advantage of reduced price options?

27 Institutions should look for opportunities to take advantage of reduced price options such as reserved instances. This might be achieved in some cases through aggregating demand within the institution.

28 The executive summary focuses on what institutions need to do to support their researchers in taking up and using the cloud for research. The body of the report also provides advice and guidance for researchers (sub-section 7.2) and for grant reviewers (sub-section 7.4).
29 There is still much to be done to ensure a sound use of the cloud for research. The following recommendations are made:

Recommendation 1: JISC should consider developing or procuring a cloud cost comparison service suitable for research computing. They should also consider whether this could and should be provided by the JANET(UK) Data Centre and Cloud Service Brokerage.

Recommendation 2: JISC should work with relevant stakeholders to support institutions in calculating their research computing costs in order to make better comparisons with cloud computing offerings, and encourage the reporting of institutional cost data to support future cost comparisons.

[1] Amazon's grants page is available here: http://aws.amazon.com/education/ [accessed 2 January 2012]

Recommendation 3: EPSRC and the other Research Councils should review their guidance on requesting funding for research computing and procuring equipment to explicitly include cloud computing.

Recommendation 4: JISC should continue to help institutions adapt their processes so that they, and their researchers, can more easily gain access to cloud computing. This should include mechanisms that allow quick and easy setup and payment to cloud providers, accounting to individual research projects, and aggregation of demand to allow more cost-effective reserved instances to be used.

Recommendation 5: The Research Councils and JISC, together with relevant stakeholders such as the HPC-Special Interest Group and UCISA, should work together and with the JANET(UK) Data Centre and Cloud Service Brokerage to benchmark the cost of institutional research computing, develop and publish a knowledge base of the use of cloud computing for research, and investigate further the use of reserved instances.

Executive summary
List of abbreviations
1 Introduction
  1.1 General
  1.2 Objective and target audiences
  1.3 Scope
  1.4 Approach
  1.5 Note on prices
  1.6 Overview of this document
2 Background
  2.1 Introduction
  2.2 What is cloud?
  2.3 The economic arguments for cloud computing
  2.4 Research computing
3 Costs of institutional research computing
  3.1 Introduction
  3.2 Overall model
  3.3 Business models for research computing
4 Costs of cloud computing for research
  4.1 Introduction
  4.2 Overall model
  4.3 Billing models
5 Cost analysis
  5.1 Introduction
  5.2 Cost comparison
  5.3 Why researchers are using cloud computing for research
  5.4 Comparison of funding and business models
6 Conclusions and recommendations
  6.1 Introduction
  6.2 What cloud can and cannot do, and why researchers are interested
  6.3 Research Council processes may need to adapt
  6.4 The huge range of cloud options is both a barrier and an enabler
  6.5 Cloud computing can be competitive
  6.6 Cost is not on researchers' radar
  6.7 Institutional processes may need to adapt
  6.8 There is an opportunity for coordinated action
7 Suggested guidance for target audiences
  7.1 Introduction
  7.2 Researchers
  7.3 Institutions
  7.4 Grant reviewers

A Interviews conducted
B Case study: molecular simulation
  B.1 Introduction
  B.2 Benchmark suite 1: single server tests
  B.3 Benchmark suite 2: cluster tests
  B.4 Conclusions
C Case study: e-Science Central and flood modelling
  C.1 Introduction
  C.2 The Digital Institute
  C.3 e-Science Central
  C.4 Flood modelling
  C.5 Conclusions
D Case study: NeuroCloud
  D.1 Introduction
  D.2 About CINN
  D.3 Description of the research
  D.4 Cost comparison
  D.5 Lessons identified
  D.6 Conclusions
E Cloud prices
  E.1 Introduction
  E.2 Compute
  E.3 Storage
  E.4 Data transfer

List of abbreviations

CapEx    Capital Expenditure
CINN     Centre for Integrative Neuroscience and Neurodynamics
EPSRC    Engineering and Physical Sciences Research Council
GPU      Graphics Processing Unit
HECToR   High-End Computing Terascale Resource
HEI      Higher Education Institution
HPC      High Performance Computing
HTC      High-Throughput Computing
IaaS     Infrastructure as a Service
OpEx     Operational Expenditure
PaaS     Platform as a Service
PUE      Power Usage Effectiveness
SaaS     Software as a Service
SSD      Solid State Disk
SLA      Service Level Agreement
TCO      Total Cost of Ownership

Virtual Machine: A completely isolated computer with its own operating system existing inside another physical machine. Virtual machines can have virtual hardware specifications, including clock speed and memory. Multiple virtual machines can co-exist independently within the same physical hardware.

Instance: Terminology used by some IaaS cloud providers to describe the virtual machine being provided.

Cluster: A group of computers linked together to process computing jobs. In this report the term cluster is used to describe systems that have high-speed, low-latency interconnects between the processors.

Grid: Grids differ from clusters in that the individual machines are less closely coupled. Grid hardware is often heterogeneous. Some institutions have grid systems based on unused student computer rooms.

Supercomputer: An extremely powerful computer that may make use of specialised architecture. Supercomputers are scored and ranked based on LINPACK benchmarks.

Workstation: A stand-alone computer of the type that can be found under the desk.

Research computing manager: Any person with managerial responsibility for procuring, operating or maintaining research computing services; this person may be an academic or non-academic member of staff.

1 Introduction

1.1 General

1.1.1 Curtis+Cartwright Consulting, with support from Dr Lee Gillam of the University of Surrey, undertook this Cost analysis of cloud computing for research project on behalf of the Engineering and Physical Sciences Research Council (EPSRC) and JISC. This document is the final report.

1.1.2 The study team would like to acknowledge the contribution of the interviewees and stakeholders listed at Annex A, who generously gave their time and input to this project. In particular, we thank the researchers and institutions that provided the case studies.

1.2 Objective and target audiences

1.2.1 The objective of the project was to provide advice and guidance on the financial implications of using the cloud for research computing. The primary audience is EPSRC and JISC; however, the discussion and findings in this report may be of interest to anyone in the HE community with an interest in cloud computing, and especially:
- researchers interested in using cloud computing for their research;
- Higher Education Institutions (HEIs) and research computing managers who may be considering providing access to cloud computing services for their researchers;
- funding bodies in addition to EPSRC who fund research computing through grant awards, and the reviewers of grant applications.

1.3 Scope

What is cloud?

1.3.1 Definitions of cloud computing abound, but for the purposes of this work it is helpful to focus on key aspects of the business models that are commonly associated with cloud computing:
- cloud computing providers use economies of scale, and the technology of virtualisation, to rent computing services to third parties;
- charging is often done on a pay-per-use basis;
- pricing models tend to be broken down into different computational characteristics (eg transaction counts, core-hours used, data stored, data transferred).

1.3.2 The terminology, classification and business models of cloud computing are explored in more detail in Section 2.
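The pay-per-use breakdown listed in paragraph 1.3.1 can be sketched as a simple bill estimate. The Python below is an illustration only: the class and function names and all prices are hypothetical, real providers use more detailed tariffs, and the only figure taken from this report is the nominal exchange rate of $1.6 to £1 given in paragraph 1.5.3.

```python
from dataclasses import dataclass


@dataclass
class PriceList:
    """Hypothetical unit prices in US dollars."""
    per_core_hour: float
    per_gb_month_stored: float
    per_gb_transferred: float
    per_million_transactions: float


@dataclass
class Usage:
    """Metered usage for one billing period."""
    core_hours: float
    gb_months_stored: float
    gb_transferred: float
    transactions: int


def estimate_bill_usd(prices: PriceList, usage: Usage) -> float:
    """Sum the charge for each metered characteristic."""
    return (usage.core_hours * prices.per_core_hour
            + usage.gb_months_stored * prices.per_gb_month_stored
            + usage.gb_transferred * prices.per_gb_transferred
            + usage.transactions / 1_000_000 * prices.per_million_transactions)


def usd_to_gbp(amount_usd: float, usd_per_gbp: float = 1.6) -> float:
    """Convert dollars to pounds at the report's nominal rate."""
    return amount_usd / usd_per_gbp


if __name__ == "__main__":
    prices = PriceList(0.1, 0.1, 0.1, 1.0)   # hypothetical tariff
    usage = Usage(1000.0, 100.0, 50.0, 2_000_000)
    print(usd_to_gbp(estimate_bill_usd(prices, usage)))
```

Keeping each metered characteristic as a separate term makes it easier to see which one dominates the bill for a given research task, for example compute for simulation work or data transfer for data-intensive work.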
Research computing infrastructure

1.3.3 Traditional research computing provision spans several orders of magnitude, from desktop workstations to clusters, supercomputers and grids. The scalability of cloud computing (meaning the illusion of infinite size afforded by public clouds) means it is relevant to consider all of these local infrastructure types. Given the scale of this task, this report should be considered a first step in what is likely to be an on-going area of interest for the sector.

1.3.4 Depending on the institution, research computing infrastructure and services may be provided by a central IT Services team, handled by a central academically run service, or handled at the faculty or departmental level. The term research computing manager is used in this report to describe any person with managerial responsibility for procuring, operating or maintaining research computing services; this person may be an academic or non-academic member of staff, and no assumptions are made about the role beyond this.

1.3.5 A yet-wider consideration of research computing (e-infrastructure) has emerged from a number of Town Meetings involving various stakeholders [2], which inter alia criticizes charging models as driving user behaviour. The document proposes the unification of research computing into (sets of) shared facilities. Recent efforts demonstrate that there can be cost implications of moves towards shared services. [3]

Performance vs cost

1.3.6 This project is focused on cost. However, for computing services charged by the hour (as cloud or institutional services may be), cost is very directly linked to performance through the idea of "bang for your buck". We believe that researchers are the best judges of performance and what is right for them. Our aim is to help them to understand the cost implications of the performance they are purchasing.

Other important aspects of cloud computing

1.3.7 The economic and cost aspects of cloud computing are far from the only factors that should influence institutions' or researchers' decisions about cloud computing, and EPSRC and JISC are well aware of this. These other factors are outside the scope of this report, but are nonetheless important to bear in mind since they will inform the decision making of potential cloud customers; indeed, they may actually be the most important factors, depending on use case.
Particularly significant aspects include:
- Security and Data Protection: the arguments and issues surrounding security and data protection in the cloud are varied and complex. They include, but are not necessarily limited to:
  - concerns over data location and legal jurisdiction (for example, Federal acquisition of data under the US Patriot Act);
  - requirements that control how personal data may be handled (for example, a commitment for medical data to remain within an institution's firewall or within UK borders, or in accordance with principle 8 of the Data Protection Act);
  - technical considerations of security in multi-tenancy cloud environments and the overall attack surface of cloud providers.
- Lock-in: the risk of being trapped by a particular provider into being unable to leave or switch providers. Lock-in can occur if an organisation becomes dependent on proprietary software or infrastructure, or else if so much data is uploaded to the cloud that it is economically or practically impossible to move it elsewhere (so-called data lock-in).
- Service Level Management: the assurance that you are getting the performance you are paying for. Off-the-shelf service level agreements are unlikely to offer much in the event of failing to meet a specified level of provision, and can be formulated such that an outage is only deemed to have occurred after a certain period of time has passed, making the user responsible for ensuring that the SLA is being met. On the other hand, the cloud providers do at least offer assurances, and these may not be available in institutional settings.

[2] For more information, see: http://wiki.esi.ac.uk/w/files/7/7b/researchcomputingv62-final.pdf [accessed 10 January 2012]
[3] In particular, the NAO's report on Shared services in the Research Councils suggests being conservative with likely savings and rigorous in planning, see: http://www.nao.org.uk/publications/1012/research_councils.aspx [accessed 10 January 2012]

1.3.8 These other aspects should be included as relevant in cost-benefit analyses that prospective cloud computing customers undertake. JISC and EPSRC have funded other work to develop approaches to these issues, in addition to the great deal of complementary activity in the sector. For example, of the Cloud Pilot projects funded by EPSRC and JISC, several addressed issues of privacy, control and access management. [4]

1.4 Approach

1.4.1 The approach for this work was to:
- investigate the costs of local research computing;
- use the Cloud Pilot projects, and others, to identify suitable case studies to show the differences between cloud offerings and local provision;
- consult widely, and especially with research computing managers, to solicit viewpoints, particularly on the advice most required by the target audiences.

1.4.2 Some institutional research computing services have shared cost data with the project team (although often of the ballpark-figure kind). At the request of the stakeholders concerned we have treated this information in confidence; we have used these figures to help elaborate our understanding and in the creation of the hypothetical cost models presented in this report.

1.5 Note on prices

1.5.1 Section 4 contains prices for a range of current cloud providers. However, the purpose of this work is not to serve as a robust price comparison and, although we have tried to ensure that prices are accurate, we make no guarantees as to their veracity. Readers must perform their own price assessment of cloud services tailored to the specifics of their research tasks and circumstances. However, we hope that the issues highlighted in this report make their task easier.
1.5.2 To further emphasise this point, we would point out the rapid pace of change in cloud computing services. November 2011 alone saw the introduction of a new, larger Cluster Compute instance type in Amazon's EC2 service (accompanied by a reduction in the price of the old Cluster Compute instance) [5] as well as the launch of Amazon's US West (Oregon) Region (followed soon after, in December, by the launch of the South America (Sao Paulo) Region), the launch of the Red Cloud service at Cornell, [6] a framework procurement exercise by the fledgling JANET(UK) Data Centre and Cloud Service Brokerage, [7] and the announcement of the prices for Eduserv's Education Cloud. [8] In a fast-moving environment like this, specific cloud prices in this report will be out of date in a matter of months.

[4] For more information about the Cloud Pilots, see http://cloudresearch.jiscinvolve.org/wp/ [accessed 8 December 2011]
[5] http://aws.amazon.com/ec2/instance-types/ [accessed 7 December 2011]
[6] http://www.cac.cornell.edu/redcloud/ [accessed 7 December 2011]
[7] http://www.janetbrokerage.ac.uk/ [accessed 7 December 2011]
[8] http://umfcloudpilot.eduserv.org.uk/entries/20710168-pricing [accessed 7 December 2011]

Currency

1.5.3 Public cloud providers often quote prices in US dollars. For this report we have chosen to give all prices in pounds sterling using a nominal exchange rate of $1.6 to £1.

1.6 Overview of this document

1.6.1 The rest of this report is set out as follows:
- Section 2 sets out background information for cloud computing and research computing;
- Section 3 looks in more detail at the costs and business models for research computing on local infrastructure;
- Section 4 looks in more detail at costs and business models for cloud computing;
- Section 5 analyses the implications of the information in Sections 3 and 4;
- Section 6 sets out the findings and recommendations;
- Section 7 sets out suggested advice and guidance for a range of stakeholders;
- Annex A lists the interviews conducted for this review;
- Annexes B through D contain case studies of cloud computing for research;
- Annex E contains indicative price data for a range of cloud providers.

2

2.1 Introduction

2.1.1 This section introduces the key concepts of cloud computing, explores common economic aspects of the business model and examines their applicability to research computing. [9]

2.2 What is cloud?

2.2.1 One of the most widely used definitions of cloud computing is provided by NIST: [10]

"Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (eg, networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction."

2.2.2 Cloud computing services usually fall into one of three groups:
- Infrastructure-as-a-Service (IaaS): the user rents one or more virtual machines on physical servers located in the cloud provider's data centre(s). The user has root access to their virtual machines, and is free to install and configure their software.
- Platform-as-a-Service (PaaS): the user interacts with the cloud-based platform, rather than specific virtual machines. The user is constrained within the confines of the platform environment, which will usually specify the operating system and application programming methods.
- Software-as-a-Service (SaaS): the user interacts with a piece of software running on a remote IaaS or PaaS cloud system through a client, often a web browser.

2.2.3 Cloud computing is also usually categorised as one of:
- Public cloud: commercially provided cloud computing in shared data centres. Public cloud providers include Amazon, [11] Google, [12] Microsoft, [13] Rackspace, [14] and Eduserv. [15]
- Private cloud: cloud middleware operated on locally owned hardware. [16]
- Hybrid cloud: a private cloud with the capability to integrate seamlessly with one or more public clouds as demand and the use-case require.

2.2.4 All these models of cloud computing are relevant to research. However, the bulk of the discussion in this report is focused on public IaaS clouds.
[9] In the interest of keeping this report focused on the economic aspects of cloud computing for research, this introduction to cloud computing is brief. For a more thorough explanation, readers may wish to refer to the cloud computing reports published by JISC in 2010: Cloud computing for research and the Technical review of cloud computing for research.
[10] http://www.nist.gov/itl/cloud/upload/cloud-def-v15.pdf [accessed 29 November 2011]
[11] http://aws.amazon.com/ [accessed 7 December 2011]
[12] http://appengine.google.com [accessed 7 December 2011]
[13] http://www.microsoft.com/windowsazure/ [accessed 7 December 2011]
[14] http://www.rackspace.co.uk/ [accessed 7 December 2011]
[15] http://www.eduserv.org.uk/ [accessed 7 December 2011]
[16] The US Army, for example, is spending $250M on procuring private Cloud systems: http://www.federalnewsradio.com/240/2696095/army-awards-250m-data-center-contract [accessed 10 January 2012]

This is because:

- There is more evidence available for these kinds of clouds. Indeed, we believe it is reasonable to state that most academic interest to date has focused on IaaS clouds, and Amazon's AWS offering in particular.
- IaaS is most similar to what researchers have now in terms of local infrastructure.

2.2.5 That is not to downplay the potential role of PaaS and SaaS, and it is interesting to note that some projects (such as e-science Central [17]) are building SaaS-style services on top of IaaS. Other projects are pioneering the use of private and hybrid clouds. We fully expect that, as the field matures, future work on cloud computing for research will have a greater evidence base on which to draw.

2.3 The economic arguments for cloud computing

2.3.1 Cloud computing providers make the economic case for cloud computing in a few key points:
- cloud computing providers are able to use economies of scale to procure hardware, and run data centres, much more efficiently and cost-effectively than smaller organisations; [18]
- users of cloud computing do not need to procure their own hardware, [19] which removes procurement, maintenance and space costs; [20]
- the potential for dynamic flexibility and scalability offered by cloud providers means that there is no need to procure hardware in order to meet peaks in demand or rare needs, [21] to lose productivity through not being able to accommodate peaks in demand (shown in Figure 2-1), or to retain and keep paying for hardware which is not being used; this capability is often termed 'right-sizing'.

2.3.2 Ultimately, the proposition on offer for consumers of public cloud offerings is that they trade upfront Capital Expenditure (CapEx) for more spread out Operational Expenditure (OpEx). This is a key attraction of cloud computing, especially in situations where capital funding is constrained. Whether this actually results in lower costs over time depends on the specific economics of the infrastructure being considered.
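The right-sizing argument in paragraph 2.3.1 can be illustrated with a minimal sketch. It assumes a hypothetical series of demand figures, each expressed as a percentage of installed capacity; the numbers are invented for illustration and are not read from Figure 2-1. The function name `provisioning_gaps` is likewise an assumption. The sketch totals the demand that cannot be served (under-provisioning) and the capacity left idle (over-provisioning).

```python
def provisioning_gaps(demand_pct, capacity_pct=100.0):
    """Return (unmet, idle) totals in percentage-point periods.

    unmet: demand above installed capacity (lost productivity).
    idle:  capacity above demand (wasted resources).
    """
    unmet = sum(max(d - capacity_pct, 0.0) for d in demand_pct)
    idle = sum(max(capacity_pct - d, 0.0) for d in demand_pct)
    return unmet, idle


# Hypothetical demand per period, as a percentage of installed capacity.
demand = [60, 80, 120, 150, 90, 40, 110, 70]

unmet, idle = provisioning_gaps(demand)
print(f"Unmet demand: {unmet:.0f} percentage-point periods")
print(f"Idle capacity: {idle:.0f} percentage-point periods")
```

Under an on-demand billing model both totals would, in principle, approach zero, because capacity follows demand; whether that lowers overall cost still depends on the price paid per unit of capacity.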
[17] For more information on e-science Central, see Annex C and http://www.esciencecentral.co.uk/ [accessed 7 December 2011]
[18] A Microsoft report on The Economics of the Cloud suggests that scale reduces TCO, with an 80% reduction per server when going from 1,000 to 100,000 servers, which should entail lower prices from larger Cloud providers. See http://www.microsoft.com/presspass/presskits/cloud/docs/the-economics-of-the-cloud.pdf [accessed 10 January 2012]
[19] Consider, for example, pharmaceuticals company Eli Lilly spending $89 to analyse a new drug instead of buying 25 servers and suffering a concomitant wait in procurement: http://www.tectrends.com/tectrends/article/00178498.html [accessed 10 January 2012]
[20] Typically, large savings are reported for SaaS: consider Imagination Group's £125k v £540k 3-year cost comparison of using Google Apps against an on-premise implementation https://docs.google.com/present/view?id=dcq94j3_633cxsrxcgb (slides 36 and 37) [accessed 10 January 2012], and Hillingdon Council's projected saving of £2.98m over 4 years by moving to Google Apps for Business http://www.hillingdon.gov.uk/index.jsp?articleid=24216 [accessed 10 January 2012]
[21] Or even one-off activities such as the conversion of 11 million TIFFs to PDF by the New York Times for just $240: http://open.blogs.nytimes.com/2007/11/01/self-service-prorated-super-computing-fun/ [accessed 10 January 2012]

2.3.3 This shift is achieved by on-demand billing models in which the user only pays for the resources they consume. More recently, some providers have begun catering to users with high usage, or less variable and more predictable requirements, by offering the ability to pay an upfront fee in return for reduced hourly charging rates. AWS's term for this is 'reserved instances'. Other variations on this theme are available, including bidding for 'spot instances' which become active

based on a threshold variable price. Standard monthly and annual rental models for hosted systems are also offered by some providers.

[Figure 2-1: hypothetical data centre demand; peaks above 100% utilisation mean lost productivity, troughs below 100% mean wasted resources. Vertical axis: percentage utilisation (0 to 160); horizontal axis: data centre demand over time; regions labelled 'Under provisioning' and 'Over provisioning'.]

2.3.4 A 2009 McKinsey report was one of the first pieces of literature in which the cost benefits of cloud computing to enterprise were challenged. [22] In particular, the report found that cloud computing offerings at that time were not cost-effective compared to large enterprise data centres. [23] It also found that there was the potential to reduce staffing by 10-15% by moving a large data centre to a cloud provider. The report caused something of a storm on the internet, drawing criticism for apparent omissions (for example, not factoring in the effect of reserved instances).

2.3.5 It is notable that, while commercial cloud providers still highlight the potential financial savings their service may offer, greater attention is now on the benefits of rapid provisioning and flexibility.

2.3.6 A further potential advantage is the ability to have systems that track improvements in hardware at a similar cost, rather than depreciating in performance relative to the latest technology, and that suffer minimal impact and outages from hardware issues such as hard disk failure. [24]

2.4 Research computing

2.4.1 Research computing is a broad term that encompasses infrastructure, policy and process, and funding, and applies to a range of scales from individual researchers using workstations to groups using terascale supercomputers in grids.
[22] http://www.isaca.org/groups/professional-english/cloudcomputing/groupdocuments/mckinsey_cloud%20matters.pdf [accessed 7 December 2011]
[23] Based on comparing EC2 instances with a per month CPU equivalent local data centre cost of $45.
[24] Estimated at 2-4%, and potentially as high as 13%, in 2007: http://www.usenix.org/events/fast07/tech/schroeder.html [accessed 10 January 2012]

In order to examine the applicability of the

economic aspects of cloud computing to research computing, it is first helpful to cover a few background areas.

High Performance Computing (HPC) and High-Throughput Computing (HTC)

2.4.2 Two broad areas of computing emerge as traditionally making use of different infrastructure:
- High Performance Computing (HPC): HPC is delivered by clusters and supercomputers. These machines are designed to perform the maximum number of operations per second, and make use of special architectures to achieve this goal. A key characteristic HPC machines share is a low-latency interconnect between processors, such as InfiniBand, which makes it possible to share data very rapidly between large numbers of processors working on the same problem.
- High-Throughput Computing (HTC): HTC is usually defined by a style of computational task which requires many independent jobs to be run, each often having only modest hardware requirements. HTC tasks are sometimes described as 'embarrassingly parallel'. In order to increase research productivity the focus is on maximising the throughput over a long period; in other words, 'sweating the assets'. Access to a large pool of machines, each one modest in its capabilities, can allow researchers to perform tasks such as parameter sweeps much more rapidly than if they were limited to a single workstation.

Batch and interactive computing

2.4.3 Batch computing is the term given to executing a group of computational jobs without manual intervention. A natural consequence of batch computing is the need for job scheduling or queuing. The batch of jobs is submitted to the queue and each job is processed as resources become free.
In a cloud, scheduling occurs at the level of virtual machines, so, because of the scale at which providers can operate, a batch does not have to be queued. In practice, this may be limited by cloud providers who do not want individual users to be able to adversely affect the performance of their data centres; queuing may also be a useful means of ensuring high utilisation and good cost-efficiency of procured resources.

Institutional clusters

2.4.4 Many institutions aggregate demand for facilities together with grant income from their researchers in order to procure and operate central research computing infrastructure. This usually includes provision of computing power in the form of an institutional cluster. Doing this confers the following benefits:
- it is more cost efficient to support a single larger machine in a well-maintained data centre than it is to support many smaller distributed machines;
- there is greater capability for large-scale calculations through being able to procure a larger, possibly higher-spec cluster;
- by removing smaller machine rooms the institution decreases power consumption, and engenders better use of estate.

2.4.5 The extent to which institutions do this varies, with some vetting grant applications for computational requirements before they are submitted. Much work can go into gathering researchers' requirements before procuring a new cluster so that new computing infrastructure is appropriately sized, although in practice the size and specification of the eventual infrastructure is governed by the size of funds available as well as by expressed requirements and any potential conflicts therein.

Smaller clusters

2.4.6 Below the level of institutional clusters there is a range of smaller clusters, servers and workstations. Equipment purchased at this scale might meet certain needs of departments, research groups, or specific projects. Some researchers prefer to have direct control over their computational infrastructure: a sense of control which comes from having root access to physical machines that are close at hand. For some, this may even mean small clusters under desks.

Campus Grids

2.4.7 Grids of workstations, perhaps in undergraduate computer rooms, can be linked together to process computational loads; grids are usually employed in HTC situations. These Grids can be used for research purposes during idle time, either left continuously powered, or triggered into operation using Wake-On-LAN (WOL). Cloud computing may be attractive to researchers using Campus Grids because the scale of cloud services makes them, in theory, able to handle large HTC sweeps in one shot.
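The HTC parameter sweeps described in paragraphs 2.4.2 and 2.4.7 can be sketched as follows. This is a minimal illustration only: `simulate` is a hypothetical stand-in for one independent job, and a local thread pool stands in for the pool of machines (a Campus Grid, a cluster queue or a set of cloud instances) that would run the jobs in practice. For CPU-bound work a process pool or a real batch system would normally be used instead.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def simulate(params):
    """Hypothetical stand-in for one independent job in the sweep."""
    a, b = params
    return a * a + b


def parameter_sweep(a_values, b_values, max_workers=4):
    """Run one job per (a, b) combination; no job depends on another."""
    grid = list(product(a_values, b_values))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with the grid.
        results = list(pool.map(simulate, grid))
    return dict(zip(grid, results))


results = parameter_sweep([1, 2, 3], [0, 10])
for params, value in sorted(results.items()):
    print(params, value)
```

Because the jobs are independent, the elapsed time falls roughly in proportion to the number of workers available, which is why access to a large pool of modest machines suits this style of task.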

3

3.1 Introduction

3.1.1 This section explores in more detail the financial aspects of institutional research computing. It is structured as follows:
- a three-stage model of research computing that describes the costs associated with getting started, production and end-of-life is set out;
- a simple general approach that can be taken to consider the full costs of research computing is set out, for both cloud and local resources;
- a high level description of the funding process for institutional research computing and its implications for cloud computing is given.

3.2 Overall model

3.2.1 There are several published methods for calculating the Total Cost of Ownership (TCO) of data centres. [25] These enable institutions to comprehensively cost ICT services, including research computing. However, for the purposes of this report, it is sufficient to take a higher-level view. A general model of the costs involved in institutional research computing is shown in Figure 3-1. This breaks research computing down into three stages: stage one consists of all the activities that may be needed to set up a research calculation; stage two consists of the production calculations; and stage three consists of the long-term storage requirements of the research and decommissioning of the equipment.
Stage 1: Getting started
- Resource selection and the procurement process
- Capital hardware costs
- Configuration and OS install
- Infrastructural costs (buildings)
- Initial training

Stage 2: Production and operational costs
- Technical support staff
- Research support staff
- Software licensing
- Energy (power and cooling)
- Hardware maintenance
- Infrastructure maintenance

Stage 3: Storage and end-of-life
- Storage (10 year requirement)
- Recycling and disposal

Figure 3-1: the costs involved in institutional research computing

[25] For example, one such method tailored for institutional IT services is a toolkit for costing IT services based on a JISC project at Oxford University Computing Services. http://www.jisc.ac.uk/media/documents/programmes/flexibleservicedelivery/toolkit_for_costing_itservices.pdf [accessed 16 December 2011]

Costing local research computing

3.2.2 Table 3-1 shows a method based on the overall model in Figure 3-1, illustrated with hypothetical values, for calculating a cost per core-hour for local research computing centres. This simple

method does not include a discount factor for future inflation, or indeed account for reduced performance over time; institutions may wish to consider net present value methods to take account of this. Table 3-1 offers hypothetical figures and does not include every item in the model in Figure 3-1, although it is relatively easy to add additional line items to the table. However, since clusters do not usually include long term storage infrastructure, and since long term storage requirements will be highly specific to the research being undertaken, we think it best to consider this as a separate cost, and not fold it in to the core-hour cost.

| Item | Capital cost (£) | Depreciation period (years) | Ongoing cost (£) | Annual cost (£) |
| Staff | | | 120,000 | 120,000 |
| Infrastructure | 1,500,000 | 10 | 10,000 | 160,000 |
| Hardware | 800,000 | 3 | | 266,667 |
| Energy | | | 25,000 | 25,000 |
| Software [26] | | | 5,000 | 5,000 |
| Total | | | | 576,667 |

| Number of cores | 2,000 |
| Avg. utilisation | 80% |
| Core-hours / yr | 14,016,000 |
| Cost / core-hour | £0.041 |

Table 3-1: cost methodology for local research computing (table entries are hypothetical)

3.2.3 Of critical importance in this is the percentage utilisation: the number of CPU hours actually used as a percentage of the theoretical maximum. CPU hours that are not used are wasted resources whose cost must be added to the hours being used. The effect of utilisation rate on the per core-hour cost of the example system in Table 3-1 is shown in Figure 3-2. This is one of the key aspects of the cloud economic argument: there is no need to over-provision resources to accommodate peak loads that are not used much of the time. However, the research computing managers we have spoken to report very high (80%-90%) utilisation of their facilities. As Figure 3-2 shows, there are rapidly diminishing returns for utilisation rates much above this.

[26] Based on an annual licensing cost model.
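The method in Table 3-1 can be sketched in a few lines of code. This is a minimal illustration using the table's hypothetical figures; it assumes straight-line depreciation (capital cost divided by the depreciation period), which is consistent with the annual costs shown in the table, and a year of 8,760 hours. The function names are assumptions made for this example.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def annual_cost(capital=0.0, depreciation_years=None, ongoing=0.0):
    """Annual cost of a line item: straight-line depreciation plus ongoing cost."""
    depreciation = capital / depreciation_years if depreciation_years else 0.0
    return depreciation + ongoing


def cost_per_core_hour(total_annual_cost, cores, utilisation):
    """Spread the annual cost over the core-hours actually used."""
    used_core_hours = cores * HOURS_PER_YEAR * utilisation
    return total_annual_cost / used_core_hours


# Hypothetical line items from Table 3-1 (pounds sterling).
items = {
    "Staff": annual_cost(ongoing=120_000),
    "Infrastructure": annual_cost(1_500_000, 10, 10_000),
    "Hardware": annual_cost(800_000, 3),
    "Energy": annual_cost(ongoing=25_000),
    "Software": annual_cost(ongoing=5_000),
}
total = sum(items.values())

print(round(total))                                     # 576667
print(round(cost_per_core_hour(total, 2000, 0.80), 3))  # 0.041

# Effect of utilisation on the cost per core-hour (compare Figure 3-2).
for pct in range(10, 101, 10):
    print(f"{pct}%: {cost_per_core_hour(total, 2000, pct / 100):.3f}")
```

Because the annual cost is fixed, the cost per core-hour is inversely proportional to utilisation, which is why the returns diminish at high utilisation. Values computed this way may differ in the last digit from those plotted in the report, depending on how the plotted figures were rounded.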

[Figure 3-2: effect of utilisation on the £/core-hour cost for the example in Table 3-1. Plotted values:
Utilisation: 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%
Cost (£/core-hour): 0.33, 0.17, 0.11, 0.08, 0.07, 0.06, 0.05, 0.04, 0.04, 0.03]

3.2.4 Based on discussions with research computing managers, and the information they shared in confidence, we estimate that cluster costs are in the region of 5p to 7p per core-hour; however, it is vital to recognise that costs at institutions vary significantly. For example, services such as HECToR or the National Computational Chemistry Software Service cost more than this. Institutions with low computing costs typically operate large, well-populated, modern and efficient data centres that assure low Power Usage Effectiveness (PUE) values, and centralised maintenance.

Smaller clusters

3.2.5 The method described in Table 3-1 is most readily applicable to large (institutional) clusters that are managed in data centres with known PUE values and staff requirements. This might not be the case for the smaller clusters described in paragraph 2.4.6, especially if they are of the ad hoc 'broom cupboard' type, often maintained using (uncosted) post doc or academic time. Our understanding is that these clusters are often believed to be cheap, but some institutions are implementing policies to consolidate these clusters into the institutional offering for the reasons described in paragraph 2.4.4.

Campus Grids

3.2.6 It is difficult to calculate a simple core-hour cost for Campus Grids, since the machines are usually widely distributed and it is often not possible to accurately understand their power costs. However, since the hardware is often procured for other purposes, the calculation is simplified by requiring only the additional support, software and power requirements incurred by operating the machines as a grid.
On the other hand, this makes the hardware more expensive for the purpose for which it was procured, for the reasons described in paragraph 3.2.3, and might be considered a subsidy for research computing.

3.2.7 Campus Grids based on student computer rooms are usually considered very cheap, since they use 'borrowed' cycles. We identified one example in the literature that compared the cost of a Campus Grid to Amazon's EC2 service. This calculated the average cost per user to be $365 per

year, while a similarly sized cluster on EC2 would have cost $16,939 per user, [27] although many caveats are likely to apply to such figures.

3.3 Business models for research computing

3.3.1 The primary mode of funding for many research computing centres is likely to be through indirect, and perhaps estates, costs. It is up to the individual institution how the indirect costs are handled, whether at an institutional, faculty or departmental level. The extent of the cost recovery the institution aims to achieve, and the degree to which it is willing to cross-subsidise research computing, are also complicating factors.

3.3.2 There are three main models by which institutional research computing centres charge researchers for using their facilities. These models and charges then in turn affect the costs the Research Councils see on grant applications:
- Free at the point of use: researchers simply use the computing resources as and when they need them. Most institutions have some form of free tier in order to provide an easier route into using the service, even if they also have other charging options.
- Hybrid: resources are free to use, but priority or dedicated access can be purchased as an indirect expense. Some institutions use a model whereby, if grant holders are awarded funds to buy computational hardware, this is done in the form of adding extra capacity to an existing centrally managed cluster.
- Charged: all computing time must be purchased at a rate decided by the research computing centre. The rates charged are a balance of what the service actually costs, what degree of cost recovery is desired, and how much researchers are willing to pay.

[27] A cost benefit analysis of a campus computing grid, Preston M Smith, Masters Thesis, Purdue University, May 2011 http://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=1035&context=techmasters [accessed 13 December 2011]

4

4.1 Introduction

4.1.1 This section sets out the high level cost model of cloud computing, and discusses how the business models of cloud computing may affect costs. Finally, indicative pricings of cloud offerings likely to be of interest to researchers are given.

4.2 Overall model

4.2.1 Figure 4-1 shows the overall model for research costs using cloud computing. Note that there are additional costs (such as research support) that are not included in the headline cloud costs for compute and data. Headline costs are highlighted in bold.

Stage 1: Getting started
- Resource selection (may involve procurement)
- Configuration and setup
- Code porting
- Data transfer in

Stage 2: Production and operational costs
- Research support staff
- Software licensing
- Virtual infrastructure costs: cpu/instance hours, data transfer, data storage

Stage 3: Storage and end-of-life
- Storage (10 year requirement)
- Data transfer out

Figure 4-1: the costs involved in cloud computing for research, entries in bold are costs that public cloud providers often charge

4.2.2 Cloud providers usually have cost calculator tools available on their websites. These can be useful in estimating the cost of research tasks, provided the user knows in sufficient detail the computational characteristics and behaviour of their research. These calculators give a cost for the research that includes hardware and power costs, but our model in Figure 4-1 shows that the total costs should also include support and any associated software costs.

4.2.3 That researchers do benefit from support for using cloud, especially in the early stages, is not in doubt. This is a clear message emerging from the Cloud Pilot projects. However, the actual level of support a regular researcher will require is still unknown. Some of the Cloud Pilots report large amounts of time spent in code porting and in trying to implement software to allow non-technical users to use cloud computing.
Others report that the initial learning curve can be steep, but that subsequent cloud use is much easier. These problems are of course not necessarily specific to cloud computing, but are often acutely felt by researchers new to cloud. The current early adopter community, and especially the computer-science-focused section of it, can minimise these time costs because of their technical skills, or else they do not care about the costs because they are viewed as part of studying the cloud.
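The distinction drawn in paragraph 4.2.2, between the headline costs a provider's calculator reports and the fuller costs in Figure 4-1, can be sketched as follows. This is a minimal illustration under stated assumptions: every price and quantity below is invented for the example and is not taken from any provider, and the names `CloudPrices`, `headline_cost` and `total_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CloudPrices:
    """Hypothetical unit prices in pounds sterling (not real provider prices)."""
    per_instance_hour: float
    per_gb_transfer_in: float
    per_gb_transfer_out: float
    per_gb_month_storage: float


def headline_cost(prices, instance_hours, gb_in, gb_out, gb_stored, months):
    """Costs a provider typically charges: compute, data transfer and storage."""
    return (
        prices.per_instance_hour * instance_hours
        + prices.per_gb_transfer_in * gb_in
        + prices.per_gb_transfer_out * gb_out
        + prices.per_gb_month_storage * gb_stored * months
    )


def total_cost(headline, support_staff=0.0, software_licences=0.0):
    """Add the costs that sit outside the provider's bill."""
    return headline + support_staff + software_licences


# Illustrative figures only.
prices = CloudPrices(0.10, 0.0, 0.08, 0.10)
headline = headline_cost(prices, instance_hours=10_000, gb_in=500,
                         gb_out=200, gb_stored=500, months=12)
full = total_cost(headline, support_staff=2_000, software_licences=500)

print(f"Headline cost: £{headline:,.2f}")
print(f"Total cost:    £{full:,.2f}")
```

Keeping the two functions separate mirrors the point made in the text: a provider's calculator covers only the first, while a fair comparison with local provision needs the second.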