Cloud Computing: What Are We Handing Over? Ganesh Shankar, Advanced IT Core, Pervasive Technology Institute
Why is the Cloud Relevant to Medical Research? In the current research workflow: Data volumes are increasing exponentially (Next Generation Sequencing). Data storage and transport are at capacity. http://genomebiology.com/2010/11/5/207
A Possible Solution: Bring data and computational tools together in the cloud. http://genomebiology.com/2010/11/5/207
Introduction: What is it? Cloud computing is Internet-based computing, whereby shared resources, software and information are provided to computers and other devices on-demand, like the electricity grid. - Wikipedia
Introduction: Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model promotes availability and is composed of five essential characteristics, three service models, and four deployment models. - NIST http://csrc.nist.gov/groups/sns/cloud-computing/cloud-def-v15.doc
Cloud Providers and Users: Cloud Provider (Amazon, Google). Establishes and maintains computing infrastructure and skills, e.g. Google: >20,000 servers/data center. Large capital expense. Utility pricing model. Offered as a service at various levels: IaaS, PaaS, SaaS
Cloud Providers and Users: Cloud User. Demands computing resources as needed: storage capacity, computational capacity. Elastic, scalable resource: reduces need to design for peak demand. Pays only for services used: reduces capital expenditure, pay as you go
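The point about not designing for peak demand can be made concrete with a little arithmetic. The sketch below is illustrative only: the demand profile, the price per server-hour, and the function names are assumptions, not figures from the talk. It also assumes the same hourly rate for owned and rented capacity, which will rarely hold in practice.

```python
# All figures here are hypothetical, chosen only to illustrate the comparison.

def peak_provisioned_cost(hourly_demand, cost_per_server_hour):
    """Cost when enough servers to cover the busiest hour run for every hour."""
    peak = max(hourly_demand)
    return peak * len(hourly_demand) * cost_per_server_hour


def pay_as_you_go_cost(hourly_demand, cost_per_server_hour):
    """Cost when servers are paid for only in the hours they are needed."""
    return sum(hourly_demand) * cost_per_server_hour


# Hypothetical day: 2 servers for 20 hours, 20 servers during a 4-hour burst.
demand = [2] * 20 + [20] * 4
rate = 0.10  # assumed price per server-hour, identical for both models

fixed = peak_provisioned_cost(demand, rate)
elastic = pay_as_you_go_cost(demand, rate)
print(f"Provisioned for peak: ${fixed:.2f}")
print(f"Pay as you go:        ${elastic:.2f}")
```

The gap between the two numbers grows with how "bursty" the workload is; a flat demand profile gives the same cost under both models.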
We're Giving Up Control. Do We Really Need This? http://genomebiology.com/2010/11/5/207
Cloud Deployment Models. Public Clouds: externally owned and operated resources; service offered over the public internet. Private/Community Clouds: internally owned and operated resources (Eucalyptus, etc.); service restricted to authorized customers, e.g. a group of hospitals. Hybrid Clouds: private cloud interacts with and scales into public cloud
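One way to picture a hybrid cloud "scaling into" a public cloud is a placement rule that fills private capacity first and sends the overflow outside. The following is a minimal sketch under stated assumptions: the job names, core counts, capacity, and the `place_jobs` function are hypothetical, and real schedulers weigh much more (data sensitivity, transfer cost, SLAs).

```python
def place_jobs(jobs, private_capacity):
    """Assign jobs, in order, to the private cloud until it is full.

    jobs: list of (name, cores_needed) tuples.
    private_capacity: total cores available in the private cloud.
    Jobs that do not fit in the remaining private cores go to the public cloud.
    """
    placement = {}
    free = private_capacity
    for name, cores in jobs:
        if cores <= free:
            placement[name] = "private"
            free -= cores
        else:
            placement[name] = "public"
    return placement


# Hypothetical workload: three jobs competing for 16 private cores.
jobs = [("alignment", 8), ("variant-calling", 6), ("assembly", 10)]
print(place_jobs(jobs, private_capacity=16))
```

In a setting with protected data, a rule like this would need an extra check so that restricted jobs never leave the private cloud, whatever the load.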
Levels of Service - IaaS. Infrastructure as a Service (IaaS): computing hardware, physical plant, networking, storage, VM, e.g. Amazon S3 & EC2. Advantage: custom development environment and applications. Disadvantage: increased diligence. Diagram by Burton Group
Levels of Service - PaaS. Platform as a Service (PaaS): all resources needed to build and deploy applications, e.g. JBoss App. Server. Advantage: custom apps from platform. Disadvantage: provider restrictions, greater diligence. Diagram by Burton Group
Levels of Service - SaaS. Software/Application as a Service (SaaS): entire application is available on the web, e.g. Google Apps. Advantage: least investment in technology. Disadvantage: greatest dependence, little customization. Diagram by Burton Group
Cloud Computing Advantages: Defines/optimizes IT mission (what is a core/strategic asset, what can be outsourced?). Defers capital expenditures, other costs. Increases application reliability and availability. Increases application accessibility (web-based)
Cloud Computing Disadvantages: Increases need for diligence. Risks need to be properly evaluated: HIPAA regulations, intellectual capital, etc. Service Level Agreement needs to be properly defined and executed. Costs: data upload & download, inter-cloud. Interoperability: migration between cloud vendors, PaaS dependency
SaaS Example: IU Student Email. IU students can use Google or Microsoft hosted email accounts/apps. Students are not employees: limited intellectual property, HIPAA exposure, export controls. Alumni email: forwarding address; avoid blacklisting (IU=spammer). 6 month transition
Private Cloud Example - IU: Virtual Machines, Backup Solutions
Private Cloud Example - IU. 2006-2007 IT Production Phase: virtualize all IT supported systems, e.g. CAS. 2007-2008 Business Production Phase: virtualize all business systems, e.g. PeopleSoft. Saved 90% over traditional provisioning. 2010: HIPAA aligned, NIST 800-53 at low risk
Services on the Private Cloud. IaaS: base VM with 1 CPU, 1 GB RAM, 35 GB disk, $450/year (Amazon: 1 CPU, 1.7 GB, 160 GB disk, $744/yr). PaaS: Oracle server (IU Simon Cancer Center OnCore System). SaaS: REDCap Clinical Data Management System for IUSM
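The two IaaS price points on this slide are not like for like, which a few lines of arithmetic make visible. This is a rough sketch: the prices and specifications are the ones quoted on the slide, reading "1,7GB" as 1.7 GB of RAM is an assumption, and the dictionary layout and function names are illustrative.

```python
# Specs and annual prices as quoted on the slide.
# Assumption: the Amazon "1,7GB" figure is 1.7 GB of RAM.
offers = {
    "IU base VM": {"cpus": 1, "ram_gb": 1.0, "disk_gb": 35, "usd_per_year": 450},
    "Amazon": {"cpus": 1, "ram_gb": 1.7, "disk_gb": 160, "usd_per_year": 744},
}


def annual_difference(cheaper, dearer):
    """Difference in list price per year between two offers, in USD."""
    return offers[dearer]["usd_per_year"] - offers[cheaper]["usd_per_year"]


def usd_per_unit(name, resource):
    """Annual price divided by the amount of one resource (e.g. 'disk_gb')."""
    offer = offers[name]
    return offer["usd_per_year"] / offer[resource]


print("Price gap per year (USD):", annual_difference("IU base VM", "Amazon"))
for name in offers:
    print(
        f"{name}: {usd_per_unit(name, 'ram_gb'):.2f} USD per GB RAM, "
        f"{usd_per_unit(name, 'disk_gb'):.2f} USD per GB disk"
    )
```

The IU VM has the lower list price, but the Amazon instance comes with more memory and disk, so which one is cheaper per unit depends on the resource that actually constrains the workload.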
Enterprise System Infrastructure [Chart: quantity of Linux VMs and Windows VMs by month, Jan 2006 to Jun 2008; vertical axis 0 to 250]
Private Cloud Example - IU: Comparison to NIST Ideal. Resource pooling: ~1500 servers. Rapid elasticity: half a day. Measured service. Broad network access: IU IP address. On-demand self-service: IT has to create instance (oversight); no credit card payment
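The comparison on this slide can be kept as a small checklist of the five NIST essential characteristics. The sketch below is only one possible encoding: the notes are taken from the slide, while the `restricted` flag is an assumed reading of which notes describe a departure from the NIST ideal, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Characteristic:
    name: str         # NIST essential characteristic
    iu_note: str      # note from the slide; empty where the slide gives none
    restricted: bool  # assumed reading: True where the note describes a limit


IU_PRIVATE_CLOUD = [
    Characteristic("Resource pooling", "~1500 servers", False),
    Characteristic("Rapid elasticity", "half a day", True),
    Characteristic("Measured service", "", False),
    Characteristic("Broad network access", "IU IP address", True),
    Characteristic(
        "On-demand self-service",
        "IT has to create instance (oversight); no credit card payment",
        True,
    ),
]


def deviations(cloud):
    """Names of the characteristics flagged as departing from the ideal."""
    return [c.name for c in cloud if c.restricted]


for c in IU_PRIVATE_CLOUD:
    status = "differs" if c.restricted else "matches"
    print(f"{c.name:<24} {status:<8} {c.iu_note}")
```

Keeping the judgment in a single boolean makes it easy to revisit: changing one flag is enough if, say, half a day is considered rapid enough for the institution's needs.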
IU Private Cloud - Characteristics: Control over restricted/protected data. Cloud exists as long as IU exists. Cost savings without exposure to external companies. Increased responsibility. 2010 Data Center. What did we hand over? Nothing to any external entities
What are we handing over? Datasets meant to be shared (non-production): GenBank, Ensembl, 1000 Genomes Project. Public toolsets: GBrowse, Galaxy, UCSC Genome Browser, caBIG apps
FISMA Compliance: Submitted to GSA for approval Sept. 2009. Certified July 26, 2010. Available to federal, state and local government agencies
Conclusion: The data deluge is worse than we think. New technical approaches are essential. Balance accountability vs. technical needs, cost, strategic interests, etc. Start with resources meant to be shared. Public providers might be able to step in
Acknowledgements: Rob Lowden, Director, Enterprise Infrastructure, UITS; Adam Walsh, Manager, Identity Management Systems, UITS; Bill Barnett, Senior Manager, Life Sciences, UITS