Utilizing the SDSC Cloud Storage Service
PASIG Conference, January 13, 2012
Richard L. Moore, rlm@sdsc.edu
San Diego Supercomputer Center, University of California San Diego
SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO
Traditional supercomputer center storage systems
Functional Systems
- Tape-based archival system: built for capacity
  - We've extended the archive beyond HPC simulation data to experimental data and other digital assets, and as a node in geographically distributed digital preservation systems (e.g. Chronopolis)
- High-bandwidth parallel file system: built for speed
  - Transient data, single-copy reliability
- Home directory system (e.g. NFS): built for robustness and reliability
  - Regular backups
Limitations
- Archival data is difficult to access: high latency, lower bandwidth, user interfaces
- Difficult to share archival data by multiple users
- All too often archived data, particularly HPC simulations, is write-once-read-never
- Not sustainable, and no incentives for users to retain only high-value data
Adapting to emerging requirements and changing technologies
- Exponential data growth, and analysis of that data, are increasingly important to the research enterprise
  - Requires ready access to data, w/ low latency & high bandwidth
- Collaborative team science demands easy data sharing
- Consumer product development drives prices
  - Disk capacities increasing quickly
  - Flash memory becoming more affordable
  - Gordon compute system just now being deployed with 0.25 PB of flash, to fill the latency gap between DRAM and spinning disk
- For HPC systems with historical byte/flop ratios, storage would be an increasingly significant fraction of total system cost
- Can't afford open-ended archival storage; must develop methods to place value on data, especially for long-term high-reliability storage
SDSC is deploying a new repertoire of storage systems
SDSC Cloud
- Storage of Digital Data for Ubiquitous Access and High-Durability
- Access: Multi-platform web interface, S3 interfaces, backup SW
Data Oasis (PFS)
- High-Performance Transient Parallel File System for HPC
- Access: Lustre on HPC Systems (Gordon, Trestles, Triton)
Project Storage
- Purpose: Typical Project / User File Server Storage Needs
- Access: NFS/CIFS, iSCSI
A Paradigm Shift for Long-Term Storage: Access, Sharing and Collaboration
SDSC Cloud: http://cloud.sdsc.edu
- Launched September 2011
- Largest, highest-performance known academic cloud
- 5.5 Petabytes (raw), 8 GB/sec
- System can upload 500 GB in ~1 min
- Automatic dual-copy and verification
- Capacity and performance scale linearly to 100's of petabytes
- Open source platform based on NASA and RackSpace software
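The "500 GB in ~1 min" figure follows directly from the quoted 8 GB/sec aggregate rate. A minimal sketch of that arithmetic, assuming decimal units and that a single upload can use the full aggregate bandwidth (the function and constant names are illustrative, not part of any SDSC tool):

```python
# Illustrative arithmetic only; names are hypothetical.
AGGREGATE_RATE_GB_PER_S = 8.0   # aggregate bandwidth quoted on the slide
UPLOAD_SIZE_GB = 500.0          # upload size quoted on the slide


def transfer_seconds(size_gb: float, rate_gb_per_s: float) -> float:
    """Ideal transfer time, ignoring protocol overhead and contention."""
    if rate_gb_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_gb / rate_gb_per_s


seconds = transfer_seconds(UPLOAD_SIZE_GB, AGGREGATE_RATE_GB_PER_S)
print(f"{seconds:.1f} s")  # 62.5 s, i.e. roughly one minute
```

Real transfers would likely be slower for a single client, since the 8 GB/sec is a system-wide figure.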
Key Features of SDSC Cloud
- Always-there disk-based availability of data
  - Tape latency and multi-user issues addressed
- High reliability
  - Disk RAID; automatic dual-copy; continuous background checksum verification/restoration; offsite replication soon
- Simple data owner user interfaces to data, its management, its access and setting permissions for sharing data
- Easy access to shared data for any users with permission under range of mechanisms (http, APIs, portals, gateways)
- Encryption readily incorporated and addresses issues of storing HIPAA/proprietary data
- Transaction history is logged: track usage, assess utility, support provenance
- Scalable system in both capacity and bandwidth
- Interfaces to commercial and open-source products
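The "dual-copy plus background checksum verification/restoration" idea can be sketched in a few lines. This is a conceptual illustration only, not how SDSC Cloud or OpenStack Swift implements auditing; the in-memory replicas, the choice of SHA-256, and all function names are assumptions made for the example:

```python
import hashlib
from typing import List


def checksum(data: bytes) -> str:
    """Digest recorded when the object is first stored (SHA-256 assumed here)."""
    return hashlib.sha256(data).hexdigest()


def audit_and_restore(replicas: List[bytes], expected: str) -> List[bytes]:
    """Verify every copy against the recorded digest; overwrite bad copies
    with a copy that still verifies."""
    good = next((r for r in replicas if checksum(r) == expected), None)
    if good is None:
        raise RuntimeError("no intact replica left to restore from")
    return [r if checksum(r) == expected else good for r in replicas]


original = b"simulation output"
recorded = checksum(original)
# Second copy has suffered silent corruption ("o" became "0").
copies = [original, b"simulation 0utput"]
copies = audit_and_restore(copies, recorded)
print(all(checksum(c) == recorded for c in copies))  # True
```

With two copies, a single corrupted replica is recoverable; an off-site third copy would extend this to site-level failures.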
Applications of SDSC Cloud
- Shared/published/curated data collections
- HPC simulation data storage and sharing
- Web/portal applications and site hosting
- Application integration using supported APIs
- Serving images/videos
- Backup services
Why OpenStack Swift Cloud Software?
Evaluated Software
- OpenStack Swift: Open Source, Community Support, Highly Configurable
- Eucalyptus: Highly Flexible, Compute Focused
- Caringo Castor: Commercial Software, Long Development Cycle
Industry Standard
- More than 100 leading companies from over a dozen countries are participating in OpenStack, including Cisco, Citrix, Dell, Intel and Microsoft.
Highly Compatible
- Compatibility w/ public OpenStack clouds means it's easy to migrate data and apps to public clouds when desired, based on security policies, economics, and other key business criteria.
Proven Software
- Running the OpenStack cloud operating system means running the same software that powers many large public and private clouds, including RackSpace Cloud Storage.
Control & Flexibility
- Open source platform means not locked to a proprietary vendor, and modular design can integrate with legacy or 3rd-party technologies.
- OpenStack project provided under Apache 2.0 license.
SDSC Cloud Interfaces
[Diagram labels]
Users: Data Owners; External Users
Traditional Clients: GUI Applications, Command Line, SDSC Web I/F
Web Services API: Amazon S3, Rackspace CloudFiles / OpenStack API
Commercial Products: Commvault, Amanda Backup Tools, Crashplan
User-Developed Web Portals/Gateways
Servers: Load Balanced Proxy Servers; Swift Object Storage Cluster
SDSC Cloud Explorer
Rates and Funding Mechanisms
See https://cloud.sdsc.edu/hp/pricing.php for current pricing; HW costs subject to market volatility; contact services@sdsc.edu if interested in service
On Demand Cloud Storage
- Pay monthly per GB used (water-mark)
- U California users: $X/TB-year dual-copy + applicable indirect costs + 50% premium for additional off-site copy (when available)
- Users external to UC: 2*$X/TB-year dual-copy, 3*$X for dual-copy + 1 off-site copy
Condo Cloud Storage
- Recipient buys HW that is integrated into the storage service and pays annual operating costs for maintenance and system administration
- Purchase condo HW at $Y market price (pre-configured head node and disk array; currently 2 TB drives with 8.5 TB usable dual-copy; space will increase over time)
- Annual operating cost: $Z/year/condo + applicable indirect costs & UC-external factors
- User has right to use condo for 5 years; TCO/condo = $Y + 5*$Z over 5 years
*Encryption and HIPAA Compliant Storage is available with both options
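The two rate structures above reduce to simple formulas in the placeholders X, Y and Z. The sketch below encodes them as stated on the slide; indirect costs and "UC-external factors" are left out because the slide gives no values for them, and the dollar amounts in the usage lines are made-up inputs, not SDSC prices:

```python
# Multipliers on the base rate $X per TB-year, keyed by
# (is_uc_user, has_offsite_copy), as listed on the slide.
ON_DEMAND_MULTIPLIER = {
    (True, False): 1.0,   # UC, dual-copy
    (True, True): 1.5,    # UC, dual-copy + 50% premium for off-site copy
    (False, False): 2.0,  # external, dual-copy
    (False, True): 3.0,   # external, dual-copy + 1 off-site copy
}

CONDO_USABLE_TB = 8.5     # usable dual-copy capacity per condo (slide figure)
CONDO_TERM_YEARS = 5      # right-to-use period (slide figure)


def on_demand_rate(x: float, uc_user: bool, offsite: bool) -> float:
    """Rate in $/TB-year, before indirect costs."""
    return x * ON_DEMAND_MULTIPLIER[(uc_user, offsite)]


def condo_tco(y: float, z: float, years: int = CONDO_TERM_YEARS) -> float:
    """TCO per condo = $Y hardware + years * $Z annual operating cost."""
    return y + years * z


def condo_cost_per_tb_year(y: float, z: float) -> float:
    """Condo TCO spread over usable capacity and the 5-year term."""
    return condo_tco(y, z) / (CONDO_USABLE_TB * CONDO_TERM_YEARS)


# Hypothetical inputs purely to exercise the formulas.
print(on_demand_rate(100.0, uc_user=False, offsite=True))  # 300.0
print(condo_tco(1000.0, 200.0))                            # 2000.0
```

Comparing `condo_cost_per_tb_year` with `on_demand_rate` for the same user class gives a rough break-even view between the two options, under the stated simplifications.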
Questions?
You can touch the cloud now:
- Download this presentation, publicly shared from my personal account via http://tinyurl.com/sdsc-PASIG = https://cloud.sdsc.edu/v1/auth_rlm/pasig/pasig-Moore.pdf
- Or go to cloud.sdsc.edu; login as user PASIG, PW PASIG
- Get a trial account with an .edu email address at cloud.sdsc.edu (no charges first 30 days)
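The long download URL above has the shape of an OpenStack Swift object path: API version, account, container, object. A minimal sketch that splits such a URL into those parts, with no network access; the `SwiftObjectRef` type and `parse_swift_url` function are hypothetical helpers written for this example, not part of any Swift client library:

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class SwiftObjectRef(NamedTuple):
    endpoint: str    # scheme and host
    version: str     # API version segment, e.g. "v1"
    account: str     # storage account
    container: str   # container within the account
    obj: str         # object name (may itself contain "/")


def parse_swift_url(url: str) -> SwiftObjectRef:
    """Split a /<version>/<account>/<container>/<object> style URL."""
    parts = urlsplit(url)
    # maxsplit=3 keeps any slashes inside the object name intact.
    segments = parts.path.lstrip("/").split("/", 3)
    if len(segments) != 4 or not all(segments):
        raise ValueError(f"not an object URL: {url!r}")
    version, account, container, obj = segments
    return SwiftObjectRef(f"{parts.scheme}://{parts.netloc}",
                          version, account, container, obj)


ref = parse_swift_url(
    "https://cloud.sdsc.edu/v1/auth_rlm/pasig/pasig-Moore.pdf")
print(ref.account, ref.container, ref.obj)
# auth_rlm pasig pasig-Moore.pdf
```

Read this way, the presentation is the object `pasig-Moore.pdf` in the container `pasig` under the account `auth_rlm`, which matches "publicly shared from my personal account".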