iplant + irods: Enabling data driven collaborations Nirav Merchant iplant Collaborative/Univ. of Arizona nirav@ .arizona.edu VAMP 2012 Utrecht

Size: px
Start display at page:

Download "iplant + irods: Enabling data driven collaborations Nirav Merchant iplant Collaborative/Univ. of Arizona nirav@email.arizona.edu VAMP 2012 Utrecht"

Transcription

1 iplant + irods: Enabling data driven collaborations Nirav Merchant iplant Collaborative/Univ. of Arizona nirav@ .arizona.edu VAMP 2012 Utrecht

2 Topic Coverage About iplant 4 th Paradigm Technology challenges for life sciences iplant Data Store (ids) Challenges of Sharing Data iplant Atmosphere (cloud) Future: Identity+Group+Network with Openflow

3 What is iplant The iplant Cyberinfrastructure Collaborative is building a comprehensive informatics infrastructure for plant biology. Funded by the National Science Foundation (NSF) 2008 (and continuing till 2018) This rapidly evolving infrastructure is sometimes very visible to users (researchers), and sometimes absolutely transparent to them (projects powered by iplant components).

4 The iplant Collabora/ve Cyberinfrastructure Philosophy We have designed iplant to be consistent with the pillars of CIF21* ü High Performance Compu?ng ü Data and Data Analysis ü Virtual Organiza?on ü Learning and Workforce

5 Science Paradigms 1. Thousand years ago: science was empirical describing natural phenomena, observations 2. Last few hundred years: theoretical branch using models, generalizations 3. Last few decades: a computational branch simulating complex phenomena 4. Today: data exploration (escience) unify theory, experiment, and simulation Based on the transcript of a talk given by the late Jim Gray to the National Research Council Computer Science and Telecommunication Board in Mountain View, CA, on January 11,

6 The Fourth Paradigm: Data-Intensive Scientific Discovery Increasingly, scientific breakthroughs will be powered by advanced computing capabilities that help researchers manipulate and explore massive datasets. The speed at which any given scientific discipline advances will depend on how well its researchers collaborate with one another, and with technologists, in areas of escience such as databases, workflow management, visualization, and cloud computing technologies. 6

7 The Discovery Lifecycle The Fourth Paradigm: Data-Intensive Scientific Discovery 7

8 Big Data (yes we have it!)

9 Data- intensive biology necessitates biologists become comfortable with new technology (rather quickly)

10 10

11 One key goal in our infrastructure, training and outreach is to minimize the emphasis on technology and return the focus to biology MaG Meselson & Ultracentrifuge, $500, Sharp, Sambrook, Sugden Gel Electrophoresis Chamber, $250

12 Ways for users to access iplant Atmosphere: cloud compu?ng plaqorm Data Store: secure, cloud- based data storage Discovery Environment: a web portal to many integrated applica?ons (combine data + compute) DNA Subway: genome annota?on, DNA bar- coding (and more) for science educators Founda/on API: For programmers embedding iplant infrastructure capabili?es (Auth, IO, Apps, Jobs, Dir. etc) Command line: for expert access (thru TeraGrid/XSEDE)

13 The iplant Cyberinfrastructure End Users Teragrid XSEDE Computa?onal Users

14 The iplant Discovery Environment A rich web client Consistent interface to bioinforma?cs tools Portal for users who won t want to interact with lower level infrastructure An integrated, extensible system of applica?ons and services Addi?onal intelligence above low level APIs Provenance, Collabora?on, etc.

15 Scalable Computa/on for High- Throughput Inquiry 90,000 Compute Cores Up to 1TB shared memory TACC Lonestar TACC Ranger Growing to ~500,000 cores by end of 2012 PSC Blacklight TACC Corral EBI Web Services

16 iplant Layered Services and Access" End Users iplant Data Store" Scalable" Reliable" Redundant" High-Throughput" Computational Users

17 Powered by iplant The iplant CI is designed as infrastructure. This means it is a platform upon which other projects can build. Use of the iplant infrastructure can take one of several forms: Authentication (~IdM/P, Shib, CAS etc) Storage Computation Application Hosting Web Services Scalability 17

18 Powered by iplant Other major projects are beginning to adopt the iplant CI as their underlying infrastructure (some completely, some in limited ways): BioExtract (computation) CiPRES (authentication, computation) Gates Integrated Breeding Platform (hosting, development, authentication) Galaxy (storage, for now) CoGE (authentication, data store, hosting) Many more (check 18

19 CIPRES Portal Federation

20 irods Developed by Data Intensive Cyber Environments (DICE) Directed by Reagan Moore Developed SRB, the Storage Resource Broker at SDSC, the San Diego Supercomputer Center Most of the group migrated to UNC Chapel Hill in (The group is bi-coastal: DICE-UNC, DICE-UCSD) Released irods, the integrated Rule-Oriented Data System, in 2009 Primary development funding from NSF (and other agencies)

21 irods Data grid middleware Data management infrastructure A framework for procedural implementation of data management policy (policy-driven data management)

22 Resource + Catalogue Server(s)

23 iplant Data Store" Free Your Data" Different Users, " Different Access Needs: " One Data Store"

24 iplant Data Store (ids)" WebDAV DE API i-commands idrop

25 The iplant Data Store Fast data transfers via parallel, file transfer Move large (>2 GB) files with ease Mul?ple, consistent access modes iplant API iplant web apps Desktop mount (FUSE/DAV) Java applet (idrop) Command line (icommands) Tickets and tokens Fine- grained ACL permissions Sharing made simple Access and a storage alloca/on is automa/c with every iplant account

26 Some Challenges Allowing 3 rd party apps to users data Used irods rules for ACL handling E.g. Bisque Image Analysis (updates their web app of data deposition in bisque_data iplant Data Store) SSO allowed jumping between 3 rd party apps (and internal) *Users want to give access to files, directories for download & upload! (anonymous/non iplant/apps) Integrated tickets (tokens) Foundation API (REST access) *Users want fine grain access to permission Restrict access from certain domain (*.arizona.edu) for jobs running on other compute grids (UWisc, OSG) Enhanced tickets to allow host, group, file count, size based control * Funded by NSF OCI : (Klingenstien, Koranda, Merchant)

27 iticket! Niravs-MacBook-Air:$ iticket iticket>ls id: string: X23MQI8I5H70O0e ticket type: write obj type: collection owner name: nirav owner zone: iplant uses count: 0 uses limit: 0 write file count: 0 write file limit: 10 write byte count: 0 write byte limit: 0 expire time: none collection name: /iplant/home/nirav/ticket-incoming No host restrictions No user restrictions No group restrictions

28 irods+shib Building on the ASPiS solution from King's College, which allows web-based applications to be Shib enabled This solution leverages the Apache SP, and manages user accounts based on provided Shib attributes and entitlements Allows customization of behavior by providing a standard set of irods rules and micro services The ASPiS solution is being updated for inclusion in the Java Jargon library, and the out-of-the-box idrop web interface. ASPiS option will run by setting a configuration option that runs a Shibboleth-aware servlet filter. Testing is currently underway on the integration of the ASPiS approach and this integration should be available by end of September Funded by NSF OCI : (Klingenstien, Koranda, Merchant)

29 Customized cloud platform for computing on your terms!

30 Atmosphere: motivation Standalone GUI-based applications are frequently required for analysis GUI apps not easily to transform into web apps Need to handle complex software dependencies (e.g specific bioperl version and R modules) Users needing full control of their software stack (occasional sudo access) All computation does not complete in a 24 hour queue (HPC limitations!) Need to share desktop/applications for collaborative analysis (remote collaborators) Availability of Next Gen map-reduce based algorithms (currently we have limited support)

31 Challenges of existing cloud platforms Amazon Web Services (AWS) Flexible and scalable High level of expertise required for configurations Fairly challenging for biologists to master all steps Limited lifecycle management (cost, time mgmt ) Lack easy desktop integration Lack easy tools for large data transfer

32 Steps to get started!

33 What is Atmosphere? Self-service cloud infrastructure Designed to make underlying cloud infrastructure easy to use by novice user Built on open source Eucalyptus Fully integrated into iplant authentication and storage and HPC capabilities Enables users to build custom images/ appliances and share with community Cross-platform desktop access to GUI applications in the cloud (using VNC) Provide easy web based access to resources

34 The iplant Collabora/ve Project Atmosphere : Custom Cloud Compu?ng API- compa?ble implementa?on of Amazon EC2/S3 interfaces Virtualize the execu?on environment for applica?ons and services Get Up to 12 core / 48 GB instance Access to Cloud Storage + EBS Big data and the desktop are co- local again Bring your data to Atmosphere VM for interac?ve access and analysis Send it back to the DE for transac?onal analysis >60 hosted applica?ons in Atmosphere today, including users from USDA, Forest Service, database providers, etc. (30 more for postdocs and grad students for training classes)

35 Atmosphere: Collaboration iplant Data Store

36 Lifecycle

37 How to Connect

38 Different Ways to Log in to VMs

39 Atmosphere: Launch a new VM

40 Atmosphere: Access a running VM

41 Atmosphere: Log in via shell

42 OpenFlow+OpenStack(Quantum) Allow custom network topology for specific user defined groups Placement of irods resource server and instances Floodlight handles the Openstack integration and openflow switch management Use RESTful interface from Grouper Placement of instance within our distributed infrastructure is easier (prevent vlan pain) Apply group policy, security, data access at network level (not in VM which is easy to override as sudo/ root)

43 Challenges Collaboration as ad-hoc groups/teams is becoming increasingly prevalent Defined groups need to persist across ALL our infrastructure Combining the capabilities of cloud, network virtualization and iplant Data store (with understanding of groups!) Our apps are web + command line (need a happy medium for emerging auth systems!)

44 Will Computers Crash Genomics? Science Vol. 331 Feb 2011

45 Acknowledgement All amazing iplant staff, students, collaborators and community members! Too many to name!!

OpenNebula Open Souce Solution for DC Virtualization. C12G Labs. Online Webinar

OpenNebula Open Souce Solution for DC Virtualization. C12G Labs. Online Webinar OpenNebula Open Souce Solution for DC Virtualization C12G Labs Online Webinar What is OpenNebula? Multi-tenancy, Elasticity and Automatic Provision on Virtualized Environments I m using virtualization/cloud,

More information

OpenNebula Open Souce Solution for DC Virtualization

OpenNebula Open Souce Solution for DC Virtualization 13 th LSM 2012 7 th -12 th July, Geneva OpenNebula Open Souce Solution for DC Virtualization Constantino Vázquez Blanco OpenNebula.org What is OpenNebula? Multi-tenancy, Elasticity and Automatic Provision

More information

OpenNebula The Open Source Solution for Data Center Virtualization

OpenNebula The Open Source Solution for Data Center Virtualization LinuxTag April 23rd 2012, Berlin OpenNebula The Open Source Solution for Data Center Virtualization Hector Sanjuan OpenNebula.org 1 What is OpenNebula? Multi-tenancy, Elasticity and Automatic Provision

More information

OpenNebula Open Souce Solution for DC Virtualization

OpenNebula Open Souce Solution for DC Virtualization OSDC 2012 25 th April, Nürnberg OpenNebula Open Souce Solution for DC Virtualization Constantino Vázquez Blanco OpenNebula.org What is OpenNebula? Multi-tenancy, Elasticity and Automatic Provision on Virtualized

More information

XSEDE Overview John Towns

XSEDE Overview John Towns April 15, 2011 XSEDE Overview John Towns XD Solicitation/XD Program extreme Digital Resources for Science and Engineering (NSF 08 571) Extremely Complicated High Performance Computing and Storage Services

More information

Neptune. A Domain Specific Language for Deploying HPC Software on Cloud Platforms. Chris Bunch Navraj Chohan Chandra Krintz Khawaja Shams

Neptune. A Domain Specific Language for Deploying HPC Software on Cloud Platforms. Chris Bunch Navraj Chohan Chandra Krintz Khawaja Shams Neptune A Domain Specific Language for Deploying HPC Software on Cloud Platforms Chris Bunch Navraj Chohan Chandra Krintz Khawaja Shams ScienceCloud 2011 @ San Jose, CA June 8, 2011 Cloud Computing Three

More information

Clusters in the Cloud

Clusters in the Cloud Clusters in the Cloud Dr. Paul Coddington, Deputy Director Dr. Shunde Zhang, Compu:ng Specialist eresearch SA October 2014 Use Cases Make the cloud easier to use for compute jobs Par:cularly for users

More information

Data Management using irods

Data Management using irods Data Management using irods Fundamentals of Data Management September 2014 Albert Heyrovsky Applications Developer, EPCC a.heyrovsky@epcc.ed.ac.uk 2 Course outline Why talk about irods? What is irods?

More information

Exploring Science Gateway Use Cases for Cloud Computing

Exploring Science Gateway Use Cases for Cloud Computing Exploring Science Gateway Use Cases for Cloud Computing "! M. A. Miller, P. Papadopoulos, A. Majumdar, S. Smallen, N. Wilkins-Diehr, R. P. Wagner, M. Tatineni, R. S. Sinkovits, R. L. Moore, M. L. Norman"

More information

irods Overview Intro to Data Grids and Policy-Driven Data Management!!Leesa Brieger, RENCI! Reagan Moore, DICE & RENCI!

irods Overview Intro to Data Grids and Policy-Driven Data Management!!Leesa Brieger, RENCI! Reagan Moore, DICE & RENCI! irods Overview Intro to Data Grids and Policy-Driven Data Management!!Leesa Brieger, RENCI! Reagan Moore, DICE & RENCI! Renaissance Computing Institute (RENCI) A research unit of UNC Chapel Hill Current

More information

Clodoaldo Barrera Chief Technical Strategist IBM System Storage. Making a successful transition to Software Defined Storage

Clodoaldo Barrera Chief Technical Strategist IBM System Storage. Making a successful transition to Software Defined Storage Clodoaldo Barrera Chief Technical Strategist IBM System Storage Making a successful transition to Software Defined Storage Open Server Summit Santa Clara Nov 2014 Data at the core of everything Data is

More information

SURFsara HPC Cloud Workshop

SURFsara HPC Cloud Workshop SURFsara HPC Cloud Workshop doc.hpccloud.surfsara.nl UvA workshop 2016-01-25 UvA HPC Course Jan 2016 Anatoli Danezi, Markus van Dijk cloud-support@surfsara.nl Agenda Introduction and Overview (current

More information

FREE computing using Amazon EC2

FREE computing using Amazon EC2 FREE computing using Amazon EC2 Seong-Hwan Jun 1 1 Department of Statistics Univ of British Columbia Nov 1st, 2012 / Student seminar Outline Basics of servers Amazon EC2 Setup R on an EC2 instance Stat

More information

SURFsara HPC Cloud Workshop

SURFsara HPC Cloud Workshop SURFsara HPC Cloud Workshop www.cloud.sara.nl Tutorial 2014-06-11 UvA HPC and Big Data Course June 2014 Anatoli Danezi, Markus van Dijk cloud-support@surfsara.nl Agenda Introduction and Overview (current

More information

Big Data and Clouds: Challenges and Opportuni5es

Big Data and Clouds: Challenges and Opportuni5es Big Data and Clouds: Challenges and Opportuni5es NIST January 15 2013 Geoffrey Fox gcf@indiana.edu h"p://www.infomall.org h"p://www.futuregrid.org School of Informa;cs and Compu;ng Digital Science Center

More information

Building Platform as a Service for Scientific Applications

Building Platform as a Service for Scientific Applications Building Platform as a Service for Scientific Applications Moustafa AbdelBaky moustafa@cac.rutgers.edu Rutgers Discovery Informa=cs Ins=tute (RDI 2 ) The NSF Cloud and Autonomic Compu=ng Center Department

More information

The OpenNebula Cloud Platform for Data Center Virtualization

The OpenNebula Cloud Platform for Data Center Virtualization CloudOpen 2012 San Diego, USA, August 29th, 2012 The OpenNebula Cloud Platform for Data Center Virtualization Carlos Martín Project Engineer Acknowledgments The research leading to these results has received

More information

TACC, its Natural History Collections and iplant

TACC, its Natural History Collections and iplant TACC, its Natural History Collections and iplant AIM-UP Santa Fe, NM October 16, 2010 TACC - Mission To enable discoveries that advance science and society through the application of advanced computing

More information

Remote & Collaborative Visualization. Texas Advanced Compu1ng Center

Remote & Collaborative Visualization. Texas Advanced Compu1ng Center Remote & Collaborative Visualization Texas Advanced Compu1ng Center So6ware Requirements SSH client VNC client Recommended: TigerVNC http://sourceforge.net/projects/tigervnc/files/ Web browser with Java

More information

A Service for Data-Intensive Computations on Virtual Clusters

A Service for Data-Intensive Computations on Virtual Clusters A Service for Data-Intensive Computations on Virtual Clusters Executing Preservation Strategies at Scale Rainer Schmidt, Christian Sadilek, and Ross King rainer.schmidt@arcs.ac.at Planets Project Permanent

More information

U"lizing the SDSC Cloud Storage Service

Ulizing the SDSC Cloud Storage Service U"lizing the SDSC Cloud Storage Service PASIG Conference January 13, 2012 Richard L. Moore rlm@sdsc.edu San Diego Supercomputer Center University of California San Diego SAN DIEGO SUPERCOMPUTER CENTER

More information

Early Cloud Experiences with the Kepler Scientific Workflow System

Early Cloud Experiences with the Kepler Scientific Workflow System Available online at www.sciencedirect.com Procedia Computer Science 9 (2012 ) 1630 1634 International Conference on Computational Science, ICCS 2012 Early Cloud Experiences with the Kepler Scientific Workflow

More information

Integrated Rule-based Data Management System for Genome Sequencing Data

Integrated Rule-based Data Management System for Genome Sequencing Data Integrated Rule-based Data Management System for Genome Sequencing Data A Research Data Management (RDM) Green Shoots Pilots Project Report by Michael Mueller, Simon Burbidge, Steven Lawlor and Jorge Ferrer

More information

Cloud Computing Architecture with OpenNebula HPC Cloud Use Cases

Cloud Computing Architecture with OpenNebula HPC Cloud Use Cases NASA Ames NASA Advanced Supercomputing (NAS) Division California, May 24th, 2012 Cloud Computing Architecture with OpenNebula HPC Cloud Use Cases Ignacio M. Llorente Project Director OpenNebula Project.

More information

Accelerate > Converged Storage Infrastructure. DDN Case Study. ddn.com. 2013 DataDirect Networks. All Rights Reserved

Accelerate > Converged Storage Infrastructure. DDN Case Study. ddn.com. 2013 DataDirect Networks. All Rights Reserved DDN Case Study Accelerate > Converged Storage Infrastructure 2013 DataDirect Networks. All Rights Reserved The University of Florida s (ICBR) offers access to cutting-edge technologies designed to enable

More information

Archiving, Indexing and Accessing Web Materials: Solutions for large amounts of data

Archiving, Indexing and Accessing Web Materials: Solutions for large amounts of data Archiving, Indexing and Accessing Web Materials: Solutions for large amounts of data David Minor 1, Reagan Moore 2, Bing Zhu, Charles Cowart 4 1. (88)4-104 minor@sdsc.edu San Diego Supercomputer Center

More information

Assignment # 1 (Cloud Computing Security)

Assignment # 1 (Cloud Computing Security) Assignment # 1 (Cloud Computing Security) Group Members: Abdullah Abid Zeeshan Qaiser M. Umar Hayat Table of Contents Windows Azure Introduction... 4 Windows Azure Services... 4 1. Compute... 4 a) Virtual

More information

Cluster, Grid, Cloud Concepts

Cluster, Grid, Cloud Concepts Cluster, Grid, Cloud Concepts Kalaiselvan.K Contents Section 1: Cluster Section 2: Grid Section 3: Cloud Cluster An Overview Need for a Cluster Cluster categorizations A computer cluster is a group of

More information

globus online Cloud-based services for (reproducible) science Ian Foster Computation Institute University of Chicago and Argonne National Laboratory

globus online Cloud-based services for (reproducible) science Ian Foster Computation Institute University of Chicago and Argonne National Laboratory globus online Cloud-based services for (reproducible) science Ian Foster Computation Institute University of Chicago and Argonne National Laboratory Computation Institute (CI) Apply to challenging problems

More information

GTC Presentation March 19, 2013. Copyright 2012 Penguin Computing, Inc. All rights reserved

GTC Presentation March 19, 2013. Copyright 2012 Penguin Computing, Inc. All rights reserved GTC Presentation March 19, 2013 Copyright 2012 Penguin Computing, Inc. All rights reserved Session S3552 Room 113 S3552 - Using Tesla GPUs, Reality Server and Penguin Computing's Cloud for Visualizing

More information

HDF5-iRODS Project. August 20, 2008

HDF5-iRODS Project. August 20, 2008 P A G E 1 HDF5-iRODS Project Final report Peter Cao The HDF Group 1901 S. First Street, Suite C-2 Champaign, IL 61820 xcao@hdfgroup.org Mike Wan San Diego Supercomputer Center University of California

More information

Scientific and Technical Applications as a Service in the Cloud

Scientific and Technical Applications as a Service in the Cloud Scientific and Technical Applications as a Service in the Cloud University of Bern, 28.11.2011 adapted version Wibke Sudholt CloudBroker GmbH Technoparkstrasse 1, CH-8005 Zurich, Switzerland Phone: +41

More information

DataGrids 2.0 irods - A Second Generation Data Cyberinfrastructure. Arcot (RAJA) Rajasekar DICE/SDSC/UCSD

DataGrids 2.0 irods - A Second Generation Data Cyberinfrastructure. Arcot (RAJA) Rajasekar DICE/SDSC/UCSD DataGrids 2.0 irods - A Second Generation Data Cyberinfrastructure Arcot (RAJA) Rajasekar DICE/SDSC/UCSD What is SRB? First Generation Data Grid middleware developed at the San Diego Supercomputer Center

More information

Getting Started Hacking on OpenNebula

Getting Started Hacking on OpenNebula LinuxTag 2013 Berlin, Germany, May 22nd Getting Started Hacking on OpenNebula Carlos Martín Project Engineer Acknowledgments The research leading to these results has received funding from Comunidad de

More information

Cornell University Center for Advanced Computing A Sustainable Business Model for Advanced Research Computing

Cornell University Center for Advanced Computing A Sustainable Business Model for Advanced Research Computing Cornell University Center for Advanced Computing A Sustainable Business Model for Advanced Research Computing David A. Lifka lifka@cac.cornell.edu 4/20/13 www.cac.cornell.edu 1 My Background 2007 Cornell

More information

Cloud Computing. Lecture 24 Cloud Platform Comparison 2014-2015

Cloud Computing. Lecture 24 Cloud Platform Comparison 2014-2015 Cloud Computing Lecture 24 Cloud Platform Comparison 2014-2015 1 Up until now Introduction, Definition of Cloud Computing Pre-Cloud Large Scale Computing: Grid Computing Content Distribution Networks Cycle-Sharing

More information

Utilizing the SDSC Cloud Storage Service

Utilizing the SDSC Cloud Storage Service Utilizing the SDSC Cloud Storage Service PASIG Conference January 13, 2012 Richard L. Moore rlm@sdsc.edu San Diego Supercomputer Center University of California San Diego Traditional supercomputer center

More information

Data Services for Campus Researchers

Data Services for Campus Researchers Data Services for Campus Researchers Research Data Management Implementations Workshop March 13, 2013 Richard Moore SDSC Deputy Director & UCSD RCI Project Manager rlm@sdsc.edu SDSC Cloud: A Storage Paradigm

More information

Stanford SDN-Based Private Cloud. Johan van Reijendam (jvanreij@stanford.edu) Stanford University

Stanford SDN-Based Private Cloud. Johan van Reijendam (jvanreij@stanford.edu) Stanford University Stanford SDN-Based Private Cloud (jvanreij@stanford.edu) Stanford University Executive Summary The Web and its infrastructure continue to make phenomenal progress, allowing the creation and scaling of

More information

MyCloudLab: An Interactive Web-based Management System for Cloud Computing Administration

MyCloudLab: An Interactive Web-based Management System for Cloud Computing Administration MyCloudLab: An Interactive Web-based Management System for Cloud Computing Administration Hoi-Wan Chan 1, Min Xu 2, Chung-Pan Tang 1, Patrick P. C. Lee 1 & Tsz-Yeung Wong 1, 1 Department of Computer Science

More information

OpenStack @ Cisco. Daneyon Hansen 3/28/2012. 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confiden+al 1

OpenStack @ Cisco. Daneyon Hansen 3/28/2012. 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confiden+al 1 OpenStack @ Cisco Daneyon Hansen 3/28/2012 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confiden+al 1 Customer value is moving up into sogware and web services Virtualiza+on and internet

More information

Exploring Software Defined Federated Infrastructures for Science

Exploring Software Defined Federated Infrastructures for Science Exploring Software Defined Federated Infrastructures for Science Manish Parashar NSF Cloud and Autonomic Computing Center (CAC) Rutgers Discovery Informatics Institute (RDI 2 ) Rutgers, The State University

More information

Astrophysics with Terabyte Datasets. Alex Szalay, JHU and Jim Gray, Microsoft Research

Astrophysics with Terabyte Datasets. Alex Szalay, JHU and Jim Gray, Microsoft Research Astrophysics with Terabyte Datasets Alex Szalay, JHU and Jim Gray, Microsoft Research Living in an Exponential World Astronomers have a few hundred TB now 1 pixel (byte) / sq arc second ~ 4TB Multi-spectral,

More information

CYBERINFRASTRUCTURE FRAMEWORK FOR 21 st CENTURY SCIENCE AND ENGINEERING (CIF21)

CYBERINFRASTRUCTURE FRAMEWORK FOR 21 st CENTURY SCIENCE AND ENGINEERING (CIF21) CYBERINFRASTRUCTURE FRAMEWORK FOR 21 st CENTURY SCIENCE AND ENGINEERING (CIF21) Goal Develop and deploy comprehensive, integrated, sustainable, and secure cyberinfrastructure (CI) to accelerate research

More information

idigbio Technology, Cloud and Appliances

idigbio Technology, Cloud and Appliances idigbio Technology, Cloud and Appliances Jose Fortes (on behalf of the idigbio IT team) idigbio External Advisory Board Meeting 2012 (Project Year 1) Supported by NSF Award EF-1115210 CI Stakeholders ALA

More information

irods and Metadata survey Version 0.1 Date March Abhijeet Kodgire akodgire@indiana.edu 25th

irods and Metadata survey Version 0.1 Date March Abhijeet Kodgire akodgire@indiana.edu 25th irods and Metadata survey Version 0.1 Date 25th March Purpose Survey of Status Complete Author Abhijeet Kodgire akodgire@indiana.edu Table of Contents 1 Abstract... 3 2 Categories and Subject Descriptors...

More information

Cloud Computing and Amazon Web Services

Cloud Computing and Amazon Web Services Cloud Computing and Amazon Web Services Gary A. McGilvary edinburgh data.intensive research 1 OUTLINE 1. An Overview of Cloud Computing 2. Amazon Web Services 3. Amazon EC2 Tutorial 4. Conclusions 2 CLOUD

More information

The National Consortium for Data Science (NCDS)

The National Consortium for Data Science (NCDS) The National Consortium for Data Science (NCDS) A Public-Private Partnership to Advance Data Science Ashok Krishnamurthy PhD Deputy Director, RENCI University of North Carolina, Chapel Hill What is NCDS?

More information

STeP-IN SUMMIT 2013. June 18 21, 2013 at Bangalore, INDIA. Performance Testing of an IAAS Cloud Software (A CloudStack Use Case)

STeP-IN SUMMIT 2013. June 18 21, 2013 at Bangalore, INDIA. Performance Testing of an IAAS Cloud Software (A CloudStack Use Case) 10 th International Conference on Software Testing June 18 21, 2013 at Bangalore, INDIA by Sowmya Krishnan, Senior Software QA Engineer, Citrix Copyright: STeP-IN Forum and Quality Solutions for Information

More information

2) Xen Hypervisor 3) UEC

2) Xen Hypervisor 3) UEC 5. Implementation Implementation of the trust model requires first preparing a test bed. It is a cloud computing environment that is required as the first step towards the implementation. Various tools

More information

Building Storage Service in a Private Cloud

Building Storage Service in a Private Cloud Building Storage Service in a Private Cloud Sateesh Potturu & Deepak Vasudevan Wipro Technologies Abstract Storage in a private cloud is the storage that sits within a particular enterprise security domain

More information

Moving beyond Virtualization as you make your Cloud journey. David Angradi

Moving beyond Virtualization as you make your Cloud journey. David Angradi Moving beyond Virtualization as you make your Cloud journey David Angradi Today, there is a six (6) week SLA for VM provisioning it s easy to provision a VM, the other elements change storage, network

More information

How To Create A Plant Science Cloud Platform

How To Create A Plant Science Cloud Platform iplant Atmosphere: A Gateway to Cloud Infrastructure for the Plant Sciences Edwin Skidmore edwin@iplantcollaborative.org Sriramu Singaram sriramus@iplantcollaborative.org Seung-jin Kim seungjin@iplantcollaborative.org

More information

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

Copyright 2012, Oracle and/or its affiliates. All rights reserved. 1 Oracle Big Data Appliance Releases 2.5 and 3.0 Ralf Lange Global ISV & OEM Sales Agenda Quick Overview on BDA and its Positioning Product Details and Updates Security and Encryption New Hadoop Versions

More information

Data Management in the Cloud: Limitations and Opportunities. Annies Ductan

Data Management in the Cloud: Limitations and Opportunities. Annies Ductan Data Management in the Cloud: Limitations and Opportunities Annies Ductan Discussion Outline: Introduc)on Overview Vision of Cloud Compu8ng Managing Data in The Cloud Cloud Characteris8cs Data Management

More information

SAP Crystal Reports & SAP HANA: Integration & Roadmap Kenneth Li SAP SESSION CODE: 0401

SAP Crystal Reports & SAP HANA: Integration & Roadmap Kenneth Li SAP SESSION CODE: 0401 SAP Crystal Reports & SAP HANA: Integration & Roadmap Kenneth Li SAP SESSION CODE: 0401 LEARNING POINTS Learn about Crystal Reports for HANA Glance at the road map for the product Overview of deploying

More information

irods Overview Introduction to Data Grids, Policy-Driven Data Management, and Enterprise irods

irods Overview Introduction to Data Grids, Policy-Driven Data Management, and Enterprise irods irods Overview Introduction to Data Grids, Policy-Driven Data Management, and Enterprise irods Renaissance Computing Institute (RENCI) A research unit of UNC Chapel Hill Directed by Stan Ahalt, formerly

More information

BSC vision on Big Data and extreme scale computing

BSC vision on Big Data and extreme scale computing BSC vision on Big Data and extreme scale computing Jesus Labarta, Eduard Ayguade,, Fabrizio Gagliardi, Rosa M. Badia, Toni Cortes, Jordi Torres, Adrian Cristal, Osman Unsal, David Carrera, Yolanda Becerra,

More information

HP Virtualization Performance Viewer

HP Virtualization Performance Viewer HP Virtualization Performance Viewer Efficiently detect and troubleshoot performance issues in virtualized environments Jean-François Muller - Principal Technical Consultant - jeff.muller@hp.com HP Business

More information

GeoCloud Project Report GEOSS Clearinghouse

GeoCloud Project Report GEOSS Clearinghouse GeoCloud Project Report GEOSS Clearinghouse Qunying Huang, Doug Nebert, Chaowei Yang, Kai Liu 2011.12.06 Description of Application GEOSS clearinghouse is a FGDC, GEO, and NASA project that connects directly

More information

IBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud

IBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud IBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud February 25, 2014 1 Agenda v Mapping clients needs to cloud technologies v Addressing your pain

More information

The Mantid Project. The challenges of delivering flexible HPC for novice end users. Nicholas Draper SOS18

The Mantid Project. The challenges of delivering flexible HPC for novice end users. Nicholas Draper SOS18 The Mantid Project The challenges of delivering flexible HPC for novice end users Nicholas Draper SOS18 What Is Mantid A framework that supports high-performance computing and visualisation of scientific

More information

ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE

ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE Hadoop Storage-as-a-Service ABSTRACT This White Paper illustrates how EMC Elastic Cloud Storage (ECS ) can be used to streamline the Hadoop data analytics

More information

Storage Virtualization

Storage Virtualization Section 2 : Storage Networking Technologies and Virtualization Storage Virtualization Chapter 10 EMC Proven Professional The #1 Certification Program in the information storage and management industry

More information

On Enabling Hydrodynamics Data Analysis of Analytical Ultracentrifugation Experiments

On Enabling Hydrodynamics Data Analysis of Analytical Ultracentrifugation Experiments On Enabling Hydrodynamics Data Analysis of Analytical Ultracentrifugation Experiments 18. June 2013 Morris Reidel, Shahbaz Memon, et al. Outline Background Ultrascan Application Ultrascan Software Components

More information

Science Gateways What are they and why are they having such a tremendous impact on science? Nancy Wilkins- Diehr wilkinsn@sdsc.edu

Science Gateways What are they and why are they having such a tremendous impact on science? Nancy Wilkins- Diehr wilkinsn@sdsc.edu Science Gateways What are they and why are they having such a tremendous impact on science? Nancy Wilkins- Diehr wilkinsn@sdsc.edu What is a science gateway? science gateway /sī əәns gāt wā / n. 1. an

More information

An enterprise- grade cloud management platform that enables on- demand, self- service IT operating models for Global 2000 enterprises

An enterprise- grade cloud management platform that enables on- demand, self- service IT operating models for Global 2000 enterprises agility PLATFORM Product Whitepaper An enterprise- grade cloud management platform that enables on- demand, self- service IT operating models for Global 2000 enterprises ServiceMesh 233 Wilshire Blvd,

More information

VMware on VMware: Private Cloud Case Study Customer Presentation

VMware on VMware: Private Cloud Case Study Customer Presentation VMware on VMware: Private Cloud Case Study Customer Presentation 2009 VMware Inc. All rights reserved Agenda VMware IT landscape Motivations for the Cloud Private Cloud Stack Impact of moving to the Cloud

More information

CloudCenter Full Lifecycle Management. An application-defined approach to deploying and managing applications in any datacenter or cloud environment

CloudCenter Full Lifecycle Management. An application-defined approach to deploying and managing applications in any datacenter or cloud environment CloudCenter Full Lifecycle Management An application-defined approach to deploying and managing applications in any datacenter or cloud environment CloudCenter Full Lifecycle Management Page 2 Table of

More information

INTEGRATED RULE ORIENTED DATA SYSTEM (IRODS)

INTEGRATED RULE ORIENTED DATA SYSTEM (IRODS) INTEGRATED RULE ORIENTED DATA SYSTEM (IRODS) Todd BenDor Associate Professor Dept. of City and Regional Planning UNC-Chapel Hill bendor@unc.edu http://irods.org/ SESYNC Model Integration Workshop Important

More information

Managing Next Generation Sequencing Data with irods

Managing Next Generation Sequencing Data with irods Managing Next Generation Sequencing Data with irods Presented by Dan Bedard // danb@renci.org at the 9 th International Conference on Genomics Shenzhen, China September 12, 2014 Managing NGS Data with

More information

Scalable Services for Digital Preservation

Scalable Services for Digital Preservation Scalable Services for Digital Preservation A Perspective on Cloud Computing Rainer Schmidt, Christian Sadilek, and Ross King Digital Preservation (DP) Providing long-term access to growing collections

More information

Comparing Open Source Private Cloud (IaaS) Platforms

Comparing Open Source Private Cloud (IaaS) Platforms Comparing Open Source Private Cloud (IaaS) Platforms Lance Albertson OSU Open Source Lab Associate Director of Operations lance@osuosl.org / @ramereth About me OSU Open Source Lab Server hosting for Open

More information

Simplifying Big Data Deployments in Cloud Environments with Mellanox Interconnects and QualiSystems Orchestration Solutions

Simplifying Big Data Deployments in Cloud Environments with Mellanox Interconnects and QualiSystems Orchestration Solutions Simplifying Big Data Deployments in Cloud Environments with Mellanox Interconnects and QualiSystems Orchestration Solutions 64% of organizations were investing or planning to invest on Big Data technology

More information

WOS Cloud. ddn.com. Personal Storage for the Enterprise. DDN Solution Brief

WOS Cloud. ddn.com. Personal Storage for the Enterprise. DDN Solution Brief DDN Solution Brief Personal Storage for the Enterprise WOS Cloud Secure, Shared Drop-in File Access for Enterprise Users, Anytime and Anywhere 2011 DataDirect Networks. All Rights Reserved DDN WOS Cloud

More information

Data Semantics Aware Cloud for High Performance Analytics

Data Semantics Aware Cloud for High Performance Analytics Data Semantics Aware Cloud for High Performance Analytics Microsoft Future Cloud Workshop 2011 June 2nd 2011, Prof. Jun Wang, Computer Architecture and Storage System Laboratory (CASS) Acknowledgement

More information

Amazon EC2 Product Details Page 1 of 5

Amazon EC2 Product Details Page 1 of 5 Amazon EC2 Product Details Page 1 of 5 Amazon EC2 Functionality Amazon EC2 presents a true virtual computing environment, allowing you to use web service interfaces to launch instances with a variety of

More information

PASIG May 12, 2012. Jacob Farmer, CTO Cambridge Computer

PASIG May 12, 2012. Jacob Farmer, CTO Cambridge Computer Adding Intelligence to Conventional NAS and File Systems: Metadata, Backups, and Data Life Cycle Management PASIG May 12, 2012 Presented by: Jacob Farmer, CTO Cambridge Computer Copyright 2009-2011, Cambridge

More information

THE CCLRC DATA PORTAL

THE CCLRC DATA PORTAL THE CCLRC DATA PORTAL Glen Drinkwater, Shoaib Sufi CCLRC Daresbury Laboratory, Daresbury, Warrington, Cheshire, WA4 4AD, UK. E-mail: g.j.drinkwater@dl.ac.uk, s.a.sufi@dl.ac.uk Abstract: The project aims

More information

RESPONSES TO QUESTIONS AND REQUESTS FOR CLARIFICATION Updated 7/1/15 (Question 53 and 54)

RESPONSES TO QUESTIONS AND REQUESTS FOR CLARIFICATION Updated 7/1/15 (Question 53 and 54) RESPONSES TO QUESTIONS AND REQUESTS FOR CLARIFICATION Updated 7/1/15 (Question 53 and 54) COLORADO HOUSING AND FINANCE AUTHORITY 1981 BLAKE STREET DENVER, CO 80202 REQUEST FOR PROPOSAL Intranet Replacement

More information

Your Data, Your Way. The iplant Foundation API Data Services

Your Data, Your Way. The iplant Foundation API Data Services Your Data, Your Way The iplant Foundation API Data Services Rion Dooley, Matthew Vaughn, Steven Terry, Dan Stanzione Texas Advanced Computing Center University of Texas at Austin Austin, USA [dooley, vaughn,

More information

A national science & engineering cloud. Kurt A. Seiffert Enterprise Architect, Indiana University Research Technologies

A national science & engineering cloud. Kurt A. Seiffert Enterprise Architect, Indiana University Research Technologies A national science & engineering cloud Prepared for GlobusWorld 2015 14 April 2015 Kurt A. Seiffert Enterprise Architect, Indiana University Research Technologies funded by the National Science Foundation

More information

About Network Data Collector

About Network Data Collector CHAPTER 2 About Network Data Collector The Network Data Collector is a telnet and SNMP-based data collector for Cisco devices which is used by customers to collect data for Net Audits. It provides a robust

More information

Denis Caromel, CEO Ac.veEon. Orchestrate and Accelerate Applica.ons. Open Source Cloud Solu.ons Hybrid Cloud: Private with Burst Capacity

Denis Caromel, CEO Ac.veEon. Orchestrate and Accelerate Applica.ons. Open Source Cloud Solu.ons Hybrid Cloud: Private with Burst Capacity Cloud computing et Virtualisation : applications au domaine de la Finance Denis Caromel, CEO Ac.veEon Orchestrate and Accelerate Applica.ons Open Source Cloud Solu.ons Hybrid Cloud: Private with Burst

More information

Release Notes. CTERA Portal 3.2.43. May 2013. CTERA Portal 3.2.43 Release Notes 1

Release Notes. CTERA Portal 3.2.43. May 2013. CTERA Portal 3.2.43 Release Notes 1 Release Notes CTERA Portal 3.2.43 May 2013 CTERA Portal 3.2.43 Release Notes 1 1 Release Contents Copyright 2009-2013 CTERA Networks Ltd. All rights reserved. No part of this document may be reproduced

More information

Hadoop on the Gordon Data Intensive Cluster

Hadoop on the Gordon Data Intensive Cluster Hadoop on the Gordon Data Intensive Cluster Amit Majumdar, Scientific Computing Applications Mahidhar Tatineni, HPC User Services San Diego Supercomputer Center University of California San Diego Dec 18,

More information

GenomeSpace Architecture

GenomeSpace Architecture GenomeSpace Architecture The primary services, or components, are shown in Figure 1, the high level GenomeSpace architecture. These include (1) an Authorization and Authentication service, (2) an analysis

More information

Amazon-Free Big Data Analysis. Michael R. Crusoe the GED Lab @ MSU @JKhedron #NGS2013 2013-06- 18

Amazon-Free Big Data Analysis. Michael R. Crusoe the GED Lab @ MSU @JKhedron #NGS2013 2013-06- 18 Amazon-Free Big Data Analysis Michael R. Crusoe the GED Lab @ MSU @JKhedron #NGS2013 2013-06- 18 Overview Dedicated vs Shared computing Evaluating Computing Resources XSEDE Mason Lonestar Stampede Blacklight

More information

Science Gateways in the US. Nancy Wilkins-Diehr wilkinsn@sdsc.edu

Science Gateways in the US. Nancy Wilkins-Diehr wilkinsn@sdsc.edu Science Gateways in the US Nancy Wilkins-Diehr wilkinsn@sdsc.edu NSF vision for cyberinfrastructure in the 21st century Software is critical to today s scientific advances Science is all about connections

More information

Grid Computing @ Sun Carlo Nardone. Technical Systems Ambassador GSO Client Solutions

Grid Computing @ Sun Carlo Nardone. Technical Systems Ambassador GSO Client Solutions Grid Computing @ Sun Carlo Nardone Technical Systems Ambassador GSO Client Solutions Phases of Grid Computing Cluster Grids Single user community Single organization Campus Grids Multiple user communities

More information

Automated and Scalable Data Management System for Genome Sequencing Data

Automated and Scalable Data Management System for Genome Sequencing Data Automated and Scalable Data Management System for Genome Sequencing Data Michael Mueller NIHR Imperial BRC Informatics Facility Faculty of Medicine Hammersmith Hospital Campus Continuously falling costs

More information

How To Understand Cloud Computing

How To Understand Cloud Computing Overview of Cloud Computing (ENCS 691K Chapter 1) Roch Glitho, PhD Associate Professor and Canada Research Chair My URL - http://users.encs.concordia.ca/~glitho/ Overview of Cloud Computing Towards a definition

More information

Introduc)on of Pla/orm ISF. Weina Ma Weina.Ma@uoit.ca

Introduc)on of Pla/orm ISF. Weina Ma Weina.Ma@uoit.ca Introduc)on of Pla/orm ISF Weina Ma Weina.Ma@uoit.ca Agenda Pla/orm ISF Product Overview Pla/orm ISF Concepts & Terminologies Self- Service Applica)on Management Applica)on Example Deployment Examples

More information

Clouds vs Grids KHALID ELGAZZAR GOODWIN 531 ELGAZZAR@CS.QUEENSU.CA

Clouds vs Grids KHALID ELGAZZAR GOODWIN 531 ELGAZZAR@CS.QUEENSU.CA Clouds vs Grids KHALID ELGAZZAR GOODWIN 531 ELGAZZAR@CS.QUEENSU.CA [REF] I Foster, Y Zhao, I Raicu, S Lu, Cloud computing and grid computing 360-degree compared Grid Computing Environments Workshop, 2008.

More information

An Open Dynamic Big Data Driven Applica3on System Toolkit

An Open Dynamic Big Data Driven Applica3on System Toolkit An Open Dynamic Big Data Driven Applica3on System Toolkit Craig C. Douglas University of Wyoming and KAUST This research is supported in part by the Na3onal Science Founda3on and King Abdullah University

More information

Open Source Cloud Computing Management with OpenNebula

Open Source Cloud Computing Management with OpenNebula CloudCamp Campus Party July 2011, Valencia Open Source Cloud Computing Management with OpenNebula Javier Fontán Muiños dsa-research.org Distributed Systems Architecture Research Group Universidad Complutense

More information

An Advanced Performance Architecture for Salesforce Native Applications

An Advanced Performance Architecture for Salesforce Native Applications An Advanced Performance Architecture for Salesforce Native Applications TABLE OF CONTENTS Introduction............................................... 3 Salesforce in the Digital Transformation Landscape...............

More information

RED HAT ENTERPRISE VIRTUALIZATION

RED HAT ENTERPRISE VIRTUALIZATION Giuseppe Paterno' Solution Architect Jan 2010 Red Hat Milestones October 1994 Red Hat Linux June 2004 Red Hat Global File System August 2005 Red Hat Certificate System & Dir. Server April 2006 JBoss April

More information

CUMULUX WHICH CLOUD PLATFORM IS RIGHT FOR YOU? COMPARING CLOUD PLATFORMS. Review Business and Technology Series www.cumulux.com

CUMULUX WHICH CLOUD PLATFORM IS RIGHT FOR YOU? COMPARING CLOUD PLATFORMS. Review Business and Technology Series www.cumulux.com ` CUMULUX WHICH CLOUD PLATFORM IS RIGHT FOR YOU? COMPARING CLOUD PLATFORMS Review Business and Technology Series www.cumulux.com Table of Contents Cloud Computing Model...2 Impact on IT Management and

More information

Amazon Cloud Storage Options

Amazon Cloud Storage Options Amazon Cloud Storage Options Table of Contents 1. Overview of AWS Storage Options 02 2. Why you should use the AWS Storage 02 3. How to get Data into the AWS.03 4. Types of AWS Storage Options.03 5. Object

More information