idigbio Technology, Cloud and Appliances

Similar documents
Digitization in the Pacific. Larry M. Page PD, idigbio Curator, FLMNH

EXTENDING SINGLE SIGN-ON TO AMAZON WEB SERVICES

Cloud Computing. Implementation, Management, and Security. John W. Rittinghouse James F. Ransome

AppDev OnDemand Cloud Computing Learning Library

Cloud Computing and Big Data What Technical Writers Need to Know

Envirofacts API Cory Wagner, US EPA

Infrastructure as a Service

Orbiter Series Service Oriented Architecture Applications

An enterprise- grade cloud management platform that enables on- demand, self- service IT operating models for Global 2000 enterprises

Leveraging Cloud-Based Mapping Solutions

Portal for ArcGIS. Satish Sankaran Robert Kircher

XSEDE Overview John Towns

Cloud Computing for Architects

Course 10978A Introduction to Azure for Developers

SharePoint & Azure: Digital Asset Management

Petroleum Web Applications to Support your Business. David Jacob & Vanessa Ramirez Esri Natural Resources Team

Public Cloud Offerings and Private Cloud Options. Week 2 Lecture 4. M. Ali Babar

CUMULUX WHICH CLOUD PLATFORM IS RIGHT FOR YOU? COMPARING CLOUD PLATFORMS. Review Business and Technology Series

APP DEVELOPMENT ON THE CLOUD MADE EASY WITH PAAS

Taking Big Data to the Cloud. Enabling cloud computing & storage for big data applications with on-demand, high-speed transport WHITE PAPER

Extend and Enhance AD FS

ORACLE APPLICATION EXPRESS 5.0

Cloud Computing Deja Vu

GigaSpaces Real-Time Analytics for Big Data

Windows Azure platform What is in it for you? Dominick Baier Christian Weyer

Assignment # 1 (Cloud Computing Security)

API Architecture. for the Data Interoperability at OSU initiative

ITP 342 Mobile App Development. APIs

MS 10978A Introduction to Azure for Developers

UForge 3.4 Release Notes

How To Compare Cloud Computing To Cloud Platforms And Cloud Computing

Introduction to Azure: Microsoft s Cloud OS

Federated single sign-on (SSO) and identity management. Secure mobile access. Social identity integration. Automated user provisioning.

How To Create An E Signature System On Cloud On Windows Azure (Windows) And A Client (For A Large Logistics Organization)

Building COBOL applications for Microsoft Azure. Jim Lane Senior Solution Engineer

CLOUD COMPUTING & WINDOWS AZURE

This module provides an overview of service and cloud technologies using the Microsoft.NET Framework and the Windows Azure cloud.

Google Cloud Platform The basics

Easily Connect, Control, Manage, and Monitor All of Your Devices with Nivis Cloud NOC

Introduction to the Cloud OS Windows Azure Overview Visual Studio Tooling for Windows Azure Scenarios: Dev/Test Web Mobile Hybrid

Deploy. Friction-free self-service BI solutions for everyone Scalable analytics on a modern architecture

The Challenges of Integrating Structured and Unstructured Data

Aspera Direct-to-Cloud Storage WHITE PAPER

How To Extend An Enterprise Bio Solution

Competitive Comparison Between Microsoft and VMware Cloud Computing Solutions

Cloud-based Data Logging, Monitoring and Analysis

How To Manage Your Digital Assets On A Computer Or Tablet Device

WebSphere Integration Solutions. IBM Day Minsk Anton Litvinov WebSphere Connectivity Professional Central Eastern Europe

Cisco Intelligent Automation for Cloud

Michelle Metzger TLG Learning. Support:

Visualize your World. Democratization i of Geographic Data

Databases & Data Infrastructure. Kerstin Lehnert

ediscovery and Search of Enterprise Data in the Cloud

A Survey on Cloud Storage Systems

Developing Microsoft Azure Solutions 20532B; 5 Days, Instructor-led

Cloud computing - Architecting in the cloud

Architecture Overview

2) Xen Hypervisor 3) UEC

WHITEPAPER. SECUREAUTH 2-FACTOR AS A SERVICE 2FaaS

Certified Cloud Computing Professional VS-1067

How To Understand Cloud Computing

IDEAL INSTITITE OF MANAGEMENT AND TECHNOLOGY In association with IIT MADRAS Presents SAARANG 2015 National Level CLOUD COMPUTING Championship

University of Messina, Italy

Jitterbit Technical Overview : Microsoft Dynamics CRM

Adobe ColdFusion 11 Enterprise Edition

HP Converged Cloud Cloud Platform Overview. Shane Pearson Vice President, Portfolio & Product Management

Karl Lum Partner, LabKey Software Evolution of Connectivity in LabKey Server

ITP 140 Mobile Technologies. Mobile Topics

Contents. Overview 1 SENTINET

fpafi/tl enterprise Microsoft Silverlight 5 and Windows Azure Enterprise Integration Silverlight Enterprise Applications on the Windows

Solution Showcase Session. Enterprise 2.0 Computing Services

Harnessing the Power of the Microsoft Cloud for Deep Data Analytics

Technical. Overview. ~ a ~ irods version 4.x

StruxureWare TM Center Expert. Data

Composite Data Virtualization Composite Data Virtualization And NOSQL Data Stores

Connected Product Maturity Model

About Client. Business Need

The governance IT needs Easy user adoption Trusted Managed File Transfer solutions

SPT2013: Developing Solutions with. SharePoint DAYS AUDIENCE FORMAT COURSE DESCRIPTION STUDENT PREREQUISITES

NCTA Cloud Operations

Introduction to IBM Worklight Mobile Platform

Cloud Computing Trends

Users VM A A A. Application. Compute/Storage/Network. VM Virtual Machine. On-Premises Data Center

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes

New Features in Neuron ESB 2.6

July 2016 Price List

SharePoint Comparison of Features

Using Social Networking Sites as a Platform for E-Learning

MENDIX FOR MOBILE APP DEVELOPMENT WHITE PAPER

Deploying ArcGIS for Server Using Esri Managed Services

The SparkWeave Private Cloud & Secure Collaboration Suite. Core Features

Enterprise 2.0 and SharePoint 2010

Cloud Courses Description

Data Lab Operations Concepts

Gladinet Cloud Access Solution Simple, Secure Access to Online Storage

Cloud Computing. Chapter 2 Software as a Service (SaaS)

Storing and Processing Sensor Networks Data in Public Clouds

Single Sign On. SSO & ID Management for Web and Mobile Applications

The Cloud as a Computing Platform: Options for the Enterprise

Taxi Service Design Description

Transcription:

idigbio Technology, Cloud and Appliances Jose Fortes (on behalf of the idigbio IT team) idigbio External Advisory Board Meeting 2012 (Project Year 1) Supported by NSF Award EF-1115210

CI Stakeholders ALA GBIF EOL BISON iplant Museums National/Global Data Aggregators Domain Data Producers idigbio TCNs Collectors Infrastructure Providers Amazon WS Amazon Turk Google DataONE TCNs Microsoft Azure Data Conservancy Researchers Teachers Citizens TCNs Domain Data Consumers Government Domain Service Providers Mapping TCNs Georeferencing Imaging services Data quality NESCent Translation OCR iplant 2

Stakeholders APIs ALA GBIF EOL BISON Researchers Teachers Citizens Museums National/Global Data Aggregators TCNs Updates Notification Domain Data Consumers Government Domain-level data Domain data Domain Data Producers idigbio Query results Updates Notification Usage track Customer Requests TCNs Collectors BLOBs Appliances Domain Service Providers Mapping Processed data Infrastructure Providers TCNs Amazon WS Amazon Turk Google DataONE TCNs Microsoft Azure Data Conservancy Georeferencing Imaging services Data quality NESCent Translation OCR iplant 3

Interface Model for idigbio and TCNs TCNs SQL UTF-8... REST WS SAML WS-I TAPIR... TDWG XML Archiving Learning Modules Virtual Appliances Data Collections Structured Data Services Machines Wiki Storage idigbio + Resources Workshop Resources Non-structured Data Services Networking Workflow Engines Geographical Mapping Taxonomic Validation Collaboration Tools Data Conversion TCP OCCIWG HTTP RDF JPEG2000 X.509 OpenID XMPP ODBC National History Museums Federal Collections Applied Innovations Microsoft Azure Amazon EC2/S3 Google Apps Microsoft Live Google App Engine iplant NCBI EOL LifeMapper ALA XSEDE TCNs DataONE Academic Clouds NESCent Infrastructure Providers, National/Global Data Aggregators, Domain Service Providers, Domain Data Consumers 4

Building the idigbio Cloud Cloud-based strategy Providing useful services/apis (programmatic and web-based) Federated scalable object storage and information processing Digitization-oriented virtual appliances Reliance on standards, proven solutions and sustainable software Continuous consultation with stakeholders Surveys, workgroups, summit/workshops, person-to-person

Unique UF+FSU record Track record of building cyberinfrastructure PUNCH and In-VIGO Nanohub, Netcare, In-VIGOBlast Morphbank AFRESH Telecenter Archer

Keeping our eyes on the ball Common/frequent needs: archival storage, server hosting, feedback on the data, data intensive transformations 10-year tsunami of requirements: from being on Facebook to multilingual search-and-compute across multiple data sets 7

Evolution of idigbio capabilities Data ingestion Data access, provision and visualization Provide and enable data feedback Data linking and federation Process and visualize integrated data Time Q3/2012 Q3/2013 Q3/2014 Q3/2015 Increasing storage and server hosting in support of the above Increasing number of appliances in support of the above Web site for interaction with public, community, education and above 8

Textual data Internet access o JSON document database o Data ingestion via DwC-a files o Get / Set API API Gateway Image Data o Internet-accessible object storage o Upload appliance o Limited access to low-level APIs Textual Data (RIAK) Image Data (SWIFT)

Textual Data o JSON document database o Data Ingestion via DwC-a files o Rich RESTful API Image Data o Web-accessible object storage o Upload appliance o Fully abstracted storage Indexing and Search Textual o Extract EXIF data from images Data Filter o Limited but useful set of indexes (RIAK) Set o Intuitive search UI Query o Search available via API interfa Portal ce o Consumes and interfaces text, image and search APIs (minimal server side code) o Web-based mapping - client side javascript limits useable record count to about 50k records at a time. Internet access idigbio Portal API Gateway EXIF extrac tion Image Data (SWIFT)

Virtual Appliances in idigbio Packaging of software and dependences in virtual machines End user/desktop (e.g. VMware, Virtualbox) Infrastructure-as-a-Service clouds (e.g. OpenStack) Enhance user experience, facilitate integration with cloud Image ingestion appliances (short term) Batch upload of images from a local storage to cloud Generate GUID/URLs for later processing Reliable transfers using cloud APIs (e.g. Swift/iDigBio) Post-processing appliances (OCR tools; end-user or batch) Geo-referencing appliances (Training/verification) Research appliances (Data-intensive/batch workflows)

Archer cyber-infrastructure Custom appliance image for computer architecture community Hundreds of distributed compute/routers nodes 24/7 operation, 650+ cores Job scheduling across participating institutions

Now: appliance proposal process By users/developers through the idigbio Web portal Requirements demonstrates usage/buy-in, software license, documentation, etc Queue of appliances for integration idigbio will prioritize and work with developers Leverage expertise in appliance development Focus on images that users can download and run on VMware, Virtualbox Application, in addition to appliance, if applicable/desirable

Short term Facilitate data ingestion, interface with idigbio Tools identified by community in workshops/groups Web-based UI Web server Ingestion appliance Cloud client File interface Batch upload, Cloud APIs /1/100.tif /1/101.tif GUID1 GUID2 Images captured (e.g. HD/flash media) /images/1/100.tif /1/101.tif /2/200.tif idigbio object Storage cloud (Swift)

Medium-term Marketplace End users Users/ Developers Community appliances idigbio appliances Proposals idigbio Portal idigbio Personnel

Long-term information processing Users/ Developers End users Community appliances idigbio Portal idigbio Personnel Specimen Database

Summary idigbio cloud Service-oriented standards-based cyberinfrastructure focused on the ADBC community needs Scalable data management and information processing using standard interfaces, data formats, protocols, tools Toolboxes as appliances Evolving collection of community-selected tools Built-in interfaces for effortless idigbio integration Embedded best practices and standards in biocollections work Software re-use when open-source, well maintained, manageable, sustainable and efficient to re-purpose Feedback and suggestions welcome fortes@ufl.edu and Contacts at idigbio.org

Acknowledgments National Science Foundation Judith Skog and Anne Maglia IDigBio team at University of Florida and Florida State University 19

Extras 20

idigbio IT Vision Cyberinfrastructure to enable the collaborative creation, integration and management of digitized biocollections, their use in scientific research, education and outreach Visible as a collection of persistent Internet-accessible services, data and resources For biocollection producers For biocollection consumers For biocollection service providers For cyberinfrastructure providers For national/global data aggregators