FEDORA @ AWI Fedora User Meeting Copenhagen, Denmark 28 September, 2005

Similar documents
CLASS and Enterprise Solutions Rick Vizbulis. CLASS and Enterprise Solutions

The Arctic Observing Network and its Data Management Challenges Florence Fetterer (NSIDC/CIRES/CU), James A. Moore (NCAR/EOL), and the CADIS team

Metadata for Data Discovery: The NERC Data Catalogue Service. Steve Donegan

EXPLORING AND SHARING GEOSPATIAL INFORMATION THROUGH MYGDI EXPLORER

- a Humanities Asset Management System. Georg Vogeler & Martina Semlak

Conceptualizing Policy-Driven Repository Interoperability (PoDRI) Using irods and Fedora

Notes about possible technical criteria for evaluating institutional repository (IR) software

Building integration environment based on OAI-PMH protocol. Novytskyi Oleksandr Institute of Software Systems NAS Ukraine

E-Content Service Group Virtual Meeting. Digital Preservation: How to Get Started

Interagency Science Working Group. National Archives and Records Administration

The Czech Digital Library and Tools for the Management of Complex Digitization Processes

Ex Libris Rosetta: A Digital Preservation System Product Description

EUDAT - Open Data Services for Research

2009 ikeep Ltd, Morgenstrasse 129, CH-3018 Bern, Switzerland (

PAPER Data retrieval in the PURE CRIS project at 9 universities

CDI/THREDDS Interoperability: the SeaDataNet developments. P. Mazzetti 1,2, S. Nativi 1,2, 1. CNR-IMAA; 2. PIN-UNIFI

The NERC DataGrid (NDG)

Digital Assets Repository 3.0. PASIG User Group Conference Noha Adly Bibliotheca Alexandrina

Adding Robust Digital Asset Management to Oracle s Storage Archive Manager (SAM)

DSpace: An Institutional Repository from the MIT Libraries and Hewlett Packard Laboratories

GeoNetwork, The Open Source Solution for the interoperable management of geospatial metadata

Product Navigator User Guide

DIVER (Data Integration Visualization Exploration and Reporting)

GeoNetwork, The Open Source Solution for the interoperable management of geospatial metadata

Institute of Computational Modeling SB RAS

The ORIENTGATE data platform

CERN Document Server

Implementing an Integrated Digital Asset Management System: FEDORA and OAIS in Context

Building An Institutional Repository With DSpace

Long-term Archiving of Relational Databases with Chronos

Taking full advantage of the medium does also mean that publications can be updated and the changes being visible to all online readers immediately.

OPENGREY: HOW IT WORKS AND HOW IT IS USED

Status of the World Radiation Monitoring Center

Canadian National Research Data Repository Service. CC and CARL Partnership for a national platform for Research Data Management

Evolving Infrastructure: Growth and Evolution of Spatial Portals

DATA ACCESS AT EUMETSAT

WHY DIGITAL ASSET MANAGEMENT? WHY ISLANDORA?

Invenio: A Modern Digital Library for Grey Literature

Marine Network for Integrated Data Access (MaNIDA) Portal Deutsche Meeresforschung

Assessing a Scientific Data Center as a Trustworthy Digital Repository

MultiMimsy database extractions and OAI repositories at the Museum of London

dati.culturaitalia.it a Pilot Project of CulturaItalia dedicated to Linked Open Data

Geospatial Data Stewardship at an Interdisciplinary Data Center

The biggest challenges of Life Sciences companies today. Comply or Perish: Maintaining 21 CFR Part 11 Compliance

Preservation and Dissemination Policy of the LISS Data Archive

REACCH PNA Data Management Plan

Service Cloud for information retrieval from multiple origins

Implementing an Institutional Repository for Digital Archive Communities: Experiences from National Taiwan University

Functional Requirements for Digital Asset Management Project version /30/2006

OCLC CONTENTdm. Geri Ingram Community Manager. Overview. Spring 2015 CONTENTdm User Conference Goucher College Baltimore MD May 27, 2015

EED Task Order. Contract: NNG10HP02C Contractor: Raytheon Task Type:

How To Harvest For The Agnic Portal

The Rutgers Workflow Management System. Workflow Management System Defined. The New Jersey Digital Highway

Nevada NSF EPSCoR Track 1 Data Management Plan

OCLC CONTENTdm and the WorldCat Digital Collection Gateway Overview

James Hardiman Library. Digital Scholarship Enablement Strategy

Lecture 8. Online GIS

THE HELMHOLTZ INVENIO REPOSITORY PROJECT :

Institutional Repositories: Staff and Skills Set

How To Monitor Radiation

SAP BW Connector for BIRT Technical Overview

European Data Infrastructure - EUDAT Data Services & Tools

New Developments in Data Sharing, Remote Access, Secure Data, and Documentation at the Cornell Institute for Social and Economic Research (CISER)

The FAO Open Archive: Enhancing Access to FAO Publications Using International Standards and Exchange Protocols

Open Source Software for Use in Learning Object Repositories. A Review and Assessment

All You Wanted To Know About the Management of Digital Resources in Alma

Enhancing the Quality and Trust of Citizen Science Data. Abdul Alabri eresearch Lab School of ITEE, UQ

Data Management in Science and the Legacy of the International Polar Year

data.bris: collecting and organising repository metadata, an institutional case study

NASA s Big Data Challenges in Climate Science

Data Management using irods

Transcription:

Photo: L. Tadday FEDORA @ AWI Fedora User Meeting Copenhagen, Denmark 28 September, 2005-1- Ana Ana Macario, Macario, Computer Center Center Alfred Wegener Alfred Wegener Institute Institute, for Polar Bremerhaven, and Marine Research Germany European Fedora User Meeting, Germany Copenhagen, Denmark, 2005-09-28

Overview AWI and its research scope SOA at AWI Rationale for choosing FEDORA Long-term issues -2-

About AWI 1980 Establishment of the institute To date Mastertitelformat in Bremerhaven a foundation bearbeiten under public law; AWI is one out 15 centers belonging to Helmholtz Society - Budget: 103 Mill. Euro - 800 Employees Funding - 90% Federal Ministry of Education and Research (BMBF) - 8% Bremen state - 1% Brandenburg and Schleswig-Holstein states - external funds -3-

Our mission Mastertitelformat Wadden Sea bearbeiten Station Sylt Biologische Anstalt Helgoland Alfred-Wegener-Institut für Polar- und Meeresforschung Bremerhaven To contribute to polar and marine research in order to advance insights into the changeability of the global environment and the earth system Research Unit Potsdam -4-

Research platforms Primary data: observations acquired in diverse research platforms, long-time series monitoring (observatories) numerical models lab. experiments photographs, maps/charts Publications Events Intelectual property rights Technology transfer -5-

Simplified Overview (2004) ISO 19115 DublinCore Relational Databases PANGAEA/WDC-Mare Meteorology,Oceanography Diatom collections GIS, Polarstern expeditions Backups Middleware Services Internet2/ eduperson eduorg DublinCore AuthN&AuthZ Directory People, Organizational Publications Events Technology transfer Expeditions Backups Examples: Directory services MapServer Examples: Web-based interfaces for searching primary datasets, publications, expeditions, etc File and Storage systems Publications full-text Model runs Large datasets -6- Backups

In practice Staging Publication Versionning and trace-ability relevant to scientists (data calibration, validation, processing, etc) Distributed data storage Role tailored access policy to assure data rights Spatial, temporal and thematic search/visualization (GIS mapping services) PI turns in post-print Fedora as active workspace PI removes data access restrictions Long-term archival of qualitycontrolled digital objects in IR IR exposed via OAI-PMH and SOAP Export functionality to international agencies (GCMD, NGDC, NOAA, GBIF, etc) -7-

Why AWI chose to test FEDORA? Flexible, extensible digital object model Open source; good documentation and tutorials Allows for metadata description other than Dublin Core record; relevant for geo-referenced objects (ISO 19115), bio-diversity objects (Darwin Core), objects of type people (Internet2/eduPerson), organizational units (Internet2/eduOrg),etc Able to distribute load and object storage among several IR instances ( Virtual Repository concept) Standards compliant: XML storage, OAI-PMH and web services -8-

Why AWI chose to test FEDORA? cont. Promising scalability; Fedora@AWI currently archives 15,000 Mastertitelformat objects bearbeiten Object preservation through content versionning; includes audit trail record for preserving event history XML ingest/export assures interoperability with existing in house information systems -9-

Simplified Overview (2005) Backend Directory & File systems Publications Events Technology transfer People Organizational Units 15,000 objects Backups FOXML ingest Fedora Repository System Manage soap Access soap Search soap OAI Provider http Frontend Sybase Relational PANGAEA/WDC- MARE 245,000 objects Backups WDC-specific XML Sybase BLOBs PANGAEA/WDC-MARE Search soap OAI Provider http -10- OAI Harvester (PKP)

SOAP client -11-

SOAP client cont. -12-

SOAP client cont. -13-

A few technical remarks on Fedora 2.0... Web services APIs are great; suggested improvements: - findobjects: browsing list backwards is not possible yet, totalnumberofresults is missing - adddatastream: file uploads: could it be done with SOAP-attachments? Timestamp resolution in miliseconds has raised problems in conformance tests under www.openarchives.org DeletedRecords set to Transient in order to allow for incremental harvesting by modified date -14-

Next steps... Set up new services: naming, full-text indexing & search, large-scale Mastertitelformat content ingestion bearbeiten (bulk load) together with metadata Metadata transformation services as disseminator relevant for data supply to external service providers (e.g., NGDC, GCMD, NOAA, GBIF) Set up collections (and respective granularity policies) - relevant for object-to-object relationship metadata -15-

DC-hardwired relation OAI-PMH identifier DOI Resource Item Dublin Core Descriptive metadata Pangaeaspecific Descriptive + Administrative metadata Dataset-to-Publication relationship metadata should be expressed in RDF/XML and placed in the ISO 19115 Descriptive + Administrative metadata OAI-PMH records Relations datastream DC metadata <dc.source> locator for content <dc.relation> locator for publication(s) -16-

Testing triple store query performance Backend Directory & File systems People Organizational Units Publications Events Technology Transfer 15,000 records Sybase Relational PANGAEA/WDC- MARE 245,000 records Backups Backups FOXML ingest 2006: FOXML ingest Pangaea-XML Fedora Repository System We need the XACML-based module in order to add live data! Sybase BLOBs PANGAEA/WDC-MARE Manage http/soap Access http/soap Search http/soap OAI Provider http Search http/soap OAI Provider http -17- Frontend OAI Harvester (PKP)

Long-term issues for AWI Benchmarking for large number of files; we fear scalability breakpoint Mastertitelformat related to the bearbeiten size of the filesystem-based LLStorage area Out-of-box web-based client relevant for acceptance by other Helmholtz centers Fine-grained access control policies and Shibboleth based AuthN relevant in DataGRID context Support for sets -18-

Long-term issues for AWI cont. Federation model Collaboration Mastertitelformat and support bearbeiten infra-structure - disseminators for specific visualizations services (e.g. NetCDF data and LiveAcessServer, GIS data and OpenMapServer); relevant for DataGRID - ECLIPSE project to facilitate plug-in development? - Google strategy - Seminars, tutorials for advanced FEDORA users -19-

Photo: L. Tadday hanks for your attention! fedora-admin@awi-bremerhaven.de http://www.awi-bremerhaven.de http://web.awi-bremerhaven.de/fedora/oai -20- Ana Ana Macario, Macario, Computer Center Center Alfred Wegener Alfred Wegener Institute Institute, for Polar Bremerhaven, and Marine Research Germany European Fedora User Meeting, Germany Copenhagen, Denmark, 2005-09-28