Metadata for Data Discovery: The NERC Data Catalogue Service. Steve Donegan




- Introduction
- NERC, Science and Data Centres
- NERC Discovery Metadata
- The Data Catalogue Service
- NERC Data Services
- Case study: generating metadata and doing something useful with it!

- NERC is the main UK body for funding research, training and knowledge exchange in the environmental sciences
- Annual budget £388m (2011)
- Covers atmospheric, earth, terrestrial and aquatic sciences
- Research ships and aircraft, satellite technology

What sort of data do we deal with? A variety of environmental measurements, along with the results of model simulations

NERC Designated Data Centres
The UK's Natural Environment Research Council (NERC) funds eight data centres which between them have responsibility for the long-term management of NERC's environmental data holdings.

The role of the data centres
NERC funds research projects, which produce data. It is essential that these data are properly managed to ensure their long-term availability. NERC's network of data centres provide support and guidance in data management to those funded by NERC, are responsible for the long-term curation of data, and provide access to NERC's data holdings. The NERC Data Policy details their commitment to support the long-term management of data and also outlines the roles and responsibilities of all those involved in the collection and management of data. We are also involved in externally funded projects in informatics, e-science and domain-specific areas.

Changing and conflicting user demands
There is a tension between the requirements of different users.
Scientists / NERC:
- Want raw data in its original format
- Require long-term stewardship of data
- Want as much contextual detail as possible
Government Agencies / Knowledge Exchange:
- Use environmental information to drive policy making
- Prefer real-time data delivery
- Require derived products that address specific questions
- Need to synthesise data from many different sources in order to reach a decision
- Quality control is critical!

Legislation and technical changes
- Open standards for geospatial data and services promise a new level of interoperability between data providers
- The EU INSPIRE directive requires us to provide data discovery, view and download services
- INSPIRE is an Infrastructure for Spatial Information within Europe for the purposes of Community environmental policies and policies or activities which may have an impact on the environment
- As NERC data is within the UK public domain and many of its data holdings have a geospatial component, by law NERC must produce metadata that is compliant with the EU INSPIRE directive (http://inspire.jrc.ec.europa.eu)
- Data interoperability and data sharing are prime objectives for INSPIRE, and these are underpinned by a specification for the metadata used for data discovery within INSPIRE
- INSPIRE discovery metadata is based on the ISO19115/19119 Application Profile (metadata for geographic information), with a definition of the core metadata elements required for INSPIRE compliance

Discovery Metadata to satisfy all requirements
- NERC requires research/data centres to be able to generate, on demand, consistent discovery metadata describing NERC's data assets
- Compliance with this standard helps to ensure that NERC's data assets are consistently discoverable, and aids the generation and operation of services that utilise these assets across the NERC disciplines
- NERC metadata must also accommodate and comply with international standards and directives
- Metadata providers must have the capability to produce metadata conforming to this standard, under

Discovery Metadata to satisfy all requirements
NERC Data Management Advisory Group (DMAG)
- The ISO standards 19115 and 19119 define metadata schemas adequate for describing data resources held by NERC
- For communication purposes, the ISO19115/19119 metadata can be serialised and encoded as XML using the ISO standard 19139
- NERC produced a profile of ISO19115
But before official adoption... NERC SIS Group review
- MEDIN Discovery Metadata Standard (MEDIN: Marine Environment Data Information Network)
- Some NERC DDCs are MEDIN partners
- MEDIN is largely conformant with INSPIRE and Gemini2 but with specialisms for the marine community (e.g. SeaDataNet keywords)
- Decided to base the NERC standard on the MEDIN Discovery Standard but with exceptions/additions for NERC-specific areas (e.g. how do you define vertical extent for butterfly counts?)
- The MEDIN community has published Schematron and metadata tools to support the standard
- Datasets, Series and Services!
- Adopt straight UK Gemini?
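To make the ISO 19139 serialisation step concrete, the following is a minimal sketch in Python of encoding a few discovery fields as ISO 19139-style XML. The record dictionary and its values are invented for illustration, and the output is only a fragment: it would not validate against the full ISO 19139 schema or the MEDIN/NERC Schematron rules, which require many more elements.

```python
import xml.etree.ElementTree as ET

# Namespaces used by the ISO 19139 XML encoding of ISO 19115 metadata.
GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def char_string(parent, tag, text):
    """Append <gmd:tag><gco:CharacterString>text</gco:CharacterString></gmd:tag>."""
    elem = ET.SubElement(parent, f"{{{GMD}}}{tag}")
    ET.SubElement(elem, f"{{{GCO}}}CharacterString").text = text
    return elem


def to_iso19139_fragment(record):
    """Serialise a small (hypothetical) record dict as an ISO 19139-style fragment."""
    root = ET.Element(f"{{{GMD}}}MD_Metadata")
    char_string(root, "fileIdentifier", record["id"])
    ident = ET.SubElement(root, f"{{{GMD}}}identificationInfo")
    data_ident = ET.SubElement(ident, f"{{{GMD}}}MD_DataIdentification")
    citation = ET.SubElement(data_ident, f"{{{GMD}}}citation")
    ci_citation = ET.SubElement(citation, f"{{{GMD}}}CI_Citation")
    char_string(ci_citation, "title", record["title"])
    char_string(data_ident, "abstract", record["abstract"])
    return ET.tostring(root, encoding="unicode")


# Invented example record, not a real NERC dataset.
example_xml = to_iso19139_fragment({
    "id": "example-dataset-001",
    "title": "Example butterfly count survey",
    "abstract": "Illustrative record used to show the XML encoding.",
})
print(example_xml)
```

In practice a data centre would generate such records from its catalogue and then check them with the published Schematron before publication.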

The NERC Data Catalogue Service
- The NERC DCS aims to provide a searchable interface to published discovery records from NERC DDCs
- Provides the ability to conduct a simple text, geographic and/or temporal search
- Advanced search option allows structuring of complex queries: search for the term "ozone" but NOT if associated with the term "depletion"
- Results are returned with basic information rendered from the discovery metadata: links back to the DDC, further information, download service etc.
- Currently covers datasets and series
- Uses the NERC Vocabulary Service for added content/dissemination
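The "ozone but NOT depletion" query can be illustrated with a small Python sketch. This is not the DCS or DWS query syntax, which the slides do not show; it assumes that "associated with" means the excluded term occurs in the same record text, and the three records are invented.

```python
import re


def matches(text, include, exclude=()):
    """True if every include term and no exclude term occurs as a word in text."""
    words = set(re.findall(r"\w+", text.lower()))
    has_all = all(term.lower() in words for term in include)
    has_excluded = any(term.lower() in words for term in exclude)
    return has_all and not has_excluded


# Invented discovery records for illustration only.
records = [
    {"id": "a", "abstract": "Total column ozone measurements from ground stations"},
    {"id": "b", "abstract": "Model simulations of ozone depletion over Antarctica"},
    {"id": "c", "abstract": "Sea surface temperature climatology"},
]

hits = [r["id"] for r in records if matches(r["abstract"], ["ozone"], ["depletion"])]
print(hits)
```

Record "b" mentions ozone but is dropped because it also mentions depletion; record "c" never matches the included term.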

Data Services: Services need discovering too!

Developing a Portal
- NERC needed to replace the previous NERC Discovery Service, which was limited by its metadata content (GCMD DIF, interoperability issues, services etc.)
- Developed as part of the NERC DataGrid (NDG) activity, it consisted of a portal connected to a metadata catalogue, all located at NEODC/CEDA
- NERC SIS recommended not only adopting the MEDIN Discovery Standard but also using the existing MEDIN Discovery Portal and web service
- The MEDIN portal uses the Discovery Web Service (DWS) developed by NEODC/CEDA to search a metadata catalogue derived from discovery metadata harvested from data providers
  - Based on the previous-generation NERC Discovery Portal but adapted for ISO19139 rather than GCMD DIF
  - More powerful targeted keyword and text searches
  - Distributed architecture: the DWS runs on the catalogue at NEODC/CEDA whilst the portal is located at GeoData in Southampton
- The NERC Data Catalogue Service is adapted for NERC-style MEDIN records but with added targeted text search etc.
- The DWS/catalogue runs at NEODC/CEDA and the DCS portal at BODC

Developing a Portal (NERC model)
[Diagram components: Metadata Catalogue (PostgreSQL); Discovery Web Service (DWS); Data Providers Web Service (DPWS); harvesting routes OAI-PMH, OGC CSW and WAF; Metadata Providers]

Harvesting the metadata
OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting)
- Two roles: providers and harvesters
- A harvester takes the full XML metadata from a provider and returns a copy to its local environment
- Any metadata format can be served; however, Dublin Core must be provided to be OAI-PMH compliant
- Support for deleted records, detection of changed records and regular harvesting
- Works via HTTP
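The harvesting cycle described above can be sketched in Python using only the standard library. The verb `ListRecords`, the `metadataPrefix`, `from` and `resumptionToken` arguments, the `oai_dc` prefix and the `status="deleted"` header attribute come from the OAI-PMH 2.0 protocol; the provider URL, identifiers and the sample response are invented, and no network request is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None,
                     resumption_token=None):
    """Build a ListRecords request; a resumption token replaces all other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date  # only records changed since this date
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (active identifiers, deleted identifiers, resumption token or None)."""
    root = ET.fromstring(xml_text)
    active, deleted = [], []
    for header in root.iterfind(".//oai:record/oai:header", OAI_NS):
        identifier = header.findtext("oai:identifier", namespaces=OAI_NS)
        if header.get("status") == "deleted":
            deleted.append(identifier)
        else:
            active.append(identifier)
    token = root.findtext(".//oai:resumptionToken", namespaces=OAI_NS)
    return active, deleted, token or None


# Invented response from a hypothetical provider.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:dataset-1</identifier>
        <datestamp>2011-05-01</datestamp>
      </header>
      <metadata/>
    </record>
    <record>
      <header status="deleted">
        <identifier>oai:example.org:dataset-2</identifier>
        <datestamp>2011-06-01</datestamp>
      </header>
    </record>
    <resumptionToken>token-123</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://provider.example.org/oai", from_date="2011-05-01"))
print(parse_list_records(SAMPLE_RESPONSE))
```

A regular harvest would repeat the request with the returned resumption token until none is given, remove the deleted identifiers from the local catalogue, and use the `from` argument on the next run to pick up only changed records.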

Developing a Portal (MEDIN model)
[Diagram components: Metadata Catalogue (PostgreSQL); Discovery Web Service (DWS); Data Providers Web Service (DPWS); OGC CSW (GeoNetwork); harvesting routes OAI-PMH, OGC CSW and WAF; Metadata Providers]

Portal future developments
- UK Location is implementing the UK's response to INSPIRE
- All in-scope records must be published to the UK Location Portal via OGC CSW (Catalogue Service for the Web) or via WAF (Web Accessible Folder)
- CSW: the Catalogue Service defines common interfaces to discover, browse and query metadata about data, services and other potential resources
- Open source solution: GeoNetwork
- The MEDIN solution for compliance is to run a CSW in parallel to the MEDIN DWS with identical content
- The NERC solution is for all DDCs to replace OAI-PMH with a local GeoNetwork, with one core CSW that supports a discovery portal using a federated search
- The core CSW is also the publishing point to UK Location
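As an illustration of the CSW query interface, here is a minimal Python sketch that builds a CSW 2.0.2 `GetRecords` request in key-value form. The parameter names follow the OGC CSW 2.0.2 KVP binding; the endpoint URL is hypothetical, and the `AnyText` queryable in the CQL constraint is an assumption that depends on how a given catalogue is configured.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def csw_getrecords_url(endpoint, search_term, max_records=10):
    """Build a CSW 2.0.2 GetRecords KVP request for a free-text search."""
    # Double any single quotes so the term stays inside the CQL string literal.
    safe_term = search_term.replace("'", "''")
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "elementSetName": "summary",
        "resultType": "results",
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        # 'AnyText' is assumed to be a supported queryable on the target server.
        "constraint": f"AnyText like '%{safe_term}%'",
        "maxRecords": str(max_records),
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint; no request is sent here.
url = csw_getrecords_url("https://catalogue.example.org/csw", "ozone")
query = parse_qs(urlsplit(url).query)  # decode again to show what the server sees
print(url)
print(query["constraint"][0])
```

The same request could be sent to each data centre's CSW, which is what makes a federated search across catalogues possible.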

Developing a Portal (future NERC model)
[Diagram components: several Metadata Providers, each exposing an OGC CSW; a core OGC CSW (GeoNetwork) running federated searches across them]
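The federated search in this model can be sketched as follows. This is a simplified Python illustration, not the GeoNetwork implementation: each catalogue is represented by a hypothetical in-memory search function rather than a real CSW call, and duplicates are merged by record identifier, keeping the first copy seen.

```python
def make_catalogue(records):
    """Return a search function over an in-memory list of records (stand-in for a CSW)."""
    def search(term):
        return [r for r in records if term.lower() in r["title"].lower()]
    return search


def unavailable(term):
    """Stand-in for a catalogue that cannot be reached."""
    raise ConnectionError("catalogue not reachable")


def federated_search(catalogues, term):
    """Query every catalogue, merge hits by identifier, and report failures."""
    merged = {}
    failed = []
    for name, search in catalogues.items():
        try:
            results = search(term)
        except Exception:
            failed.append(name)  # one failing provider should not break the search
            continue
        for record in results:
            merged.setdefault(record["id"], {**record, "source": name})
    return list(merged.values()), failed


# Invented catalogues and records for illustration only.
catalogues = {
    "centre_a": make_catalogue([{"id": "1", "title": "Ozone profiles"},
                                {"id": "2", "title": "Cloud radar"}]),
    "centre_b": make_catalogue([{"id": "1", "title": "Ozone profiles"},
                                {"id": "3", "title": "Ocean ozone flux"}]),
    "centre_c": unavailable,
}

results, failed = federated_search(catalogues, "ozone")
print(results)
print(failed)
```

Tolerating a failed provider is a design choice in this sketch: the portal can still return partial results and flag which catalogues did not respond.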

CEDA Case Study
- CEDA: Centre for Environmental Data Archival
  - NERC Earth Observation Data Centre (NEODC)
  - British Atmospheric Data Centre (BADC)
  - UK Solar System Data Centre (UKSSDC)
- Located at STFC Rutherford Appleton Laboratory, Oxfordshire
- Actively participates in NERC e-infrastructure projects: NERC DataGrid, INSPIRE LMO, OGC, ISIC, and much, much more
- As a data centre, CEDA publishes to the NERC DCS, but it also runs the harvesting, catalogue and DWS operations supporting the portal (BODC)
- But how does a data centre generate metadata and get it published?

CEDA Metadata Catalogue
All of CEDA's data holdings are catalogued in a database according to a data model (MOLES2/3). This model captures various aspects of the data:
- What is it? (e.g. instrument, format, model, service)
- Where and when is it? (e.g. spatial coverage, date range/times)
- Who owns it / where did it come from? (e.g. who created the dataset? Are there restrictions on usage, such as UK only?)
- What can it do? (e.g. is it available in a visualisation service? Any legal aspects?)
- Any associated resources? (e.g. keyword or parameter names, links to the original data provider site, documentation, web service endpoints)

CEDA Metadata Catalogue
Information in the data catalogue is created by a combination of manual entry by data scientists and information taken from the data itself during ingestion and placement on the CEDA archive. Metadata in the catalogue is used for a variety of purposes:
- Provide a resource to generate metadata for external consumption, e.g. to aid data discovery or to allow data/CEDA services to be used in external resources (e.g. WFS, WMS)
- Provide an accurate, up-to-date description of each dataset and any related issues as a resource for the community
- Reference: allow citation of datasets (DOI)
- Dataset management

[Diagram: CEDA metadata publishing layer]
- Inputs: data suppliers, 3rd party data providers
- Archive and catalogue (MOLES, CEDA Info)
- XML generation: discovery XML, DataCite XML, CSML/WMS/WFS service metadata
- Publicly visible endpoints: OAI-PMH, OGC CSW, Web Accessible Folder (TBC)
- External users: NERC Catalogue Service, DataCite, UK Location Portal, Go-Geo, MEDIN Portal, INSPIRE
All use metadata from the CEDA metadata publishing layer.