Archive I. Metadata. 26. May 2015

Similar documents
How To Useuk Data Service

OpenAIRE Research Data Management Briefing paper

A Java Tool for Creating ISO/FGDC Geographic Metadata

Long Term Preservation of Earth Observation Space Data. Preservation Workflow

DATA CITATION. what you need to know

Metadata for Data Discovery: The NERC Data Catalogue Service. Steve Donegan

British Library DataCite Workshop University of Glasgow, 13 June 2014 Programme

Research Data Management Guide

Interagency Science Working Group. National Archives and Records Administration

Working with the British Library and DataCite A guide for Higher Education Institutions in the UK

data.bris: collecting and organising repository metadata, an institutional case study

Documenting the research life cycle: one data model, many products

Working with the British Library and DataCite Institutional Case Studies

Data Management Exercises

Research Data Management

ODUM INSTITUTE ARCHIVE SERVICES OVERVIEW IASSIST 2015

Research Data Archival Guidelines

UNINETT Sigma2 AS: architecture and functionality of the future national data infrastructure

INSTITUTIONALIZE METADATA BEFORE IT INSTITUTIONALIZES YOU

Best Practices for Data Management. RMACC HPC Symposium, 8/13/2014

REACCH PNA Data Management Plan

EPSRC Research Data Management Compliance Report

RESEARCH DATA MANAGEMENT POLICY

Archimedes Palimpsest Transcription Metadata Standard 22 October 2007 DRAFT

ESRC Research Data Policy

Action full title: Universal, mobile-centric and opportunistic communications architecture. Action acronym: UMOBILE

Information and documentation The Dublin Core metadata element set

How To Write An Nccwsc/Csc Data Management Plan

Towards research data cataloguing at Southampton using Microsoft SharePoint and EPrints: a progress report

Data documentation and metadata for data archiving and sharing. Data Management and Sharing workshop Vienna, April 2010

STORRE: Stirling Online Research Repository Policy for etheses

DataShare & Data Audit. Lessons Learned. Robin Rice. Digital Curation Practice, Promise and Prospects

Data Management Plans - How to Treat Digital Sources

LIBER Case Study: University of Oxford Research Data Management Infrastructure

Queensland recordkeeping metadata standard and guideline

Guidelines for filling in the Excel Template for Monitoring INSPIRE by the contributing authorities

Data Management Best Practices for Landscape Conservation Cooperatives Part 1: LCC Funded Science

How To Write A Blog Post On Globus

INTRODUCTION TO THE DATAVERSE NETWORK

How To Write An Inspire Directive

Data Management Plan. Name of Contractor. Name of project. Project Duration Start date : End: DMP Version. Date Amended, if any

Checklist for a Data Management Plan draft

Research Data Management PROJECT LIFECYCLE

Managing and Sharing research Data

NATIONAL CLIMATE CHANGE & WILDLIFE SCIENCE CENTER & CLIMATE SCIENCE CENTERS DATA MANAGEMENT PLAN GUIDANCE

HARNESSING DATA CENTRE EXPERTISE TO DRIVE FORWARD INSTITUTIONAL RESEARCH DATA MANAGEMENT

NERC Biodiversity and Ecosystem Service Sustainability (BESS) Data Management Strategy

Environment Canada Data Management Program. Paul Paciorek Corporate Services Branch May 7, 2014

Boulder Creek Critical Zone Observatory Data Management Plan

Solution for ETC programmes GENERAL CHARACTERISTICS

Creating a Data Management Plan for your Research

CatMDEdit Metadata editor

Digital Preservation. OAIS Reference Model

Data Publishing Workflows with Dataverse

European Data Infrastructure - EUDAT Data Services & Tools

Data Management Implementation Plan

Enhanced Research Data Management and Publication with Globus

Nevada NSF EPSCoR Track 1 Data Management Plan

The Data Management Plan with. Dataverse. Mercè Crosas, Ph.D. Director of Product Development

VGIS HANDBOOK PART 2 - STANDARDS SECTION D METADATA STANDARD

Data management plan

Globus Research Data Management: Introduction and Service Overview

Writing a Wellcome Trust Data Management & Sharing Plan

Data Management Considerations for the Data Life Cycle

The NERC Data Policy. atapolicy-guidance.pdf

Exercise 1 : Branding with Confidence

SRS BIO OPTICAL WORKFLOW

Service Road Map for ANDS Core Infrastructure and Applications Programs

The cross-disciplinary Roots of the British collaboration between scholars in humanities and

WHAT SHOULD NSF DATA MANAGEMENT PLANS LOOK LIKE

Functional Requirements for Digital Asset Management Project version /30/2006

Research Data Collection Data Management Plan

Making Data Citable The German DOI Registration Agency for Social and Economic Data: da ra

GeoNetwork User Manual

CIP s Open Data & Data Management Guidelines and Procedures

Second EUDAT Conference, October 2013 Data Management Plans and Certification Motivation: increasing importance of Data Management Planning

EndNote Beyond the Basics

LJMU Research Data Policy: information and guidance

Integrating Research Information: Requirements of Science Research

The ADS and SWORD-ARM: Deposit charges, costing tools and e-repository sustainability

CLARIN-NL Third Call: Closed Call

Research Data Management For Researchers

The Metadata in the Data Catalogue DBK

Twente Grants Week: Data management. Maarten van Bentum (Library & Archive)

Notes about possible technical criteria for evaluating institutional repository (IR) software

A Guide to the Research Data Service

SharePoint Document and Data Control

Earth Science Academic Archive

Research Data Management - The Essentials

Recent Developments at WDC Climate: Limitation of Long-term Archiving at DKRZ

An Introduction to Managing Research Data

Research Data Alliance: Current Activities and Expected Impact. SGBD Workshop, May 2014 Herman Stehouwer

Cite My Data M2M Service Technical Description

OPENGREY: HOW IT WORKS AND HOW IT IS USED

Managing Idaho s Landscapes for Ecosystem Services (MILES) Idaho EPSCoR Research Data Management Plan Version 5.0 (October 12, 2015)

Data Management Resources at UNC: The Carolina Digital Repository and Dataverse Network

Canadian National Research Data Repository Service. CC and CARL Partnership for a national platform for Research Data Management

Research Data Management Services. Katherine McNeill Social Sciences Librarians Boot Camp June 1, 2012

THE BRITISH LIBRARY. Unlocking The Value. The British Library s Collection Metadata Strategy Page 1 of 8

Linking raw data with scientific workflow and software repository: some early

Transcription:

Archive I Metadata 26. May 2015

2

Norstore Data Management Plan To successfully execute your research project you want to ensure the following three criteria are met over its entire lifecycle: You are able to collect your data You are able to use your data You are able to share your data with an intended audience Norstore are developing a data management plan template that will help you to address these criteria for norstore projects. Makes use of existing best practice (for example using the UK Digital Curation Centre template that synthesizes plans from UK research agencies and the US NSF). 3

Norstore Data Management Plan Template being designed along the lines of the phases of the project lifecycle. Each phase will address each of the three criteria Will help you to ensure the resources your project needs are available when needed. Will help in planning and provisioning of resources. 4

The importance of Metadata What is this? What was it used for? 5

Metadata Over the course of a project you generate and use metadata Essential to use the data correctly Different phases of the project require different sets of metadata 6

Metadata Can be divided into 3 types: Descriptive what the data is, how to use it, special features in the data, workflows, etc. Structural how the data is arranged (e.g. files in directory X are configuration files, in directory Y are documentation and in directory Z are the raw data), also includes the formats descriptions. Administrative covers information to manage the data such as checksums, rights information, significant properties etc. Most domains have complex metadata 7

Metadata Seeing Standards: A Visualisation of the Metadata Universe. J. Riley, D. Becker 8

Norstore Archive Metadata Creation Archive Reuse 9

Norstore Archive Metadata Many metadata schemes have Dublin Core either as a basis or have a strong overlap with Dublin Core. Dublin Core is an ISO standard. The standard has 15 terms, extended Dublin Core has more terms. The Norstore Archive uses Dublin Core as a basis. Additional metadata terms added that are not covered by DC, but are generic enough for all communities. OAI-PMH based on DC so automatically compliant. 10

Norstore Archive Metadata Descriptive Information Category Description Identifier Internal Identifier Journal Article Language Phase State Subject Title Administrative Information Access Rights Contributor Created Creator Data Manager License Lifetime Preservation Level Published on Publisher Rights Rights Holder Submitted Terms and Conditions for Deposit Structural Information File Checksum File Name File Size File Type Descriptive Information (optional) Bibliographic Citation Conforms to Comment Geolocation Label Project Provenance Source Temporal Coverage Bold terms are Dublin Core recommended terms. Top 3 boxes contain mandatory metadata. Terms in italics are automatically filled in by archive. Only ~14 terms to be defined by the user 11

Norstore Archive Metadata You may find many of the optional terms very useful: Geolocation used to define geographic locations (WGS84 and UTM supported). Temporal Coverage used to define durations. Bibliographic Citation how the dataset should be cited in follow-up research that uses this dataset. E.g. Authors/Rights Holders, Title, Year Published, DOI Provenance the history of the dataset (ie how it was created). Source the source dataset from which this dataset was derived. Conforms to which standards does the dataset follow. 12

Norstore Archive Metadata Norstore metadata designed to be as generic as possible. Sufficient to locate data and understand how to use the data. More detailed information can be contained in domain-specific catalogues. Can have domain specific catalogues use the DOI as a handle to the data (resolving the link will provide access to the data). Could think about the archive referencing an external domain specific catalogue. 13

Norstore and Domain Metadata Data Service Norstore Archive DOI Resolver Domain Metadata Service Metadata Service Domain Metadata catalogue can have DOI registered. Could then invoke DOI resolver to provide access to archive metadata and data. 14

How to fill in Norstore Metadata 15

Norstore Metadata Currently metadata must be supplied via the web interface. Metadata divided into 2 phases: Before uploading the data consists of mandatory metadata that is needed by the archive for managing the data (eg contact information, title of the dataset etc) After uploading the data consists of remaining mandatory descriptive information and optional information. There is a 3 month time limit to fill in metadata User will be reminded during this period of need to complete metadata. At the discretion of the Archive Manager the dataset may be deleted if the metadata remains incomplete after this limit. Typically metadata is completed within 2 weeks. 16

Tips for Norstore Metadata Try to avoid duplicating information. Think what information you would need in order to re-analyse your data. What applications, libraries, workflows, manuals would you need (versions etc)? Any other data that would be needed? Any other information (such as auxiliary files or databases) required? Are there any features or peculiarities of the data that are worth noting? Is the environment the data was collected in described? Date, location the data was collected. If using additional instruments the configuration of those instruments. 17

Tips for Norstore Metadata Try to use the description field to describe what the dataset is and how to use it. If you have a lot of documentation that covers this you could think about including it as part of the dataset and describe where to find it in the description. Try to make use of the Journal metadata to provide a reference to an article that describes the dataset. In this case your description can be more succinct. If your dataset has temporal or spatial information consider using the optional metadata to capture that information. 18

Tips for Norstore Metadata Links to external references with more information are good But, beware of durability Will the reference last the lifetime of the dataset? Beware of jargon or terminology Perhaps run your description by novice users to see if it s clear 19

Example Project to model deposition of Saharan sand in country X Project develops a model for the summer aerial transport of sand from Sahara to country X. The model consists of a C++ computer program that makes use of GNU Scientific Library and a series of input parameter files. The parameter files are structured ASCII files. The model output are structured ASCII files. The project also contains recordings of the actual amount of sand deposited in 20 regions in country X. The recordings are stored in structured ASCII files. 20

Example Metadata Description metadata should contain: What the dataset is, what project it is associated to, when it ran, assumptions in the model any notable features in the data. Layout of the data (ie directory structure what s where). This dataset consists of a model for the summer aerial deposition of Saharan sand in country X. The model has been developed for Summer 2014 deposition. The dataset is arranged in the following manner: src contains the model as C++ programs; input contains the sample input configuration files needed by the model; output contains the sample model output; data contains the sand deposition in soil samples taken in regions X1, X2, X20; doc contains detailed documentation on building, configuring and running the model as well as required libraries and documentation on interpreting the results. Use the Journal metadata to reference the article based on the dataset. Use the geospatial coverage to identify the country studied. Optionally use the temporal coverage if the studies span a range in time. 21

Norstore Metadata Plans Recognise that many projects have metadata catalogues. Ability to extract subset that matches some of the Norstore metadata terms would be good. Working on a REST API for the metadata catalogue. Currently looking at the Search, but can be extended to the ingest of metadata. Allows you to script the extraction and loading of some of the norstore metadata automatically. Useful for projects with many datasets. Implement metadata errata: Allow traceable corrections of metadata 22