data.bris: collecting and organising repository metadata, an institutional case study




Describe, disseminate, discover: metadata for effective data citation. DataCite workshop, no. 2.
data.bris: collecting and organising repository metadata, an institutional case study
David Boyd, data.bris project

data.bris context
- Funded by JISC, one of several RDM infrastructure projects, October 2011 to March 2013
- Central aim: to establish a research data repository service
- Service to be piloted in the Faculty of Arts (School of Arts; School of Humanities; School of Modern Languages)
- Develop a business plan to extend the model across the University
- Building on the investment already made in research data storage through the Research Data Storage Facility (RDSF)

Repository service: some key requirements
- Must be fit for purpose and meet the needs of the researcher
- Must fit within the wider University-wide support service structure, i.e., successfully interface with existing systems and processes, e.g., the Research Data Storage Facility and the new Research Information System (RIS)
- Must be able to be embedded in future systems, without conflicting with other University systems

6 July 2012
Research Data Storage Facility (RDSF)
- Bristol has the infrastructure for research data storage, but needs some way to publish the data held there
- Employees of the University complete an application form to use the Facility
- The application form asks for details about the project and the data being created
- A Data Steward is held responsible for the proper stewardship of the data stored in the Facility
- The existing service offers an upload facility by simply mapping storage to individual machines around the campus
- Provision of metadata is not mandatory

data.bris architecture
Some of the things we want:
- A deposit facility that will allow a user to select a set of files from the RDSF for deposit and ultimately wider publication
- The ability to document that deposit in some detail. The metadata will be manually entered, but also gathered from other institutional systems and mechanically derived from the dataset itself
- A process of assigning a DOI to the dataset and making it publicly available
- To create and manage records in the University RIS

Metadata: key requirements
- Generate metadata that is of good quality and enables discovery, retrieval, and sharing of data
- Establish a process for collecting metadata that is lightweight, efficient, and where possible automated
- Provide depositors with a standard citation for their dataset: identify a minimum set of metadata for the purposes of data citation, including a persistent identifier
- Generate metadata that is interoperable and can be shared across different systems, i.e., standards based

Deposit process: collecting metadata manually
- In-house system that interacts directly with the user's RDSF file share
- Lightweight approach: encourage the creators of data to deposit by developing an intuitive, simple browser-based user interface
- Develop efficient, streamlined procedures for manual data entry and metadata creation
- Engage researchers with the repository deposit process, aiming to embed it within their working cycle and so further encourage deposit

Deposit facility: project area

Deposit facility: project area with data contents

Deposit facility: new deposit (initial information)

Deposit facility: record created

Deposit facility: choosing content

Deposit facility: data deposited

Deposit facility: data published

Generating and capturing metadata automatically
- Import and integrate metadata held in other University systems into the data deposit process where relevant, e.g.:
  - Title of data; names of data creators; name of research project (extracted and re-used from the original RDSF registration form)
  - Staff profile information from central administrative systems
- Automatically generate metadata where relevant, e.g., identifier (DOI), publisher name
- Apache Tika: automatically extract metadata from data files where relevant, e.g., when created, who created it, when last updated, file size, file extension, etc.
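As a rough illustration of mechanically derived metadata, the sketch below reads a few technical properties straight from the file system using only the Python standard library. It is a stand-in for the idea, not Apache Tika and not the data.bris implementation: the function name and the field names are assumptions, and properties such as the creator or creation date, which a content-analysis tool would read from inside the file, are deliberately left out.

```python
import os
import tempfile
from datetime import datetime, timezone


def derive_file_metadata(path):
    """Return basic technical metadata for one file (hypothetical field names)."""
    stat = os.stat(path)
    return {
        "file_name": os.path.basename(path),
        # Extension without the leading dot, lower-cased for consistency.
        "file_extension": os.path.splitext(path)[1].lstrip(".").lower(),
        "file_size_bytes": stat.st_size,
        # Last-modified time as an ISO 8601 timestamp in UTC.
        "last_modified": datetime.fromtimestamp(
            stat.st_mtime, tz=timezone.utc
        ).isoformat(),
    }


if __name__ == "__main__":
    # Demonstrate on a throwaway file so the example is self-contained.
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "survey.CSV")
        with open(sample, "wb") as fh:
            fh.write(b"id,value\n1,42\n")
        print(derive_file_metadata(sample))
```

Values derived this way could pre-fill the deposit form, leaving the depositor to supply only the descriptive fields that cannot be computed.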

Data citation
- Provide researchers with a sample citation (and persistent identifier) for their dataset, one that can be used in academic publications. This enables the impact of data to be tracked
- Implement a minimum (mandatory) set of metadata whose elements can be used for citation purposes
- The DataCite recommended format, with a mapping to Dublin Core qualified terms, is being used as a point of reference, i.e., Name of creator/s (PublicationYear): Title. Publisher. Identifier (DOI).
- Aim to use the DataCite Metadata Store service for minting DOIs and registering the associated metadata
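The citation pattern quoted above, Name of creator/s (PublicationYear): Title. Publisher. Identifier, can be assembled from the mandatory elements with a few lines of code. The following is a minimal sketch, not the data.bris code: the function name, the "; " separator between creators, and the sample values (including the DOI) are all hypothetical.

```python
def format_citation(creators, publication_year, title, publisher, identifier):
    """Build a citation string in the pattern
    Creator(s) (PublicationYear): Title. Publisher. Identifier
    """
    fields = {
        "creators": creators,
        "publication_year": publication_year,
        "title": title,
        "publisher": publisher,
        "identifier": identifier,
    }
    # Every element is mandatory: refuse to build a citation with gaps.
    missing = [name for name, value in fields.items() if not value]
    if missing:
        raise ValueError("missing mandatory metadata: " + ", ".join(missing))
    # Joining creators with "; " is an assumption made for this sketch.
    names = "; ".join(creators)
    return f"{names} ({publication_year}): {title}. {publisher}. {identifier}"


if __name__ == "__main__":
    # All values below are invented for illustration.
    print(
        format_citation(
            ["Smith, J.", "Jones, A."],
            2012,
            "Example dataset",
            "University of Bristol",
            "doi:10.1234/example",
        )
    )
```

Raising an error when an element is missing mirrors the idea of a minimum mandatory set: a record that cannot produce a full citation is not yet ready for publication.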

Metadata interoperability and implementation
- Use the Dublin Core (DC) element set as a basis for defining and structuring metadata. DC is well understood, widely used, and highly interoperable
- Use of DC terms follows the Resource Description Framework (RDF) model
- Implementation: provide serialisation in RDF/XML, Turtle, JSON and XML. May also provide RDF embedded in HTML
- Integration with University RIS...
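To make the serialisation point concrete, the sketch below writes one Dublin Core record out as JSON and as plain XML using the Python standard library. It covers only two of the formats listed and is not the data.bris implementation: the record layout, the `metadata` wrapper element, and the sample values are assumptions; only the Dublin Core element namespace URI is taken from the standard.

```python
import json
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def to_json(record):
    """Serialise a record (DC term -> list of values) as JSON."""
    return json.dumps(record, indent=2, sort_keys=True)


def to_xml(record):
    """Serialise the same record as XML with dc:-prefixed elements."""
    # The <metadata> wrapper element is a hypothetical container.
    root = ET.Element("metadata")
    for term, values in record.items():
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{term}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Sample values are invented for illustration.
    record = {
        "title": ["Example dataset"],
        "creator": ["Smith, J.", "Jones, A."],
        "publisher": ["University of Bristol"],
        "identifier": ["doi:10.1234/example"],
    }
    print(to_json(record))
    print(to_xml(record))
```

Keeping a single in-memory record and generating each format from it means that the serialisations cannot drift apart.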

Research Information System (RIS)
- The University is currently implementing a new research information system (RIS) that aims to provide a complete and definitive record of researchers, research outputs, grants and projects
- The RIS will also serve as the full-text institutional repository, though it is assumed that (for the present at least) it won't store large datasets
- Need to avoid duplication of metadata and ensure that users do not re-enter metadata
- Need to adopt the same identifiers for users, outputs, and datasets

Research Information System (RIS): proposed interaction
- Aim to push data.bris dataset records (summary metadata records, not the data itself) to the RIS, to provide a complete picture of research. This requires a metadata crosswalk
- Aim to minimise the amount of data entry and duplication in the data.bris system by using metadata that originates in the RIS
- Longer term: as the RIS's ability to describe research data evolves, it may be possible for the RIS to take over from data.bris as the definitive holder of metadata about research data
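A metadata crosswalk is, at its simplest, a table that maps each field in one schema to its counterpart in another. The sketch below shows the idea with a plain dictionary. The RIS field names on the right-hand side are entirely hypothetical, since the slides do not describe the RIS schema, and the function is an illustration, not the project's crosswalk.

```python
# Hypothetical mapping from Dublin Core terms to invented RIS field names.
CROSSWALK = {
    "title": "output_title",
    "creator": "contributors",
    "publisher": "publisher",
    "date": "publication_year",
    "identifier": "doi",
}


def apply_crosswalk(record, mapping=CROSSWALK):
    """Translate a record into the target schema.

    Returns the translated record and a list of source fields that
    have no counterpart in the mapping, so gaps can be reviewed.
    """
    mapped = {}
    unmapped = []
    for field, value in record.items():
        target = mapping.get(field)
        if target is None:
            unmapped.append(field)
        else:
            mapped[target] = value
    return mapped, unmapped


if __name__ == "__main__":
    # Sample values are invented for illustration.
    source = {
        "title": "Example dataset",
        "creator": ["Smith, J."],
        "identifier": "doi:10.1234/example",
        "format": "text/csv",
    }
    summary, leftovers = apply_crosswalk(source)
    print(summary)
    print("No RIS equivalent for:", leftovers)
```

Reporting the unmapped fields, instead of silently dropping them, makes it easier to see where the two schemas diverge.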

Ongoing investigation: related resources
- Appropriate use of metadata properties and attributes to describe the relationships between resources, e.g.:
  - the relationship between an article and the dataset on which the research is based
  - the relationship between different versions (revisions, updates, additions) of a dataset
  - the relationship between the various parts (supplements, subsets) of a dataset
- What constitutes a complete (citable) set of data?
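One way to picture these relationships is as typed links between identifiers. The sketch below is speculative, since the slides leave the choice of vocabulary open: it borrows a handful of relation type names from the DataCite Metadata Schema as an assumed vocabulary, and the class, the helper function, and all identifiers are hypothetical.

```python
from dataclasses import dataclass

# A small subset of DataCite relationType values, assumed here as the
# vocabulary; the slides do not say which vocabulary will be adopted.
RELATION_TYPES = {
    "IsSupplementTo",       # dataset underpins an article
    "IsNewVersionOf",       # revision or update of an earlier dataset
    "IsPreviousVersionOf",
    "IsPartOf",             # subset or supplement of a larger dataset
    "HasPart",
}


@dataclass(frozen=True)
class Relation:
    """A typed link: subject <relation_type> target."""
    subject: str
    relation_type: str
    target: str

    def __post_init__(self):
        if self.relation_type not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {self.relation_type}")


def relations_for(relations, identifier):
    """Return every relation in which the identifier takes part."""
    return [r for r in relations if identifier in (r.subject, r.target)]


if __name__ == "__main__":
    # All identifiers are invented for illustration.
    links = [
        Relation("doi:10.1234/data.v2", "IsNewVersionOf", "doi:10.1234/data.v1"),
        Relation("doi:10.1234/data.v2", "IsSupplementTo", "doi:10.5678/article"),
        Relation("doi:10.1234/subset", "IsPartOf", "doi:10.1234/data.v2"),
    ]
    for link in relations_for(links, "doi:10.1234/data.v2"):
        print(link)
```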

Ongoing investigation: stub deposits
- Provide for stub deposits, i.e., those consisting of metadata only and no associated research data, for when the data is held in an external repository or domain-specific data centre
- Use of the SWORD deposit protocol to enable this

Ongoing investigation: domain/subject-specific metadata
- Collecting descriptive information about data, e.g., annotations, notes, etc.
- Plugin support for a variety of tools
- Subject classification, e.g., JACS coding or LoC subject headings

Thank you. David Boyd. University of Bristol, Library Services/IT Services R&D/ILRT. d.j.boyd@bristol.ac.uk