The ISPS Data Archive: Mission, Work, and Some Reflections

Similar documents
AN OPEN SOURCE DDI-BASED DATA CURATION SYSTEM FOR SOCIAL SCIENCE DATA. NADDI Vancouver, Canada

Developing software to prepare social science research data and code for sharing and preservation

Benefits of managing and sharing your data

NIH Commons Overview, Framework & Pilots - Version 1. The NIH Commons

LJMU Research Data Policy: information and guidance

Checklist for a Data Management Plan draft

Strategic Plan

Integrated Information Services (IIS) Strategic Plan

DDI Lifecycle: Moving Forward Status of the Development of DDI 4. Joachim Wackerow Technical Committee, DDI Alliance

Research Data Management Services. Katherine McNeill Social Sciences Librarians Boot Camp June 1, 2012

Invenio: A Modern Digital Library for Grey Literature

DIGITAL STEWARDSHIP SUPPLEMENTARY INFORMATION FORM

EUROPEAN COMMISSION Directorate-General for Research & Innovation. Guidelines on Data Management in Horizon 2020

The World Agroforestry centre Policy Series. Research Data Management Policy

OpenAIRE Research Data Management Briefing paper

Globus Research Data Management: Introduction and Service Overview

Canadian National Research Data Repository Service. CC and CARL Partnership for a national platform for Research Data Management

Data Management Resources at UNC: The Carolina Digital Repository and Dataverse Network

ODUM INSTITUTE ARCHIVE SERVICES OVERVIEW IASSIST 2015

New Developments in Data Sharing, Remote Access, Secure Data, and Documentation at the Cornell Institute for Social and Economic Research (CISER)

Releasing High Quality Applications More Quickly with vrealize Code Stream

Print Management customers have a print first mentality Print management tools are utilized daily by the majority of users in an organization

Enhanced Research Data Management and Publication with Globus

Data Publishing Workflows with Dataverse

How To Useuk Data Service

SHared Access Research Ecosystem (SHARE)

Packrat: A Dependency Management System for R

Globus Research Data Management: Introduction and Service Overview. Steve Tuecke Vas Vasiliadis

ESRC Research Data Policy

PsychData Sharing research data in Psychology

E-Content Service Group Virtual Meeting. Digital Preservation: How to Get Started

Research and the Scholarly Communications Process: Towards Strategic Goals for Public Policy

INFORMATION MANAGEMENT STRATEGIC FRAMEWORK GENERAL NAT OVERVIEW

Intro to Data Management. Chris Jordan Data Management and Collections Group Texas Advanced Computing Center

Open Access and Open Research Data in Horizon 2020

A Component of Professional Skills Workshops for Graduate Research Students

THE BRITISH LIBRARY. Unlocking The Value. The British Library s Collection Metadata Strategy Page 1 of 8

Data Management in NeuroMat and the Neuroscience Experiments System (NES)

escidoc: una plataforma de nueva generación para la información y la comunicación científica

Research Data Archival Guidelines

Open Access to publications and research data in Horizon 2020

CROSS PLATFORM AUTOMATIC FILE REPLICATION AND SERVER TO SERVER FILE SYNCHRONIZATION

Digital Forensics Tutorials Acquiring an Image with FTK Imager

Introduction to the Research Data Center PsychData

How To Transform It Risk Management

Best Practices for Data Management. RMACC HPC Symposium, 8/13/2014

DataShare & Data Audit. Lessons Learned. Robin Rice. Digital Curation Practice, Promise and Prospects

Ex Libris Rosetta: A Digital Preservation System Product Description

Glassfish Architecture.

Bioinformatics for programmers

Business Startups - Advantages of Using Automation

The Data Management Plan with. Dataverse. Mercè Crosas, Ph.D. Director of Product Development

Table of contents. Enterprise Resource Planning (ERP) functional testing best practices: Ten steps to ERP systems reliability

THE UNIVERSITY OF LEEDS. Vice Chancellor s Executive Group Funding for Research Data Management: Interim

WHAT SHOULD NSF DATA MANAGEMENT PLANS LOOK LIKE

ENTERPRISE CONTENT MANAGEMENT. Which one is best for your organisation?

BioMed Central s position statement on open data

IBM Enterprise Content Management: Streamlining operations for environmental compliance

Validating Enterprise Systems: A Practical Guide

How To Write A Blog Post On Globus

Meister Going Beyond Maven

PSCIOC IM Sub-Committee E-Records Project

Small is beautiful: the value proposition for libraries as publishers using open source systems

Amazon Web Services. Lawrence Berkeley LabTech Conference 9/10/15. Jamie Baker Federal Scientific Account Manager AWS WWPS

Open Access to Manuscripts, Open Science, and Big Data

Edinburgh Napier University. Research Data Management Policy

Enterprise Report Management CA View, CA Deliver, CA Dispatch, CA Bundl, CA Spool, CA Output Management Web Viewer

Digital preservation a European perspective

Strategic Plan University Libraries Virginia Tech

Extracting and Preparing Metadata to Make Video Files Searchable

ICT Strategy for the Secretariat

Digital Repository Services for Managing Research Data: What Do Oxford Researchers Need?

EverSuite. Enterprise Content Management Solutions. Capture. Manage. Store. Preserve. Publish

Texas State University University Library Strategic Plan

Quality Review of Energy Data:

A Binary Tree SMART CTO Webinar. Analyzing and Rationalizing Big Data in Messaging

Realizing business flexibility through integrated SOA policy management.

OPERATIONALIZING EXCELLENCE IN THE GLOBAL REGULATORY SUBMISSIONS PROCESS

Contemporary Composers Web Archive (CCWA): Progress in Collaboratively Collecting Composers' Websites

Protecting Data-at-Rest with SecureZIP for DLP

Introduction to Research Data Management. Tom Melvin, Anita Schwartz, and Jessica Cote April 13, 2016

TIMESTORM " MIND AND TIME: INVESTIGATION OF THE TEMPORAL TRAITS OF HUMAN-MACHINE CONVERGENCE"

In 2014, the Research Data Purdue University

Interoperable Cloud Storage with the CDMI Standard

Working with the British Library and DataCite Institutional Case Studies

Data Management Plan. Name of Contractor. Name of project. Project Duration Start date : End: DMP Version. Date Amended, if any

DevOps: Roll out new software and functionality quicker with high velocity DevOps

A Service for Data-Intensive Computations on Virtual Clusters

Research Data Management Guide

Report of the DTL focus meeting on Life Science Data Repositories

Virginia Systems Repository (VSR): Data Repositories DHS/FEMA/PIA 038(a)

DICTATION & TRANSCRIPTION. INFO@DOLBEY.com

Research Data Management - The Essentials

Introduction to the ITIL Service Management Framework

Connecting Basic Research and Healthcare Big Data

DIGITAL ARCHIVES & PRESERVATION SYSTEMS

Global Headquarters: 5 Speen Street Framingham, MA USA P F

Extending Microsoft SharePoint Environments with EMC Documentum ApplicationXtender Document Management

Resource Management. Resource Management

Digital Asset Management Developing your Institutional Repository

Transcription:

The ISPS Data Archive: Mission, Work, and Some Reflections

http://isps.yale.edu

Archive Embedded in ISPS Website Limor Peer Yale University April 2016 http://isps.yale.edu/research/data

ISPS Data Archive: Basic Facts A digital repository for research produced by scholars affiliated with ISPS, with special focus on experimental design and methods. Associated data are primarily quantitative and gathered from a combination of administrative records, surveys, and observations. The Archive accepts research outputs underlying scientific publications including data, code, output, documentation, and other research artefacts. The Archive provides free and public access to research materials and accepts content for distribution only under a Creative Commons license. The Archive was launched in September 2010 as a pilot at Yale. (We learned a lot, and still are.) The Archive is managed by a full time professional with knowledge of the specific research domain and graduate student research assistants who handle most of the data curation. http://isps.yale.edu/research/data In: Curating Research Data Volume 2, (forthcoming)

ISPS Data Archive: Mission The ISPS Data Archive was built to create an open access digital collection of social science experimental data, metadata, code, and associated files produced by ISPS researchers, for the purpose of replication of research findings, further analysis, and teaching.

Curation for Quality & Reproducibility Data Quality Review Framework Peer, Green, and Stephenson. 2014. Committing to Data Quality Review. International Journal of Digital Curation, 19(1): 263-291. doi:10.2218/ijdc.v9i1.317

Curation Tasks for Quality & Reproducibility Performed by the ISPS Data Archive pre-publication: 1. Check for missing variable labels and value codes. 2. Review observation count 3. Identify potential data errors 4. Compare questionnaire, codebook, & data 5. Ensure there is no personally-identifiable information (PII) 6. Confirm code executes 7. Confirm code produces reported results 8. Create open formats In: IASSIST Quarterly (January 2016)

New Curation Tool: Curator @Yale Structures and tracks the curation workflow Helps automate parts of the review pipeline Captures all metadata throughout the process Pushes out relevant information to predetermined destinations o i.e., a user, the archive administrators, a Web based dissemination system, or preservation systems Can fit into repository and research workflows

New Curation Tool: Curator @Yale Leverages DDI Lifecycle Machine executable, open structured format supports research transparency Plays a part in review tasks study, file & variable levels Modular, open-source Could be adapted to changing needs, research methods, dissemination platforms, and preservation solutions Could be used by repositories, researchers, and research staff Flexible deployment and configuration options

New Curation Tool: Curator @Yale

New Curation Tool: Curator @Yale

New Curation Tool: Curator @Yale

New Curation Tool: Curator @Yale

New Curation Tool: Curator @Yale

Reflections: What s Next? Max Weber German sociologist, philosopher, political economist Inevitable and linear trend toward increasing rationalization, systemization, and routinization Will curation for quality & reproducibility become routinized?

Curation for Quality & Reproducibility Will Become Routinized When 1. Research data policies & culture mature Wide recognition that curating for quality and reproducibility is a valued goal Researchers are rewarded for quality of materials shared (e.g., badges) o OSF, ACM-DL Gatekeeping is acknowledged as essential function an

Curation for Quality & Reproducibility Will Become Routinized When 2. It is easier and cheaper to do It can be better automated o Curator tool, APIs Curator @Yale Computational methods evolve to readily capture the entire workflow and preserve it o R knitr, R Sweave, Docker, ReproZip

Curation for Quality & Reproducibility Will Become Routinized When 3. It is a part of active research process http://www.thesleuthjournal.com/military-co-option-media-entertainment/

Curation for Quality & Reproducibility Will Become Routinized When 1. Research data policies & culture mature 2. It is easier and cheaper to do 2. It is a part of active research process Can Curation Prevent the Next Data Sharing Disaster? http://isps.yale.edu/news/blog/2016/06/can-curation-prevent-the-next-data-sharing-disaster

Thank you! Limor Peer limor.peer@yale.edu @l_peer