Image Data, RDA and Practical Policies
|
|
|
- Amelia James
- 10 years ago
- Views:
Transcription
1 Image Data, RDA and Practical Policies Rainer Stotzka and many others KIT University of the State of Baden-Württemberg and National Laboratory of the Helmholtz Association
2 Data Life Cycle Lab Key Technologies Mostly scientific imaging projects with common needs in Data and metadata management High throughput analysis in near real time Data archiving and sharing High performance data ingest Automatic metadata extraction Visualization 2
3 Light Sheet Microscopy Novel microscope to image living Zebrafish embryos High 3D resolution Extreme short data acquisition time Growth of an embryo with in 16 h 16 TB raw data Acknowledgements: Andrey Kobitskiy, G. Ulrich Nienhaus Jens C. Otte, Masanari Takamiya, Uwe Strähle Johannes Stegmaier, Ralf Mikut Francesca Rindone, Volker Hartmann, Thomas Jejkal Achim Streit, Christopher Jung, Jos van Wezel Institute of Applied Physics Institute of Toxicology and Genetics Institute for Applied Computer Science Steinbuch Centre for Computing 3
4 Image Data Workflow Data Transfer Storage Preservation and Curation Sharing and Reuse 4
5 Image Data Production Data Transfer Storage Preservation and Curation Sharing and Reuse Experiment design Big Data: variety Data acquisition and quality control 200 MB/s 16 TB/d 5
6 Image Data Transfer Data Transfer Storage Data disposal : Data description Metadata Transfer to storage: 400 MB/s Preservation and Curation Sharing and Reuse 6
7 Image Data Storage Data Transfer Storage Big Data: volume, velocity Data access Security Preservation and Curation Sharing and Reuse 7
8 Image Data Data Transfer Storage Preservation and Curation Sharing and Reuse Big Data: value 8
9 Image Data Data Transfer Storage Big Data: value costly to be reproduced Worth to be preserved for future generations Sharing: Reproducible science Re-use: New (interdisciplinary) scientific outcomes Big Data: veracity Trust, well maintained Open Data Preservation and Curation Sharing and Reuse 9
10 Image Workflow Data Transfer Storage Preservation and Curation Sharing and Reuse Problems: HELP: Where is my data? HOW to manage, preserve, and to curate the data? 10
11 Image Data Repository Data Transfer Storage Preservation and Curation Repository Sharing and Reuse 11
12 Repository Repository: Managed location/destination/directory/bucket where digital data objects are Registered Permanently stored Made accessible and retrievable Curated E.g.: libraries & museums Digital data object (dataset): Consists of Data Description for re-use Digital Data Object (Dataset) Data (Data) Storage Organization Metadata 12
13 Repository System: KIT Data Manager Administrative Metadata Base MD Curation MD Sharing MD Policy MD MD OID Data Organization Metadata Content Metadata Repository System: Optimized for large scale and heterogeneous research data Internals are based on metadata Users & data curators don t see the complexity: Visible: Content metadata Discovery tool: search Representation of data 13
14 Repository Design Discover Access Curate Repository HELP: Where is my data? HOW to manage, preserve, and to curate the data? HOW to design the repository? Need for content metadata Need for rules for automation practical policies 14
15 RDA Working Group: Practical Policies Chairs: Reagan Moore, Rainer Stotzka Catalog with 11 important policy areas: Contextual metadata extraction Data access control Data backup Data format control Data retention Disposition Integrity (including replication) Notification Restricted searching Storage cost reports Use agreements 15
16 Example: Contextual Metadata Extraction Scenario: Extract metadata from an associated document, e.g. DICOM, OME,... Extract metadata from Discover Access Curate Repository 16
17 Example: Contextual Metadata Extraction Policy MD enable Contextual MD metadata extraction On file File_name On digital data object OID On user User_ID Attribute_name Extract metadata Attribute_value enable Data access control enable Data backup enable Data format control enable Data retention enable Disposition Expert knowledge required Data Life Cycle Lab Minimize the effort of manually recording metadata! 17
18 Policy Examples Policy: Contextual metadata extraction Policy: Restricted searching Policy: Data access control Discover Access Curate Repository Policy: Data backup Policy: Data integrity 18
19 Interaction with RDA Important groups for imaging repositories WG Practical Policies WG Data Foundation & Terminology PID groups Metadata groups IG Data Fabric Repository groups Domain related groups Typical PhD researcher Cooperates with domain partners Solves their data problems Specific research topic Monitors the activities of groups Transfer to the domain partners If funding available RDA plenaries Inhibition threshold: world experts Active in RDA groups Results: Motivation booster Networking Future State of the art 19
20 Conclusions Goal: Better and easier management of research data Sustain the value Enable sharing, reproducibility and re-use Enable better research RDA has real value and impact RDA is alive and growing, efficiency needs further improvement Technical Advisory Board, Organizational Advisory Board, Council, Secretariat, and individuals Plenaries (group sessions) produce results: 2 plenaries per year, low registration fee Need to invest: contributions gain results RDA is producing internationally connected data experts 20
Software Entwicklungen für das LSDF Datenmanagement
Software Entwicklungen für das LSDF Datenmanagement Rainer Stotzka, V. Hartmann, T. Jejkal,, P. Neuberger, S. Ochsenreither, F. Rindone, T. Schmidt, H. Pasic J. van Wezel, A. Garcia, R. Kupsch, S. Bourov,
Data Management at UT
Data Management at UT Maria Esteva, TACC, [email protected] Colleen Lyon, UT Libraries, [email protected] Angela Newell, ITS, [email protected] What is data management? systematic organization
Management of Research Data Procedure
Management of Research Data Procedure Related Policy Management of Research Data Policy Responsible Officer Deputy Vice Chancellor (Research) Approved by Deputy Vice Chancellor (Research) Approved and
OpenAIRE Research Data Management Briefing paper
OpenAIRE Research Data Management Briefing paper Understanding Research Data Management February 2016 H2020-EINFRA-2014-1 Topic: e-infrastructure for Open Access Research & Innovation action Grant Agreement
Globus Research Data Management: Introduction and Service Overview
Globus Research Data Management: Introduction and Service Overview Kyle Chard [email protected] Ben Blaiszik [email protected] Thank you to our sponsors! U. S. D E P A R T M E N T OF ENERGY 2 Agenda
The Australian War Memorial s Digital Asset Management System
The Australian War Memorial s Digital Asset Management System Abstract The Memorial is currently developing an Enterprise Content Management System (ECM) of which a Digital Asset Management System (DAMS)
USGS Guidelines for the Preservation of Digital Scientific Data
USGS Guidelines for the Preservation of Digital Scientific Data Introduction This document provides guidelines for use by USGS scientists, management, and IT staff in technical evaluation of systems for
Survey of Canadian and International Data Management Initiatives. By Diego Argáez and Kathleen Shearer
Survey of Canadian and International Data Management Initiatives By Diego Argáez and Kathleen Shearer on behalf of the CARL Data Management Working Group (Working paper) April 28, 2008 Introduction Today,
EUDAT. Towards a pan-european Collaborative Data Infrastructure. Willem Elbers
EUDAT Towards a pan-european Collaborative Data Infrastructure Willem Elbers EUDAT / MPI-TLA Focus meeting: Data repositories SURF, Utrecht March 3, 2014 Outline EUDAT project EUDAT services Summary and
Long-term archiving and preservation planning
Long-term archiving and preservation planning Workflow in digital preservation Hilde van Wijngaarden Head, Digital Preservation Department National Library of the Netherlands The Challenge: Long-term Preservation
Cloud computing based big data ecosystem and requirements
Cloud computing based big data ecosystem and requirements Yongshun Cai ( 蔡 永 顺 ) Associate Rapporteur of ITU T SG13 Q17 China Telecom Dong Wang ( 王 东 ) Rapporteur of ITU T SG13 Q18 ZTE Corporation Agenda
DMBI: Data Management for Bio-Imaging.
DMBI: Data Management for Bio-Imaging. Questionnaire Data Report 1.0 Overview As part of the DMBI project an international meeting was organized around the project to bring together the bio-imaging community
DIGITAL STEWARDSHIP SUPPLEMENTARY INFORMATION FORM
DIGITAL STEWARDSHIP SUPPLEMENTARY INFORMATION FORM Introduction The Institute of Museum and Library Services (IMLS) is committed to expanding public access to federally funded research, data, software,
Research Data Management Guide
Research Data Management Guide Research Data Management at Imperial WHAT IS RESEARCH DATA MANAGEMENT (RDM)? Research data management is the planning, organisation and preservation of the evidence that
Best Practices for Data Management. RMACC HPC Symposium, 8/13/2014
Best Practices for Data Management RMACC HPC Symposium, 8/13/2014 Presenters Andrew Johnson Research Data Librarian CU-Boulder Libraries Shelley Knuth Research Data Specialist CU-Boulder Research Computing
The Concept of Big Data Reference Model
2013-11-14 ISO/IEC JTC1/SC32/WG2N1853 The Concept of Reference Model Sungjoon Lim, KoDB*, [email protected] Dongwon Jeong, KNU**, [email protected] Jangwon Gim, KISTI***, [email protected] Hanmin Jung,
Interagency Science Working Group. National Archives and Records Administration
Interagency Science Working Group 1 National Archives and Records Administration Establishing Trustworthy Digital Repositories: A Discussion Guide Based on the ISO Open Archival Information System (OAIS)
Records Management and SharePoint 2013
Records Management and SharePoint 2013 SHAREPOINT MANAGEMENT, ARCHITECTURE AND DESIGN Bob Mixon Senior SharePoint Architect, Information Architect, Project Manager Copyright Protected by 2013, 2014. Bob
SURFsara Data Services
SURFsara Data Services SUPPORTING DATA-INTENSIVE SCIENCES Mark van de Sanden The world of the many Many different users (well organised (international) user communities, research groups, universities,
Lesson 3: Data Management Planning
Lesson 3: CC image by Joe Hall on Flickr What is a data management plan (DMP)? Why prepare a DMP? Components of a DMP NSF requirements for DMPs Example of NSF DMP CC image by Darla Hueske on Flickr After
THE BRITISH LIBRARY. Unlocking The Value. The British Library s Collection Metadata Strategy 2015-2018. Page 1 of 8
THE BRITISH LIBRARY Unlocking The Value The British Library s Collection Metadata Strategy 2015-2018 Page 1 of 8 Summary Our vision is that by 2020 the Library s collection metadata assets will be comprehensive,
How To Write A Blog Post On Globus
Globus Software as a Service data publication and discovery Kyle Chard, University of Chicago Computation Institute, [email protected] Jim Pruyne, University of Chicago Computation Institute, [email protected]
Report of the DTL focus meeting on Life Science Data Repositories
Report of the DTL focus meeting on Life Science Data Repositories Goal The goal of the meeting was to inform and discuss research data repositories for life sciences. The big data era adds to the complexity
Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences
Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences WP11 Data Storage and Analysis Task 11.1 Coordination Deliverable 11.2 Community Needs of
Ingrid Hsieh Yee, Professor School of Library & Information Science Catholic University of America
Ingrid Hsieh Yee, Professor School of Library & Information Science Catholic University of America Best Practices Exchange, Dec. 4, 2012, Annapolis, MD 21 st Century Information Professionals Manage data,
Data Management Plan. Name of Contractor. Name of project. Project Duration Start date : End: DMP Version. Date Amended, if any
Data Management Plan Name of Contractor Name of project Project Duration Start date : End: DMP Version Date Amended, if any Name of all authors, and ORCID number for each author WYDOT Project Number Any
Addressing Risk Data Aggregation and Risk Reporting Ben Sharma, CEO. Big Data Everywhere Conference, NYC November 2015
Addressing Risk Data Aggregation and Risk Reporting Ben Sharma, CEO Big Data Everywhere Conference, NYC November 2015 Agenda 1. Challenges with Risk Data Aggregation and Risk Reporting (RDARR) 2. How a
Why long time storage does not equate to archive
Why long time storage does not equate to archive Jos van Wezel HUF Toronto 2015 STEINBUCH CENTRE FOR COMPUTING - SCC KIT University of the State of Baden-Württemberg and National Laboratory of the Helmholtz
Research Data Management Policy
Research Data Management Policy Version Number: 1.0 Effective from 06 January 2016 Author: Research Data Manager The Library Document Control Information Status and reason for development New as no previous
The Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets
The Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets!! Large data collections appear in many scientific domains like climate studies.!! Users and
Data Management in NeuroMat and the Neuroscience Experiments System (NES)
Data Management in NeuroMat and the Neuroscience Experiments System (NES) Kelly Rosa Braghetto Department of Computer Science Institute of Mathematics and Statistics - University of São Paulo 1st NeuroMat
Data Management Plans & the DMPTool. IAP: January 26, 2016
Data Management Plans & the DMPTool IAP: January 26, 2016 Data Management Services @ MIT Libraries Workshops/Webinars Web guide: http://libraries.mit.edu/data-management Individual consultations includes
IST687 Scientific Data Management
1 IST687 Scientific Data Management Spring 2012 Instructor: Jian Qin Email: [email protected] Office: 311 Hinds Hall Phone: 315-443-5642 Time: any time Location: anywhere Course Description The Scientific Data
Archiving the Web: the mass preservation challenge
Archiving the Web: the mass preservation challenge Catherine Lupovici Chargée de Mission auprès du Directeur des Services et des Réseaux Bibliothèque nationale de France 1-, Koninklijke Bibliotheek, Den
National Earth System Science Data Sharing Platform of China
RDA Fifth Plenary Session: Large Scale Projects meet RDA National Earth System Science Sharing Platform of China Yunqiang Zhu, Jia Song, Mo Wang, Xiaoxuan Wang Department of Geo-data Science and Sharing
Exploitation of ISS scientific data
Cooperative ISS Research data Conservation and Exploitation Exploitation of ISS scientific data Luigi Carotenuto Telespazio s.p.a. Copernicus Big Data Workshop March 13-14 2014 European Commission Brussels
E-Content Service Group Virtual Meeting. Digital Preservation: How to Get Started
E-Content Service Group Virtual Meeting Digital Preservation: How to Get Started Slide 2 of 17 Agenda Committee Members E-Content Purpose Presentation by Greg Zick Questions Discussion for May Meeting
Data Management Resources at UNC: The Carolina Digital Repository and Dataverse Network
Data Management Resources at UNC: The Carolina Digital Repository and Dataverse Network November 16, 2010 Data Management Short Course Series Sponsored by the Odum Institute and the UNC Libraries Campus
North Carolina Digital Preservation Policy. April 2014
North Carolina Digital Preservation Policy April 2014 North Carolina Digital Preservation Policy Page 1 Executive Summary The North Carolina Digital Preservation Policy (Policy) governs the operation,
Checklist: Persistent identifiers
Checklist: Persistent identifiers Persistent Identifiers (PID) are unique character strings 1 attached to various items. Attached to digital items, they are a prerequisite for linked data. Persistent identifiers
LabArchives Electronic Lab Notebook:
Electronic Lab Notebook: Cloud platform to manage research workflow & data Support Data Management Plans Annotate and prove discovery Secure compliance Improve compliance with your data management plans,
Library Strategic Planning
Library Strategic Planning Develop campus partnerships to collect, manage, share, and preserve Georgia Tech digital research data 1. Needs Assessment Research data survey & interviews NSF DMP content analysis
How To Useuk Data Service
Publishing and citing research data Research Data Management Support Services UK Data Service University of Essex April 2014 Overview While research data is often exchanged in informal ways with collaborators
A grant number provides unique identification for the grant.
Data Management Plan template Name of student/researcher(s) Name of group/project Description of your research Briefly summarise the type of your research to help others understand the purposes for which
Towards a Domain-Specific Framework for Predictive Analytics in Manufacturing. David Lechevalier Anantha Narayanan Sudarsan Rachuri
Towards a Framework for Predictive Analytics in Manufacturing David Lechevalier Anantha Narayanan Sudarsan Rachuri Outline 2 1. Motivation 1. Why Big in Manufacturing? 2. What is needed to apply Big in
Data Management Exercises
Data Management Exercises Exercise 1 Defining Post-Graduate Research Data 1. Discuss your research project and research data in groups of 3-4 Think about the following questions: What is your research
Policy Policy--driven Distributed driven Distributed Data Management (irods) Richard M arciano Marciano marciano@un marciano @un.
Policy-driven Distributed Data Management (irods) Richard Marciano [email protected] Professor @ SILS / Chief Scientist for Persistent Archives and Digital Preservation @ RENCI Director of the Sustainable
Why archiving erecords influences the creation of erecords. Martin Stürzlinger scopepartner Vienna, Austria
Why archiving erecords influences the creation of erecords Martin Stürzlinger scopepartner Vienna, Austria Electronic Records In a Productive System Created Used Changed Deleted In an Archival System No
Digital Preservation Strategy, 2012-2015
Digital Preservation Strategy, 2012-2015 Preface This digital preservation strategy sets out what the National Library of Wales (NLW) intends to do to preserve digital materials over the next three years.
INTEGRATED RULE ORIENTED DATA SYSTEM (IRODS)
INTEGRATED RULE ORIENTED DATA SYSTEM (IRODS) Todd BenDor Associate Professor Dept. of City and Regional Planning UNC-Chapel Hill [email protected] http://irods.org/ SESYNC Model Integration Workshop Important
On the Radar: Tessella
On the Radar: Tessella Creating an archive for the long-term preservation of digital content Reference Code: IT014-002789 Publication Date: 04 Sep 2013 Author: Sue Clarke SUMMARY Catalyst Ensuring that
Data Publishing Workflows with Dataverse
Data Publishing Workflows with Dataverse Mercè Crosas, Ph.D. Twitter: @mercecrosas Director of Data Science Institute for Quantitative Social Science, Harvard University MIT, May 6, 2014 Intro to our Data
Strategies and New Technology for Long Term Preservation of Big Data
Strategies and New Technology for Long Term Preservation of Big Data Presenter: Sam Fineberg, HP Storage Co-Authors: Simona Rabinovici-Cohen, IBM Research Haifa Roger Cummings, Antesignanus Mary Baker,
How To Understand And Understand The Science Of Astronomy
Introduction to the VO [email protected] ESAVO ESA/ESAC Madrid, Spain The way Astronomy works Telescopes (ground- and space-based, covering the full electromagnetic spectrum) Observatories Instruments
RESEARCH DATA MANAGEMENT POLICY
Document Title Version 1.1 Document Review Date March 2016 Document Owner Revision Timetable / Process RESEARCH DATA MANAGEMENT POLICY RESEARCH DATA MANAGEMENT POLICY Director of the Research Office Regular
Outline of a Research Data Management Policy for Australian Universities / Institutions
Outline of a Research Data Management Policy for Australian Universities / Institutions This documents purpose: This document is intended as a basic starting point for institutions that are intending to
Are You Big Data Ready?
ACS 2015 Annual Canberra Conference Are You Big Data Ready? Vladimir Videnovic Business Solutions Director Oracle Big Data and Analytics Introduction Introduction What is Big Data? If you can't explain
Harvard Library Preparing for a Trustworthy Repository Certification of Harvard Library s DRS.
National Digital Stewardship Residency - Boston Project Summaries 2015-16 Residency Harvard Library Preparing for a Trustworthy Repository Certification of Harvard Library s DRS. Harvard Library s Digital
A Service for Data-Intensive Computations on Virtual Clusters
A Service for Data-Intensive Computations on Virtual Clusters Executing Preservation Strategies at Scale Rainer Schmidt, Christian Sadilek, and Ross King [email protected] Planets Project Permanent
NIDA Big Data Strategic Planning Workgroup
Big Data Strategic Planning Workgroup June 17, 2015 Co- Chairs: Roger Little, Ph.D. () Massoud Vahabzadeh, Ph.D. () Today s Agenda Welcome Review of timeline, work group priorities/activities and June
Using the Grid for the interactive workflow management in biomedicine. Andrea Schenone BIOLAB DIST University of Genova
Using the Grid for the interactive workflow management in biomedicine Andrea Schenone BIOLAB DIST University of Genova overview background requirements solution case study results background A multilevel
Long Term Preservation of Earth Observation Space Data. Preservation Workflow
Long Term Preservation of Earth Observation Space Data Preservation Workflow CEOS-WGISS Doc. Ref.: CEOS/WGISS/DSIG/PW Data Stewardship Interest Group Date: March 2015 Issue: Version 1.0 Preservation Workflow
