Data Archiving in Energy Research. Iver Lauermann Helmholtz-Zentrum Berlin für Materialien und Energie (HZB)





Outline
- What is archiving?
- Why archiving of scientific data?
- Rules at HZB for archiving
- Recent history of archiving at HZB
- Discussion: what, who, when, where
- Data formats
- Obstacles to archiving in everyday work
- Examples for archiving at HZB
  - Electronic lab book
  - Scanning of paper lab book
  - Storage of large amounts of data
- Conclusions

What is archiving of data?
- Archiving: secure, long-term storage of scientific data and methods
  - secure: self-burned CDs or DVDs are not durable
  - long-term: electronic data formats change quickly and hardware fails with time, so formats need to be adapted
- Requires metadata, i.e. a description of the stored data that allows (expert) users to interpret the archived data
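One way to keep such a description with the data is a small "sidecar" file stored next to each data file. The following is a minimal sketch, not a format used at HZB: the field names, the `MetadataRecord` class, the `write_sidecar` function and the `.meta.json` suffix are all assumptions chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import List


@dataclass
class MetadataRecord:
    """Hypothetical minimal description of one archived data file."""
    file_name: str
    description: str   # what was measured and why
    creator: str       # who recorded the data
    method: str        # measurement or preparation method
    file_format: str   # e.g. "CSV", so later readers know how to open it
    columns: List[str] = field(default_factory=list)  # column meaning and units


def write_sidecar(data_file: Path, record: MetadataRecord) -> Path:
    """Write the record as JSON next to the data file and return the sidecar path."""
    sidecar = data_file.with_name(data_file.name + ".meta.json")
    sidecar.write_text(
        json.dumps(asdict(record), indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return sidecar
```

A plain-text format such as JSON is used here because it stays readable even if the software that wrote it is no longer available.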

Good reasons to archive data
- Makes later use of data by yourself or others possible!
  - usually only a fraction of the available data is published
  - new knowledge makes later use worthwhile
  - for publishing data, archiving is a necessary condition
- Preservation of evidence for data from a publication
  - In many alleged cases of scientific misconduct the original data had disappeared

Bad examples: the NASA effect
- Data from the space probe Pioneer were stored in 4 different formats in 1979; however, in 1994 no hardware was available at NASA to retrieve the data from any of them. [1]
- Data from 30 years of space exploration were stored on 1.2 million magnetic tapes in 1976, but in the mid-nineties many were useless due to insufficient description on the tapes. [2]

[1] Hilmar Schmundt: Im Dschungel der Formate. In: Der Spiegel 26/2000. URL: http://www.spiegel.de/druckversion/0,1588,82510,00.html
[2] Archimedes. Wir verlieren unser Gedächtnis, dated 04.05.1999. URL: http://www.artetv.com/hebdo/archimed/19990504/dtext/sujet1.html

History
- Initiated by Wolfgang Fritsch (HMI library), 2 working groups on archiving (structure of matter department and energy department) were established early in 2007
- In 2007/2008 the groups discussed the current situation and wrote 2 papers with recommendations
- July 2008: 2-day workshop on archiving at HMI with participation of BESSY and external speakers
- Final paper
- December 2010: announcement of archive server by IT department

Recommendations: what, who, when, where, how
- What: initially only data on which a publication is based have to be archived, but in the long run all research data should be archived
- Who: the head of the organizational unit (i.e. institute director) is responsible, but each scientist should be trained in archiving and be responsible for it
- When: on a regular basis; after a publication has appeared
- Where: central storage and support by IT are necessary
- How: as easy for users as possible, as flexible as possible, including current methods

Why archiving of data?
THE RULES FOR ARCHIVING SCIENTIFIC DATA AT HZB
Any scientific (raw) data on which some publication is based is documented and securely stored, for at least 10 years, at the organisational unit (department, institute) of its origin.
(from "Regeln zur Sicherung guter wissenschaftlicher Praxis und zum Verfahren bei wissenschaftlichem Fehlverhalten" [Rules for safeguarding good scientific practice and for the procedure in cases of scientific misconduct], 14.06.2002; based on the Proposals for Safeguarding Good Scientific Practice, January 1998, by the Deutsche Forschungsgemeinschaft (DFG))
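The 10-year minimum can be turned into a concrete date for each data set. The sketch below assumes that the period is counted from the publication date; the rule as quoted here does not state the starting point, and the function name `retention_end` is hypothetical.

```python
from datetime import date


def retention_end(publication_date: date, years: int = 10) -> date:
    """Earliest date on which the minimum retention period has elapsed.

    Assumption: the period starts on the publication date.
    """
    target_year = publication_date.year + years
    try:
        return publication_date.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year.
        # Use 1 March so the retention period is never shortened.
        return date(target_year, 3, 1)
```

Because the rule says "at least" 10 years, the leap-day case rounds up rather than down.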

Where?
- Since Dec. 2010 there is an official archive server at HZB
- Data are stored on magnetic tape and cannot be accessed unless removed from the archive
- The path \\home\nb-archiv\dauer\ordnername can be addressed by every internal HZB computer and filled with data
- The content of folders is copied onto several magnetic tapes when no more changes are detected, then removed from the hard drive
- Archiving over a period of 5 or 10 years
- Archives are checked once a year to ensure legibility
- Only members of the respective department can access the data
(from FM-D info, "Neues und Wissenswertes aus der zentralen Datenverarbeitung", May 2011)
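A yearly legibility check can be supported on the user side by recording a checksum for every file before the folder is handed over, and comparing against it whenever the data are retrieved. This is a minimal sketch of such a fixity check and not a description of how the HZB archive server works; the function names `build_manifest` and `verify` are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> Dict[str, str]:
    """Map each file's path relative to the folder to its checksum."""
    return {
        path.relative_to(folder).as_posix(): sha256_of(path)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def verify(folder: Path, manifest: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means every file is intact."""
    problems = []
    for relative_name, expected in manifest.items():
        target = folder / relative_name
        if not target.is_file():
            problems.append(f"missing: {relative_name}")
        elif sha256_of(target) != expected:
            problems.append(f"changed: {relative_name}")
    return problems
```

The manifest itself would need to be stored together with the metadata, for example as a JSON file inside the archived folder.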

Publication amendment form

Rules for orphaned data
The existing rules were amended late in 2010 with regulations about orphaned, or rather soon-to-be-orphaned, data, i.e. data from scientific organisational units which are being discontinued, in case there is no other scientific organisational unit which can take care of such data. In such a case the data should be documented and taken care of
- by the library for data in paper form
- by IT services for digital data
(HZB Intranet, Bibliothek, 23 May 2011)

Problems with archiving in the energy department
- Experiments are manifold and complex, i.e. multi-step preparation of materials and thin films (absorbers, buffer and window layers), characterization of materials and devices by a multitude of methods
- Data are generated in many different electronic formats or on paper
- The volume of data is rather small, but there is a large amount of variable metadata

Workflow in the PV department
[Figure: workflow diagram. Recoverable labels: planning/simulation, external samples, sample preparation, measurements (in-situ), test structure, basic research, processing (ARC, contacts, ...), solar cell, quality check (I-V, IPCE, ...)]

Problems with archiving in the energy department
- Research is largely carried out by graduate students; high staff turnover makes archiving difficult but especially necessary
- No common format or specific rules; the quality of archived data depends on individuals
- Extra workload for scientists, who often don't see the necessity
- Archiving of raw data without a link to a publication is often regarded as neither necessary nor feasible

Archiving of all data vs. only publication data
- Current: archiving of publication data
- Mid to long term: archiving of all data
- Disadvantage: archiving only publication-relevant data is extra overhead because it is not integrated into a normal workflow of acquiring and automatically archiving data
- However, it is debatable whether complete archiving of all data and metadata is practical and desirable

Examples of metadata formats
Electronic lab book: data table

Examples of metadata formats
Electronic lab book: search mask

Examples of metadata formats
Scanned lab book for spectroscopic metadata

Situation at the Berlin Neutron Scattering Center BENSC
[Figure: diagram of the current state and the planned data flow. Recoverable labels:
- BENSC instrumentation: E-Hall: E1, E2, E3, E4, E5, E6, E7, E9, E10; V-Hall: V1, V3, V6, V7, V12; NLH2: V5, V15, V16
- sample environment; experiment control: CARESS, MAD, IGOR
- BADGE portal for BENSC users: experiment-specific data, proposal number, etc. (metadata)
- raw data, log files (automatic copy); NEXUS data format
- central IT: Oracle database, web portal, metadata server
- SAN file saving: metadata (experiment, year, month, run number); raw data + complete experiment description (metadata)]

Conclusions
- Data archiving is necessary and compulsory at HZB
- Centralized storage space for paper-based and electronic data is available
- So far archiving is done only in the departments on local computers
- In most departments only publication-related data are archived
- The quality of archiving varies widely and depends on individuals
- Very few common standards, formats and rules are established
- Improvement through education and better support is necessary

Acknowledgements
- Wolfgang Fritsch (HZB library)
- Jens Uwe Hoffmann (structure of matter department)
- Marcus Kühbacher (structure of matter department)
- Christof Rethfeldt (structure of matter department)
- Andreas Schöpke (energy department)
- Klaus Schwarzburg (energy department)
And you for your attention!