Forschungsdatenmanagement mit escidoc Matthias Razum Workshop DOI-Registrierung Hannover, 3.11.2011
Data in the Publication Process Today Library Publication Private Files Manuscript Data Metadata Research J. Helly, H. Staudigel, and A. Koppers, Scalable models of data sharing in Earth sciences, Geochem. Geophys. Geosyst., 4(1),1010, doi:10.1029/2002gc000318, 2003 Page 2
Research Data Management A Pressing Issue Several researchers describe particular research support services they lack. Data management and storage topped the list of gaps in research tools and services. A Slice of Research Life: Information Support for Research in the US S. Kroll and R. Forsman: A Slice of Research Life: Information Support for Research in the United States, Report commissioned by OCLC Research in support of the RLG partnership, published online at http://www.oclc.org/research/publications/library/2010/2010-15.pdf, June 2010 Page 3
Research Data Management Requires more than Storing Bits and Bytes A researcher in digital arts says he needs something to handle all the media and metadata. Data access and formatting are not standardised, even within one lab. A Slice of Research Life: Information Support for Research in the US Page 4
Research Data Management Turns Data into Information Equally, if not more important than its own data and information needs, today s research community must also assume responsibility for building a robust data and information infrastructure for the future. International Council for Science Scientific Data and Information A report of the CSPR Assesssment Panel. International Council for Science, 2004 Page 5
Research Data Management Multiple Dimensions Technical Organizational Scientific Social Legal Page 6
escidoc Vision Collaboration Idea Exploration Data Acquisition Experiment Aggregation Analysis Publication Archival escidoc Applications, Services, and exisiting Tools escidoc Infrastructure Continuum of Data Page 7
Collaboration, Publication, and Curation Boundary Treloar, A., Groenewegen, D. und Harboe-Ree, C., 2007. The Data Curation Continuum. D-Lib Magazine, 13 (9/10) Page 8
escidoc Services Data Acquisition Service Control of Named Entities Deposit (SWORD) Citation Style Service Transformation Validation Search and Indexing Technical Metadata Extraction Duplicate Detection PID Service Digilib OAI-PMH Admin Tool Deposit Service Policy Enforcement Point Group Handler Role Handler User Account Handler Admin Service Organizational Unit Service Context Service Container Service Item Service Content Model Service Set Service Scope Service Statistics Data Service Aggregation Definition Service Report Definition Service Report Service Statistics Manager Object Manager Security Page 9
Research Data Differs from Traditional Publications Flexible content models Support for all file formats Compound objects Arbitrary metadata profiles Page 10
Item Object Pattern Container Item Metadata Item Metadata Thumbnail Web Resolution Original Resolution Thumbnail Web Resolution Original Resolution Metadata Metadata Metadata Metadata Metadata Metadata Page 11
Reliable Citations Concept of a Trusted Repository Versioning (including stable released versions) Persistent identifiers Withdrawal instead of deletions Page 12
Persistent Identification and Linking PID Service PID 1 PID 2 PID 3 PID 4 escidoc Other Repositories Page 13
ismeasuredwith Organizational Unit Institut Organizational Unit Work Group User Account Context Nano Lab Context Rigs Context Instruments consistsof Container Study X Other Studies Rig Descriptions Instrument Descriptions Container Investigation 1 Other Investigations for Project X requires Other Datasets for Investigation 1 Container Dataset Item Sample isconfigurationof Item Datafile Item Configuration Page 14
Integration with Existing Tools/Instruments Open programming interfaces (REST-style) Application-agnostic design Extensibility (SOA) CRIG claim Page 15
Preservation Provenance Data Audit Trail PREMIS Event History Standardized XML Object Containers Simple Backup/Rebuild Capabilities Page 16
BW-eLabs BW-eLabs gives access to virtual and remote experiments in the field of nano technology and digital holography. Key concepts include: reproducability of experiments discoverability of and access to primary data storage and curation of all artifacts that emerge throughout the research process The project is funded for 2 ½ years by the Ministry of Science, Research, and Art, Baden-Württemberg Page 17
BW-eLabs: Collecting Data throughout the Research Process Preparation Synthesis and Online Analysis Offline Analysis and Visualisization Publication Common Data Infrastructure Experiment Experiment Experiment Experiment Data Objects Data Objects Data Objects Page 18
Instruments in the Nano Laboratory Absorption Spectrometer Autosampler Laboratory Microwave Photoluminiscence Spectrometer Source: S. Einwächter, M.Krüger, 2010. Syringe Pump Lab Computer Page 19
More About the Research Question http://youtu.be/gwq0r-bureu?hd=1 Page 20
BW-eLabs: Data Acquisition in the Lab Laboratory 5 Office 1 9 Data files from online analysis esync Daemon 4 elab Solution Instrument (e.g. spectroscope) Data Center Metadata Extractor 6 Data file for metadata extraction Monitored folder Replicated data files Deposit Service POSTs new configuration to esync daemon 2 Creates new experiment (Container) Infrastructure 3 7 MD MD Extracted metadata escidoc Items 8 Page 21
Start of Data Acquisition Page 22
BW-eLabs: Metadata Model Institution Investigator Study Instrument Rig Investigation Dataset Sample Datafile Derived from B. Matthews et al. Using a Core Scientific Metadata Model In Large-Scale Facilities. 5th International Digital Curation Conference. London, 2009. Related Datafile Page 23
elab Solution Page 24
Fragen? matthias.razum@fiz-karlsruhe.de www.escidoc.org