Data Management Experiences and Best Practices from the Perspective of a Plant Research Institute

Similar documents
Management von Forschungsprimärdaten und DOI Registrierung. Dr. Matthias Lange (Bioinformatics & Information Technology) June 19 th, 2013

A Laboratory Information. Management System for the Molecular Biology Lab

Matthias Lange. gatersleben.de Bioinformatics Progress Seminar, May 08, BI Progress 05/07/2008 M. Lange: Data IPK

Leibniz DAAD Fellowships 2006 / (The description was drawn up by the hosting Leibniz - Institute) Fellowship - No. 1 & 2

Big data in cancer research : DNA sequencing and personalised medicine

Annex 6: Nucleotide Sequence Information System BEETLE. Biological and Ecological Evaluation towards Long-Term Effects

The National Plant Genome Initiative

It s our people that make the difference

Development and application of molecular and bioinformatic tools for the genetic monitoring of beets

IT BACKGROUND OF THE MEDIUM-TERM STORAGE OF MARTONVÁSÁR CEREAL GENEBANK RESOURCES IN PHYTOTRON COLD ROOMS

Study plan for the Master's degree course Biodiversität/Biodiversity

Study in Chinese Academy of Agricultural Sciences (CAAS)

An example of bioinformatics application on plant breeding projects in Rijk Zwaan

A Degree in Science is:

Delivering the power of the world s most successful genomics platform

E-Learning A new tool for the education of young scientists in the humane treatment of experimental animals: A contribution to the 3R

Research Roadmap for the Future. National Grape and Wine Initiative March 2013

How To Develop A Genomics Programme

Environmental Research and Innovation ( ERIN )

Introduction course leader and module leaders

Kazan (Volga region) Federal University, Kazan, Russia Institute of Fundamental Medicine and Biology. Master s program.

Our Competence to Your Business MITA. All rights reserved.`````

Graduate Degree Granting Units and Programs at Michigan State University Accounting - Master of Science Advertising - Master of Arts African American

Project GRACE Preliminary Author. List:

Intranet Solutions for PG School IARI. Project Brief

Adding Clinical Services to a Research Core Lab at the UW-Madison. Josh Hyman, Director UWBC DNA Sequencing Facility

Big Data: Challenges and Opportunities

Global Human Resource Programs Development in ASEAN

Building Bioinformatics Capacity in Africa. Nicky Mulder CBIO Group, UCT

Finding Your Future. ibio Institute Guide to Life Sciences Careers.

PODD. An Ontology Driven Architecture for Extensible Phenomics Data Management

Automated Library Preparation for Next-Generation Sequencing

A systems-approach to understand the regulation of plant growth and biomass production in bioenergy crops

MEETING AGRICULTURAL CHALLENGES IN A CHANGING WORLD: BIOTECHNOLOGY AND BIOINFORMATICS January 5, 2015 March 5, 2015

Teaching Computational Thinking using Cloud Computing: By A/P Tan Tin Wee

SowiDataNet. Bringing Social and Economic Research Data Together

PRACTICAL APPROACH TO ECOTOXICOGENOMICS

SUBJECT-SPECIFIC CRITERIA. Relating to the accreditation of Bachelor s and Master s degree programmes in life sciences (09 December 2011)

DataSafe Solutions. Protect your valuable genomic data

Mining integrated data sources using LAILAPS

From QMS ideal to performance reality - a hybrid performance management approach for genebanks

Undergraduate Program in Molecular Biology LAST REVISED October 2009

1. Program Title Master of Science Program in Biochemistry (International Program)

Membership Application Form Business and Biodiversity Initiative Biodiversity in Good Company

EUROPASS DIPLOMA SUPPLEMENT

Technische Universität Graz (TU Graz) Graz University of Technology

MSc/Postgraduate Diploma in Food Technology Quality Assurance For students entering in 2006

LABWORKS Laboratory Information Management System. be confident. be informed. take control.

irods in complying with Public Research Policy

Core Bioinformatics. Degree Type Year Semester Bioinformàtica/Bioinformatics OB 0 1

Academic Offerings. Agriculture

FIRST PROFESSIONAL SCIENCE MASTER S GRADUATES ARE FINDING JOBS. CONTACT: Diana Yates, Life Sciences Editor ; diya@illinois.

Copyright Soleran, Inc. esalestrack On-Demand CRM. Trademarks and all rights reserved. esalestrack is a Soleran product Privacy Statement

Careers in Computing and Science

Degrees Offered with Enrollment and Degrees Awarded All plans, programs, and degrees

A Strategy for Plant Breeding Data Management in International Agricultural Research

Making Data Citable The German DOI Registration Agency for Social and Economic Data: da ra

Introduction. Fujian Agriculture and Forestry University

BIOLOGICAL SCIENCES. What can I do with this degree? EMPLOYERS

DOE Office of Biological & Environmental Research: Biofuels Strategic Plan

A leader in the development and application of information technology to prevent and treat disease.

Introductory Biotechnology for High School Teachers - UNC-Wilmington

NI_360_301_A - Subpart A - GS-457, Soil Conservation Qualification Determinations. Subpart A - GS-457, Soil Conservation Qualification Determinations

Cooperation in Horticulture

Using Web Services for Customised Data Entry

Zurich-Basel Plant Science Center. PhD Program in Plant Sciences. Plant Sciences

Leading Genomics. Diagnostic. Discove. Collab. harma. Shanghai Cambridge, MA Reykjavik

All in a highly interactive, easy to use Windows environment.

Biotechnology Programs. Biotechnology Associate in Science Degree

Master's Degree Programme in Biotechnology (MBIOT)

BERKELEY CITY COLLEGE COLLEGE OF ALAMEDA LANEY COLLEGE MERRITT COLLEGE

Partnerships and Opportunities in Agricultural Research

Compendium to the National Technology Benchmarks - NTB 2010 TECHNICIAN

Biotechnology Careers

How To Become A Scientist

Cells. Cell Theory. plant cell. Cytoplasm and Organelles. animal cell

INFORMATION FOR UNDERGRADUATE STUDENTS majoring in PATHOBIOLOGY and VETERINARY SCIENCE

Masters related with Agrifood. Degrees related with Agrifood

General Services Administration Federal Supply Service Authorized Federal Supply Schedule Price List

Image Data, RDA and Practical Policies

FOR RISK ASSESSMENT FEDERAL INSTITUTE

Restructuring the Federal Governmental Research in Agriculture and Food in Germany

Some activities in JRC of relevance for Swedish scientists

MSc/ Postgraduate Diploma in Food Technology Quality Assurance For students entering in 2004

Title 16. Professional and Vocational Regulations Division 20. Veterinary Medical Board

Case Study Life Sciences Data

Biological Sciences B.S. Degree Program Requirements (Effective Fall, 2015)

Enable Location-based Services with a Tracking Framework

Curriculum Policy of the Graduate School of Agricultural Science, Graduate Program

Croatian health ecology specialists and inspectors keep harmful microbes and chemicals at bay

BIOLOGICAL SCIENCES. What can I do with this degree?

Determining the Use of Technology in World Food and Fiber Production

GRIN-Global Project. the global plant genebank information management system

Big Data Challenges. technology basics for data scientists. Spring Jordi Torres, UPC - BSC

#jenkinsconf. Jenkins as a Scientific Data and Image Processing Platform. Jenkins User Conference Boston #jenkinsconf

Agriculture, Food & Natural Resources Career Cluster Nursery and Landscape

Performance audit report. Workforce planning in Crown Research Institutes

ToxBank Requirements Analysis

The global need for sustainable food, fuel and fibre production

1. Programme title and designation Biochemistry. For undergraduate programmes only Single honours Joint Major/minor

Transcription:

DILS 2014, Lisbon, Portugal Daniel Arend Data Management Experiences & Best Practices 1/11 Data Management Experiences and Best Practices from the Perspective of a Plant Research Institute Daniel Arend, Christian Colmsee, Helmut Knüpffer, Markus Oppermann, Uwe Scholz, Danuta Schüler, Stephan Weise and Matthias Lange Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben July 17, 2013

DILS 2014, Lisbon, Portugal Daniel Arend Data Management Experiences & Best Practices 1/11 Outline 1 Data Domains & Storage Database and Information Systems 2 Data Management Study Strategic Realignment LIMS 3 Lessons Learned Some Impressions

Data Domains & Storage Database and Information Systems Leibniz Institute of Plant Genetics and Crop Plant Research over 70 years tradition Federal ex-situ Genebank of Agricultural & Horticultural Crop Species (150.000 accessions) source of the german breeding industry total staff: 550 scientists: 200 (10 Bioinformaticians) about 30 research groups Bioinformatic Research Topics databases & information retrieval sequence-, network- & image-analysis big data management crop plants have multiple purposes: food, feed or bioenergy source DILS 2014, Lisbon, Portugal Daniel Arend Data Management Experiences & Best Practices 1/11

Data Domains & Storage Database and Information Systems Data Domains at the life sciences produce a huge data volume (primary data) e.g. Next-Generation Sequencing, Plant Phenotyping, System Biology... basis for modern research & publication process1 1 Craddock et al. Nature Review Microbiology, 2008 DILS 2014, Lisbon, Portugal Daniel Arend Data Management Experiences & Best Practices 2/11

Data Domains & Storage Database and Information Systems Data Storage at the huge amount of heterogeneous data 80 million data files (230 TB) stored on a HSM over 600 different file types most of the files 1 MB file size up to 600 GB Distribution of file size other 43.31% Distribution of file media types application 20.65% audio/video 0.55% 1.8 10 7 1.6 10 7 1.4 10 7 file number 1.2 10 7 1 10 7 8 10 6 6 10 6 image 28.33% 4 10 6 2 10 6 text 7.16% 0 10 0 10 1 10 2 10 3 10 4 10 5 10 6 10 7 10 8 10 9 10 10 10 11 10 12 file size in bytes application: exe,xls,ppt,... audio/video: wav,mpg,avi,... image: bmp,jpg,png,... text: txt,html,rft,... other: zip,gzip,pdb,... DILS 2014, Lisbon, Portugal Daniel Arend Data Management Experiences & Best Practices 3/11

DILS 2014, Lisbon, Portugal Daniel Arend Data Management Experiences & Best Practices 4/11 IPK Database and Information Systems Data Domains & Storage Database and Information Systems over 20 different project-specific databases & information systems specialized on different research fields & data domains every system uses its own database schema & storage backend heterogeneous development & maintenance

DILS 2014, Lisbon, Portugal Daniel Arend Data Management Experiences & Best Practices 5/11 Data Management Study Data Management Study Strategic Realignment LIMS initiated by IPK directors board in 2010 requirement analysis for an institute-wide Laboratory Information Management System (LIMS) suggest general policies for sustainable data management

DILS 2014, Lisbon, Portugal Daniel Arend Data Management Experiences & Best Practices 6/11 Strategic Realignment Data Management Study Strategic Realignment LIMS

DILS 2014, Lisbon, Portugal Daniel Arend Data Management Experiences & Best Practices 7/11 Laboratory Information and Management System Data Management Study Strategic Realignment LIMS LIMSOPHY (AAC Infotray) selected commercial product no open source introduced at the IPK in 2011 long term support contract + flexible & expandable + additional external support + less training for scientists & technicians (different computer skills) + embedded into central storage backend - developing of new modules & user acceptance for integration of lab workflows - no out-of-the-box module for data publication ( see e!dal poster)

DILS 2014, Lisbon, Portugal Daniel Arend Data Management Experiences & Best Practices 8/11 Summary - Strategic Realignment Lessons Learned Some Impressions Organisational Actions build central bioinformatics service group with a scientific administration financed by research funds & institutional budget core-financed service team for LIMS inter-departmental coordination of bioinformatics research Infrastructure Actions central storage systems & databases combining in-house & public data publication systems

DILS 2014, Lisbon, Portugal Daniel Arend Data Management Experiences & Best Practices 9/11 Summary - Some Impressions Lessons Learned Some Impressions LIMS is used by over 40 project and 10 research groups 4.100 substances & 1.400 experiments stored 1.200.000 measurements & 400.000 linked files goverment verified GMO managment manage complete services processes, e.g. NGS sequencing different new web information systems e.g. management of chemicals, photo-archive... design new system within 2 days, without experiences in programming already 30 manually registrated DOIs over DataCite in future automatic process using e!dal

DILS 2014, Lisbon, Portugal Daniel Arend Data Management Experiences & Best Practices 10/11 Acknowledgment Bioinformatics and Information Technology (BIT): Danuta Schüler Matthias Lange Christian Colmsee Uwe Scholz Genbank Documentation: (GED): Stephan Weise Markus Oppermann Helmut Knüpffer This work was performed within the German-Plant-Phenotyping Network, which is funded by the German Federal Ministry of Education and Research (project identification number: 031A053)

DILS 2014, Lisbon, Portugal Daniel Arend Data Management Experiences & Best Practices 11/11 Thank you for your attention http://www.ipk-gatersleben.de