DATA MANAGEMENT PLAN IN THE REAL LIFE SCIENCES


Yvan Le Bras, Cyril Monjeaud, Olivier Collin, Jacques Nicolas
CNRS UMR 6074 IRISA-INRIA

Context
- Now: genomics (next-generation sequencing)
- Now: proteomics
- Next: bio-imaging
- Digital data: huge amounts, heterogeneous
- Critical situation for some laboratories
Kahn. On the future of genomic data. Science (2011) vol. 331 (6018) pp. 728-9

Context
- Exchange from one domain to another
- From ICT / IT to scientific domains
- Between scientific domains
- Life science integrators
- e-science integrators

E-BIOGENOUEST
From the e-Biogenouest project to the first French e-science center: CeSGO

E-Biogenouest
- Started in May 2012 for 3 years
- Funded by Brittany and Pays de la Loire
- E-science initiative for the Biogenouest network
- Test an e-science approach
  - More than 120 scientists trained
  - More than 200 users
  - 7 submitted publications
  - 1669 meetings ;)
  - Domains: Health, Agro, Environment, IT
  - An innovative VRE concept
- Roadmap preparation
  - UEB C@mpus
  - CPER
  - FRM
  - INCa
  - H2020
  - Mission interdisciplinarité CNRS
  - PIA
  - IFB
  - Fce Génomique
  - Rapsodyn
  - Citizen science

VRE: a tool for e-science applications
Virtual Research Environment
- Data
- User
- Web portal
- Collaboration software
- Community
- Processing resources

An innovative VRE approach
- Research lifecycle
- Open source solutions
- Mutualise
- Don't reinvent the wheel
- Win-win
- Break down silos
http://www.jisc.ac.uk/whatwedo/campaigns/res3/jischelp.aspx#simulate

Continuum
- HUBzero: collaborative environment (community, collaboration)
- EMME and Galaxy: data management & analysis

HUBzero: scientific collaborative platform
ebgo HUB: HUBzero to share knowledge and manage groups and projects
Information
- 218 users
- 111 projects
- 53 groups
- 729 resources
- More than 400 unique users per month
Purdue University. M. McLennan, R. Kennell. Comput Sci Eng, 12:48-53, 2010.

ISAtools: experimental data management
EMME: ISAtools suite to store data & metadata
Features
- Based on biomedical ontologies
- Bridge between existing biomedical standards
- Formatting for publication submission
- Pydio to upload data
- Biological investigation repository (data + metadata)
Oxford e-Research Centre. P. Rocca-Serra et al. Bioinformatics, 26(18):2354-6, 2010.

Galaxy: data analysis web platform
GALAXY by GenOuest: to analyse & share data as processes and tools
Information
- 34917 jobs
- 150 users
- More than 800 tools
Share
- Data
- Histories
- Workflows
- Tools
Penn State University. J. Goecks, A. Nekrutenko, J. Taylor, et al. Genome Biol, 11(8):R86, 2010.

Pydio: file sharing platform
Pydio by GenOuest: to store & share data as links
Information
- Galaxy workspace
- EMME workspace
- INCa workspace
Share
- Data via URI
- Control
- Safety
- Privacy
Abstrium SAS. Charles du Jeu, David Gillard et al.

What are our goals?
For society
- Open science and open data
For end users (scientific communities)
- Data management plan
- Preserve, access, share & visualise (data & analytics processes)
- Help for project management
For ICT
- Facilitate the use of tools
- Research service
- Accelerate the switch from development to production
- Optimise infrastructure use (storage, computing & network)
- Infrastructure for data / infrastructure of data

DMP ON THE LINE
From data storage to publication

CeSGO: Data storage
- URL generation

Metadata management
- Configuration
- ISAcreator: GenomeSpace
- ISAcreator: local
- ISAcreator: choose a configuration
- ISAcreator: existing ISA-Tab
- ISAcreator: Investigation
- ISAcreator: Study
- ISAcreator: Study 1
- ISAcreator: Assay 1
- ISAcreator: Assay 1 / Data
- ISAcreator: create an ISArchive
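
The ISAcreator steps above end with an ISArchive that bundles the investigation, study and assay tables. As a rough illustration only, the sketch below packs three tab-delimited files into a zip archive. The file prefixes (i_, s_, a_) follow the ISA-Tab naming convention; the function name `build_isarchive`, the reduced column set and the sample values are assumptions made for this example and are not ISAcreator's actual output.

```python
import csv
import io
import zipfile


def build_isarchive(study_id, samples):
    """Return the bytes of a zip archive holding three tab-delimited tables.

    Hypothetical helper: the columns are a simplified subset of ISA-Tab,
    not the full specification.
    """
    def tsv(rows):
        # Serialise rows as tab-separated text with Unix line endings.
        buffer = io.StringIO()
        csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
        return buffer.getvalue()

    files = {
        # Investigation file: points to the study table.
        "i_investigation.txt": tsv([
            ["Study Identifier", study_id],
            ["Study File Name", f"s_{study_id}.txt"],
        ]),
        # Study file: which sample comes from which source.
        f"s_{study_id}.txt": tsv(
            [["Source Name", "Sample Name"]]
            + [[s["source"], s["name"]] for s in samples]
        ),
        # Assay file: which raw data file belongs to which sample.
        f"a_{study_id}_assay1.txt": tsv(
            [["Sample Name", "Raw Data File"]]
            + [[s["name"], s["raw_file"]] for s in samples]
        ),
    }
    archive = io.BytesIO()
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, content in files.items():
            zf.writestr(name, content)
    return archive.getvalue()
```

Calling `build_isarchive("study1", [{"source": "plant1", "name": "leaf1", "raw_file": "leaf1.fastq"}])` returns bytes that can be written to disk as a .zip file. Keeping metadata and data references in one archive is what lets a later step hand both to an analysis platform at once.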

Data analysis
Metadata & data analysis: Galaxy
- Import ISArchive
- Extract ISArchive
- Download data
- Download raw data
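
The "Extract ISArchive" and "Download raw data" steps can be pictured with a short sketch. It is not the Galaxy tool shown on the slides: it only assumes that the archive is a zip file whose assay tables start with the ISA-Tab prefix a_ and carry a "Raw Data File" column. The function name `list_raw_data_files` is hypothetical.

```python
import csv
import io
import zipfile


def list_raw_data_files(archive_bytes):
    """Return the 'Raw Data File' values found in the assay tables of an archive."""
    raw_files = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        for name in sorted(zf.namelist()):
            # Look at the file name only, so tables inside sub-folders are found too.
            basename = name.rsplit("/", 1)[-1]
            if not (basename.startswith("a_") and basename.endswith(".txt")):
                continue
            text = zf.read(name).decode("utf-8")
            for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
                value = row.get("Raw Data File")
                if value:
                    raw_files.append(value)
    return raw_files
```

The returned names are the files an analysis platform would then need to fetch, for example through the URLs generated at the storage step.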

Metadata repository: Bii
- 1 study
- Data via URL / Protocols

CeSGO & DMP
Administrative data
- Project name
- Project description
- Name / ID of the project lead
- Funding agency
- DMP version
- Data policy applied
Responsibilities and resources
Data collection / creation
- Dataset description
- Protocol
- Method
- Equipment
- Quality assurance applied
Documentation and metadata
- Repository: Bii
- Metadata standard: ISA-TAB

CeSGO & DMP
Data storage, backup and security
- CeSGO datacenter for the duration of the project (max: 5 years)
Ethics and legal framework
- Protection of sensitive or personal data
- CC version 4.0
Data sharing
- Open or restricted access
- Delay: 3 years max after collection
- Repositories (GEO, GenBank, SRA, UniProt, PRIDE, ...)
- Tools needed to reuse / validate the data
- Data paper
Data selection and archiving
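
The DMP items listed on these two slides could be held as a structured record and checked automatically. The sketch below is one possible encoding, not a CeSGO format: the class name, field names and the `problems` method are assumptions, while the default values (Bii, ISA-TAB, 5 years of storage, 3 years before sharing, CC 4.0) come from the slides.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DataManagementPlan:
    # Administrative data (hypothetical field names).
    project_name: str = ""
    project_description: str = ""
    lead_id: str = ""
    funding_agency: str = ""
    dmp_version: str = ""
    # Values stated on the slides.
    metadata_repository: str = "Bii"
    metadata_standard: str = "ISA-TAB"
    storage_years: int = 5
    sharing_delay_years: int = 3
    licence: str = "CC 4.0"
    target_repositories: list = field(default_factory=list)

    def problems(self):
        """List empty fields and values beyond the limits given on the slides."""
        issues = [
            f"missing {f.name}"
            for f in fields(self)
            if getattr(self, f.name) in ("", [])
        ]
        if self.storage_years > 5:
            issues.append("storage exceeds 5 years")
        if self.sharing_delay_years > 3:
            issues.append("sharing delay exceeds 3 years")
        return issues
```

A plan with every field filled in and limits respected yields an empty list from `problems()`, which makes the check easy to run before a project is registered.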

CESGO: 5 GOALS
From Data Management to Accessibility

CeSGO: Western France e-science
- Data management: metadata, URI, life sciences protocols
- Open Data: new VREs
- Linked Data: new VREs connected using semantic web approaches, thanks to DOI attribution
- Reproducibility: cloud, Galaxy, versioning, Docker
- Accessibility: wiki, analytics processes, public resources, experiments, publications

Thank you for your attention
The GenOuest bioinformatics platform
The Symbiose group, IRISA/INRIA: GenOuest, Dyliss, Genscale
- ebgo HUB (collaboration): http://www.e-biogenouest.org/
- Scitizen portal (citizen science): http://scitizen.genouest.org
- EMME portal (data management): http://emme.genouest.org/
- Galaxy instance (data analysis): http://galaxy.genouest.org/
- GO4Bioinformatics (education): https://www.e-biogenouest.org/einfrastructure/education
