The IFB cloud and its Galaxy instance

Transcription:

The IFB cloud and its Galaxy instance
Christophe BLANCHET
Institut Français de Bioinformatique (IFB), French Institute of Bioinformatics, ELIXIR-FR
CNRS UMS3601, Gif-sur-Yvette, France
Ecole Bioinformatique Aviesan, 28 September 2015, Roscoff

Experimental data in life sciences (FR)
French national platforms (GIS IBISA), number per field: cellular imaging 19; genomics and transcriptomics 16; proteomics 13; structural biology and biophysics 11.
[Map: French NGS platforms. Source: omicsmaps.com]
[Diagram: biological platforms (NGS genomics, IMG imaging, PRO proteomics), bioinformatics centers (BIC), cloud resources and scientists]
Regional centers distribute the computing and storage load and provide closer interaction with scientists.
Reference: Blanchet C. and Collin O., 2011, "Un déluge de données", Biofutur, 323: 64-67.

A lot of bioinformatics tools
Tool cloud: BLAST, FastA, OMSSA, ClustalW2, SSearch, PeptideShaker, ARIA, BWA, X!tandem, HMMer, TopHat, samtools, Galaxy, Clustal Omega, Muscle, fastqc, R.
Tools and versions: ABYSS 1.3.4, ARIA 2.3, Bioconductor 2.11, biomaj, BLAST+ 2.2.27, Blat 35, Bowtie 0.12.8, Bowtie2 2.0.0-beta7, BWA 0.6.2, BWA 0.7.10, CAP3, CD-HIT 4.6.1, Clustal Omega 1.0.3, CLUSTALW 2.1, Cufflinks 2.0.2, Cutadapt 1.2.1, E-SURGE 1.9.0, Exonerate 2.2.0, express 1.5.1, FastA 3.6, FastQC 0.10.1, Galaxy portal, GATK 2.3.4, HMMer 3.0, ImageJ 1.48, khmer 1.1, M-SURGE 1.8.5, MEME 4.7, MMSEQ 0.11.2a, Mobyle, MODAL, MultAlin 5.4.1, MUSCLE 3.8.31, neo4j, Oases 0.2.08, OMSSA 2.1.9, PeptideShaker 0.18.3, phyml 3.1, PREDATOR 2.1.2, proline, python 2.7, R 2.13, R 3.1.1, R 3.1.2, R-studio, Ray 1.3, RSAT, samtools 0.1.18, Samtools 1.1, SearchGUI 1.10.4, SeqClean, Shiny, Stacks, STAR 2.4.0f1, SuMo v1, TGICL, TopHat 2.0.6, trim_galore 0.3.7, Trinity 2.0.4, U-CARE 2.3.2, VCFtools 0.1.11, Velvet 1.2.10, X!tandem 12-10-01-1, XPLOR-NIH 2.30.

Many interfaces

The French Institute of Bioinformatics and its e-infrastructure

IFB - Institut Français de Bioinformatique
IFB, the French distributed infrastructure for life-science information
Mission: to make core bioinformatics resources available to the national and international life science research community.
To provide support for national biology programs.
To provide an IT infrastructure devoted to the management and analysis of biological data.
To act as a middleman between the life science community and the bioinformatics / computer science research community.
http://www.france-bioinformatique.fr
CNRS UMS3601, Avenue de la Terrasse, Bât 21, 91190 Gif-sur-Yvette
ELIXIR French Node
ELIXIR, the European distributed infrastructure for life-science information
To optimize the interactions and coordination between the national level and ELIXIR and other ESFRI infrastructures in the biomedical and environmental fields.
To promote consistency and complementarity between the components offered by the ELIXIR French node and those of other European nodes.

IFB e-infrastructure
Mission: to provide core bioinformatics resources to the life science research community.
To set up a French IT infrastructure (cloud) devoted to the management and analysis of biological data.
To provide hardware, data collections and bioinformatics tools.
To collaborate with international infrastructures (ELIXIR).
Current resources
A national hub: IFB-core, with IT resources hosted at the CNRS IDRIS supercomputing center.
A network of regional centers: 32 bioinformatics platforms, 15,000 cores, 5 PB.
Two running clouds: IFB-core and GenOuest.
Goal: create a federation of clouds for life sciences.

Virtualization
[Diagram: virtual machines 1 to N, each with its own applications and operating system, running on a hypervisor on top of the hardware of one physical server (processor, RAM, storage, network)]

IFB-core's cloud
Phase     Compute cores   Storage (TB)   RAM (TB)   Max VM size       Technology   Location
Pilot     200             50             2          40c, 256 GB       StratusLab   CNRS-IDRIS, Paris
2016-S1   3,000           500            -          144c, 3 TB (?)    StratusLab   CNRS-IDRIS, Paris
2017      10,000          2,000          -          ?                 StratusLab   CNRS-IDRIS, Paris
Usage: NGS, imaging, statistics. Service models: SaaS, PaaS, IaaS.
[Architecture diagram: scientists connect over RENATER (10 Gbit/s) to the web portal and launch jobs; frontend, master, virtualization layer, shared file system and workers; Pdisk storage over iSCSI on 10 Gbit/s Ethernet; hosted at the CNRS IDRIS supercomputing center]
Cloud hypervisors: standard nodes with 32 cores and 128 GB; bigmem nodes with 40 cores and 256 GB.
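As a rough worked example of the hypervisor node sizes quoted on this slide, the sketch below counts how many identical VMs fit on one node when neither cores nor memory are overcommitted. The 2-core, 8 GB VM shape is only an illustrative choice, and the names `Shape` and `vms_per_node` are hypothetical; actual placement also depends on hypervisor overhead and scheduling policy, which the slides do not describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    """A (cores, RAM) pair, used for both hypervisor nodes and VMs."""
    cores: int
    ram_gb: int


# Node sizes taken from the slide.
STD_NODE = Shape(cores=32, ram_gb=128)
BIGMEM_NODE = Shape(cores=40, ram_gb=256)


def vms_per_node(node: Shape, vm: Shape) -> int:
    """Number of identical VMs that fit on one node without overcommit.

    Assumption: the whole node is available to VMs (no hypervisor reserve).
    """
    if vm.cores <= 0 or vm.ram_gb <= 0:
        raise ValueError("VM shape must have positive cores and RAM")
    # The scarcer of the two resources limits the count.
    return min(node.cores // vm.cores, node.ram_gb // vm.ram_gb)


small_vm = Shape(cores=2, ram_gb=8)  # illustrative VM shape
print(vms_per_node(STD_NODE, small_vm))     # 16
print(vms_per_node(BIGMEM_NODE, small_vm))  # 20
```

On the bigmem node the core count is the limiting resource (20 VMs by cores against 32 by memory), which is why such nodes are better suited to a few memory-hungry VMs than to many small ones.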

Provide scientists with bioinformatics resources (data and tools) as cloud appliances

Create bioinformatics appliances
[Diagram: VM 1 to VM n, each with an application, an OS and virtual hardware, running on a hypervisor over the physical hardware]
Create new cloud services: existing tools (BLAST, FastA, OMSSA, ClustalW2, SSearch, PeptideShaker, ARIA, BWA, X!tandem, HMMer, TopHat, samtools, Galaxy, Clustal Omega, Muscle, fastqc, R) plus a Linux system are packaged as virtual machines and published in the bioinformatics marketplace.
Appliance: a predefined virtual machine including tools, pipelines and recipes, ready to run.
Appliance annotation: Title; Description; Topics and Tools (with controlled vocabulary); Contact: developer(s) and maintainer(s).
Marketplace categories: Structures, Sequences, Proteomics, Galaxy, ...
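The appliance annotation fields listed on this slide map naturally onto a small record type. The sketch below is one possible representation, not the marketplace's actual schema: the class `Appliance`, the helper `find_by_tool` and both catalogue entries are hypothetical. The tool versions reuse numbers from the tool list earlier in the talk, but their grouping into appliances is invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Appliance:
    """Annotation of one appliance, following the fields named on the slide."""
    title: str
    description: str
    topics: tuple[str, ...]            # e.g. marketplace categories
    tools: dict[str, str] = field(default_factory=dict)  # tool name -> version
    contact: str = ""                  # developer(s) and maintainer(s)


def find_by_tool(catalogue: list[Appliance], tool: str) -> list[tuple[str, str]]:
    """Return (appliance title, tool version) for every appliance providing `tool`."""
    wanted = tool.lower()
    return [
        (appliance.title, version)
        for appliance in catalogue
        for name, version in appliance.tools.items()
        if name.lower() == wanted  # tool names are compared case-insensitively
    ]


# Hypothetical catalogue entries, for illustration only.
catalogue = [
    Appliance(
        title="Example NGS appliance",
        description="Hypothetical read-mapping environment",
        topics=("Sequences",),
        tools={"BWA": "0.7.10", "samtools": "1.1"},
        contact="maintainer@example.org",
    ),
    Appliance(
        title="Example proteomics appliance",
        description="Hypothetical peptide identification environment",
        topics=("Proteomics",),
        tools={"OMSSA": "2.1.9", "PeptideShaker": "0.18.3"},
        contact="maintainer@example.org",
    ),
]

print(find_by_tool(catalogue, "Samtools"))
```

A lookup of this kind answers the question a user typically brings to the marketplace: which appliance provides a given tool, and in which version.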

IFB's bioinformatics appliances
[Diagram of appliances grouped by category. Category labels: Remote desktop, Web, Scientific apps, CLI, Utilities, Data mgmt, Base OS]
Appliances shown: Proteomics, Imaging, Galaxy, MODAL, Eco Pop Galaxy, Galaxy RADseq, Galaxy AVIESAN 2015, BioDataCloud, IGV, RSAT, PhyML, MacSyFinder, SynBioWatch, R statistics, biocompute node, Aria, biohadoop, biodata BioMaj, BlobSeer, biodata NFS, Cassandra, Neo4j, Docker, CentOS, Ubuntu.

IFB's cloud for Bioinformatics
[Diagram: public data sources (EMBL, PDB, Genomes, UNIPROT, PROSITE) are fetched with BioMAJ into reference datasets on a common share available to the VMs; users (j. doe, e. martin, chb, cg, you) are authorized with their cloud credentials and get personalized interfaces, VMs and virtual disks holding their user data]

A cloud driven through a web dashboard
http://cloud.france-bioinformatique.fr/cloud

Browse the marketplace and run an App!
IFB's bioinformatics marketplace: Proteomics, Sequences, Galaxy, Structures, ...

Hands-on practice

Connecting to the IFB cloud
http://cloud.france-bioinformatique.fr/cloud
Log in to the IFB cloud via Sign in.

Connecting to the IFB cloud (2)
On first login, complete your parameters in the Settings section.
SSH key: paste the content of the file ~/.ssh/dsa.pub, and watch out for line breaks introduced by copy-paste.
Create the key with ssh-keygen (or PuTTYgen).
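Because a pasted public key is easily broken by stray line breaks, it can help to clean it up before submitting it. The sketch below is a local helper, not part of the IFB dashboard: the function name `normalize_public_key` is hypothetical. It assumes the usual one-line OpenSSH public key layout (`<type> <base64 blob> [comment]`) and checks that the key type embedded at the start of the decoded blob matches the declared one.

```python
import base64
import binascii
import struct


def normalize_public_key(pasted: str) -> str:
    """Return the key on a single line, or raise ValueError if it looks damaged."""
    # Drop the line breaks that copy-paste may have inserted inside the key.
    one_line = "".join(pasted.splitlines()).strip()
    fields = one_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<type> <base64 blob> [comment]'")
    key_type, blob_b64 = fields[0], fields[1]

    try:
        blob = base64.b64decode(blob_b64, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("key body is not valid base64") from exc

    # The blob starts with a 4-byte big-endian length followed by the key type.
    if len(blob) < 4:
        raise ValueError("key body is too short")
    (name_len,) = struct.unpack(">I", blob[:4])
    embedded_type = blob[4:4 + name_len].decode("ascii", errors="replace")
    if embedded_type != key_type:
        raise ValueError(
            f"declared type {key_type!r} does not match embedded type {embedded_type!r}"
        )
    return " ".join(fields)
```

For instance, `normalize_public_key(open(path).read())` on a key file returns a single line that can be pasted safely; a key truncated or corrupted by the clipboard will usually fail the base64 or type check instead of being silently accepted.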

The IFB cloud dashboard
Manage your VMs: create / stop / rename.
Manage your virtual disks: create / delete.
View the parameters: name / type / size, state / CPU load, attached virtual disk.

Create a virtual machine
IFB's Marketplace: Proteomics, Sequences, Galaxy, Structures, ...

Choose the right appliance

Characteristics of a VM (also called an instance)
Define your VM: a name, the number of CPUs, the memory size, and a virtual disk to attach.
Cluster of VMs: fill in the number of VMs and choose a unique name.

Virtual hard disks
To store your data: variable size and number (within your quota); your data follows you from one VM to the next.
Warning: no backup!
Using a vdisk: attach it to a (single) VM when the VM is created.
Sharing a vdisk: cluster mode, VM NFS.
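The vdisk rules on this slide (one disk attached to a single VM, at VM creation, with the data surviving from one VM to the next) can be pictured with a toy model. This is only an illustration of the rule, not the cloud's API: the classes `VDisk` and `VM` and the functions `create_vm` and `delete_vm` are hypothetical, and the model assumes that deleting a VM releases its disk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VDisk:
    name: str
    size_gb: int
    attached_to: Optional[str] = None  # name of the VM holding the disk, if any


@dataclass
class VM:
    name: str
    cpus: int
    ram_gb: int
    vdisk: Optional[VDisk] = None


def create_vm(name: str, cpus: int, ram_gb: int, vdisk: Optional[VDisk] = None) -> VM:
    """Create a VM; the vdisk, if any, must be attached at creation time."""
    if vdisk is not None:
        if vdisk.attached_to is not None:
            raise ValueError(
                f"vdisk {vdisk.name!r} is already attached to {vdisk.attached_to!r}"
            )
        vdisk.attached_to = name
    return VM(name=name, cpus=cpus, ram_gb=ram_gb, vdisk=vdisk)


def delete_vm(vm: VM) -> None:
    """Delete a VM and release its vdisk, whose data stays available (assumption)."""
    if vm.vdisk is not None:
        vm.vdisk.attached_to = None
        vm.vdisk = None


disk = VDisk(name="monddgalaxy", size_gb=10)
first = create_vm("galaxy_roscoff_2015", cpus=2, ram_gb=8, vdisk=disk)
delete_vm(first)
# Once released, the same disk can follow the user to the next VM.
second = create_vm("galaxy_roscoff_2015_bis", cpus=2, ram_gb=8, vdisk=disk)
print(disk.attached_to)  # galaxy_roscoff_2015_bis
```

Attempting to create a second VM with a disk that is still attached raises `ValueError`, which mirrors the single-VM restriction; sharing data between several VMs goes through the cluster mode with NFS instead.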

Hands-on practice
From the cloud dashboard: http://cloud.france-bioinformatique.fr/cloud/
Create your virtual disk with the New vdisk button: name monddgalaxy, size 10 GB.

Hands-on practice
From the cloud dashboard: http://cloud.france-bioinformatique.fr/cloud/
With the New instance button, identify which appliance provides the deeptools tool. Which version of the tool?
Create an instance of EBA15 Galaxy ChIP-seq: name galaxy_roscoff_2015; size c3.medium (2 CPU, 8 GB RAM); attach your virtual disk monddgalaxy.

Monitor your usage

Questions?
Acknowledgments
IFB members. IFB hub: Patricia, Jean-François, Mohamed, Jonathan, Maxime, Dominique. Alumni: Marie, Quentin. We are hiring!
Working group IFB-GRISBI (co-chair with Olivier Collin)
Appliance developers: Samuel Blanck (Inria Lille), Jacques van Helden (TAGC), Stéphane Delmotte (PRABI-Doua), Bruno Spataro (PRABI-Doua), Marie-Laure Franchinard (MIGALE), Anis Djari (BioinfoGenoToul), Bertrand Néron (Institut Pasteur), Adrien Josso (MicroScope), Thomas Lacroix (MIGALE), Christian Baudet (CLB), Germain Paimparay and Baptiste Brault (CFB)
CNRS IDRIS: R. Medeiros, C. Gauthey and staff
StratusLab members
IFB is funded by the French PIA programs INBS 2012 and BioDataCloud, and by the EU H2020 projects CYCLONE (644925) and EGI-Engage (654142).
http://www.france-bioinformatique.fr