Bioinformatics on the Cloud: A Use Case with the Galaxy Portal


1 Bioinformatics on the Cloud: A Use Case with the Galaxy Portal
Christophe Blanchet, Institute of Biology and Chemistry of Proteins, Head of Service, Infrastructure for Biology - IDB, CNRS-IBCP FR, Lyon, France
IDB acknowledges co-funding by the European Community's Seventh Framework Programme (INFSO-RI ), the French National Research Agency's Arpege Programme (ANR-10-SEGI-001) and the French Institute for Bioinformatics (IFB-RENABI).

2 Bioinformatics Today
Biological data are big data
- 1512 online databases (NAR Database Issue 2013)
- Sanger Institute, UK: 5 PB
- Beijing Genome Institute, China: 5 sites, 12.6 PB
- Big data in many places
Analysing such data has become difficult
- Scale-up of the analyses: from a gene/protein to a complete genome/proteome, ...
- Many different tools in daily use
- These tools need to be combined in workflows
- Usual interfaces: portals, Web services, federation, ...
Datacenters with ease of access and use
- Distributed resources
- Experimental platforms: NGS, imaging, ...
- Bioinformatics platforms
- Federation of datacenters

3 IDB Cloud and Bioinformatics Appliances
Cloud workbench for Biology
- Running since Sept
- CNRS-IBCP FR3302, Lyon, France
- Open to the Biology community
- 14 bioinformatics appliances: Galaxy portal, standard compute nodes, proteomics, virtual desktop, structural biology, ...
- Users from all IFB regional centers: PRABI 15, APLIBIO 14, RENABI-NE 8, RENABI-SO 2, RENABI-GS 1, RENABI-GO 1
- VMs up to 32 cores and 768 GB RAM
Infrastructure (powered by StratusLab)
- Compute: +900 cores, +4 TB RAM
- Standard nodes (32 cores, 128 GB)
- Bigmem nodes (64 cores, 768 GB)
- Storage: +250 TB; virtual disks, object storage (S3)
[Figure: IDB Cloud. Labels: user data; public data (UNIPROT, EMBL, Genomes, PDB, PROSITE); tools (BLAST, FastA, SSearch, ClustalW2, Clustal Omega, Muscle, HMMer, BWA, TopHat, samtools, fastqc, OMSSA, X!Tandem, PeptideShaker, ARIA, Galaxy, R + Linux system); virtual machines; Bioinformatics Marketplace (Structures, Galaxy, Sequences, Proteomics); create new cloud services.]

4 Cloud extended services
Native cloud services
- Authentication
- Virtual machine management
- Persistent disk service
- Client CLI, etc.
IDB Bioinformatics Marketplace
- Find appropriate appliances more easily
- Reduce noise in the central Marketplace
- Respect visibility constraints for the bioinformatics appliances, such as confidentiality
Bioinformatics metadata: bio:tool
- Additional elements related to bioinformatics tools, used to annotate appliances
- Help users search for the tools themselves or for the type of analysis
- Select suitable bioinformatics appliances containing the required tools
Integrated Web interface
- VM and virtual disk management
- Browse bioinformatics appliances with bio:tool metadata
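The tool-based appliance search described on this slide can be pictured as a filter over appliance annotations. The following is only a minimal sketch under stated assumptions: the `Appliance` record, the `CATALOGUE` entries and the `find_appliances` helper are hypothetical illustrations and do not reflect the actual Marketplace schema or the real bio:tool format. Tool names are taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Appliance:
    """Hypothetical appliance record; `tools` stands in for bio:tool annotations."""
    name: str
    tools: frozenset = field(default_factory=frozenset)


# Illustrative catalogue (assumption), built from tool names mentioned in the slides.
CATALOGUE = [
    Appliance("biocompute", frozenset({"BLAST", "FastA", "SSearch", "ClustalW2", "BWA", "samtools"})),
    Appliance("proteomics-desktop", frozenset({"OMSSA", "X!Tandem", "PeptideShaker"})),
    Appliance("structural-biology", frozenset({"ARIA"})),
]


def find_appliances(required_tools, catalogue=CATALOGUE):
    """Return the names of appliances that contain every required tool (case-insensitive)."""
    wanted = {tool.lower() for tool in required_tools}
    return [
        appliance.name
        for appliance in catalogue
        if wanted <= {tool.lower() for tool in appliance.tools}
    ]
```

A user looking for an appliance with both BLAST and samtools would call `find_appliances(["BLAST", "samtools"])`; a request that no single appliance satisfies returns an empty list.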

5 Driven through a simple web interface

6 Run your Bioinformatics Cloud Instances
[Figure: launching instances from the Bioinformatics Marketplace on IBCP's Cloud Resources. Labels: Bioinformatics Marketplace (Sequence, Structure, NGS, Galaxy, ARIA); Launch Instances; PaaS: Portal (BLAST, Clustal, etc.); IaaS: launch jobs over ssh; Master & Storage VM; Shared FS; Workers VM (ARIA, CNS).]

7 Biological Data in Cloud
- Upload your data: sftp/http/s3
- Public data sources (Genomes, EMBL, PDB, UNIPROT, PROSITE): shared (NFS)
- User persistent data: pdisk (iSCSI)
- Get your results: sftp/http/s3
[Figure: data flows in the Bioinformatics Cloud. Labels: PaaS: Portal (BLAST, Clustal, etc.); IaaS: launch jobs over ssh; Master & Storage VM; Shared FS; Workers VM (ARIA, CNS).]

8 Examples of Cloud Bioinformatics Appliances

9 Standard Bioinformatics node
Biocompute appliance
- Use your own instance(s)
- With pre-installed standard bioinformatics tools: BLAST, FastA, SSearch, HMM, ...; ClustalW2, Clustal-Omega, Muscle, ...; Bowtie(2), BWA, samtools, ...; MEME, R, etc.
- Connected to public reference data: Uniprot, EMBL, genomes, PDB, etc.
- Automatically shared with the VMs

10 Structural Biology
TOwards StruCtural AssignmeNt Improvement
- Goal: improve the determination of protein structures based on Nuclear Magnetic Resonance (NMR) information with the ARIA software.
- Large computational needs: an NMR laboratory is unlikely to invest in building a cluster of about 100 nodes just to run such NMR structure calculations.
- The flexibility of the cloud in deploying the different required bioinformatics tools can accelerate such a procedure.
- Commercial interest in providing such tools to structural biologists on a pay-as-you-go basis.
- Endorsers: Institut Pasteur Paris and CNRS IBCP

11 Proteomics desktop
Motivation
- Collaboration with a mass spectrometry platform
- Running out of space on their local resources
Protein identification
- Experimental mass data
- Reference databases: nr, Swiss-Prot
- Reference screening tools: OMSSA, X!Tandem
User interface
- Remote display: NX
- Reference GUIs: SearchGUI, PeptideShaker
Source: PeptideShaker site

12 MapReduce Biology
- Provide turnkey virtual machines with a preconfigured MapReduce framework
- Accelerate big data analysis with the two-step map and reduce paradigm
Hadoop MapReduce appliances (2)
- Standard Hadoop MapReduce
- Bioinformatics software integrated in Hadoop
Sequence similarity with the MapReduce paradigm
- FastA & SSearch
- Deploy the database of sequences in HDFS
- Compare each structure to the others
- Developed in the context of the French project MapReduce, ANR ARPEGE
FastAMR workflow
- Users run the FastAMR script with their sequences and the databank.
- FastAMR splits the databank into subsets and puts them in the DFS along with the sequences file.
- Each mapper runs a FastA program on a part of the databank.
- Each mapper sends the scores and sequences to the reducers.
- Reducers copy the best scores of the whole experiment to the DFS.
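The split, map and reduce steps listed for FastAMR can be sketched in a few lines of plain Python. This is a toy illustration, not the FastAMR implementation: it runs in a single process without Hadoop or HDFS, and `toy_score` is an assumed stand-in for a real FastA comparison (it simply counts identical positions). All function names are hypothetical.

```python
from heapq import nlargest


def split_databank(records, n_subsets):
    """Split the databank into n_subsets parts (round-robin), as FastAMR does before mapping."""
    return [records[i::n_subsets] for i in range(n_subsets)]


def toy_score(query, subject):
    """Stand-in for a FastA score (assumption): number of positions with identical residues."""
    return sum(1 for q, s in zip(query, subject) if q == s)


def mapper(query, subset):
    """Compare the query with one databank subset and emit (score, sequence) pairs."""
    return [(toy_score(query, seq), seq) for seq in subset]


def reducer(mapped_outputs, top_k):
    """Merge all mapper outputs and keep the top_k best-scoring hits."""
    all_hits = (hit for output in mapped_outputs for hit in output)
    return nlargest(top_k, all_hits)


def run(query, databank, n_subsets=2, top_k=2):
    """Run the whole toy pipeline: split, map each subset, reduce to the best hits."""
    subsets = split_databank(databank, n_subsets)
    # On a real cluster each mapper call would run in parallel on a different node.
    mapped = [mapper(query, subset) for subset in subsets]
    return reducer(mapped, top_k)
```

Because each mapper only sees its own subset, the comparison step scales with the number of subsets; only the small (score, sequence) pairs travel to the reducer.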

13 Use case with Galaxy

14 IDB Cloud account
- Log in
- Fill in the fields, using your institutional email address
- Submitting the request implies acceptance of the terms of use!

15 Available appliances
- List of existing appliances
- Appliance-specific documentation
- Direct creation with the Power button

16 Create my Galaxy portal
- Also called an instance
- Fill in the parameters: assign it a name, number of CPUs, memory size, attach a virtual disk
- Cluster of VMs: enter the number of VMs, choose a unique name
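The parameters of the creation form can be checked before a request is submitted. The sketch below is hypothetical: `InstanceRequest` and `validate` are illustrative names, not part of the IDB web interface, and the only limits used are the ones quoted earlier in the slides (VMs up to 32 cores and 768 GB RAM).

```python
from dataclasses import dataclass
from typing import Optional

# Upper bounds quoted on the IDB Cloud slide ("VMs up to 32 cores and 768 GB RAM").
MAX_CPUS = 32
MAX_MEMORY_GB = 768


@dataclass
class InstanceRequest:
    """Hypothetical container for the fields of the instance creation form."""
    name: str
    cpus: int
    memory_gb: int
    vdisk_id: Optional[str] = None  # virtual disk to attach, if any
    vm_count: int = 1               # more than 1 requests a cluster of VMs


def validate(request, existing_names):
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    if not request.name:
        problems.append("name is required")
    elif request.name in existing_names:
        problems.append("name must be unique")
    if not 1 <= request.cpus <= MAX_CPUS:
        problems.append(f"cpus must be between 1 and {MAX_CPUS}")
    if not 1 <= request.memory_gb <= MAX_MEMORY_GB:
        problems.append(f"memory_gb must be between 1 and {MAX_MEMORY_GB}")
    if request.vm_count < 1:
        problems.append("vm_count must be at least 1")
    return problems
```

Returning a list of problems rather than raising on the first one lets a form report every invalid field at once.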

17 Connecting to my Galaxy instance

18 Virtual hard disks
A virtual disk lets you
- keep your data independently of the VMs' execution
- retrieve your data from one VM to the next
Actions
- create a vdisk
- manage your vdisks
Using a vdisk
- at VM creation
- hot-plug mounting

19 Exchanging data with my Galaxy portal
- sftp / scp
- Graphical clients: Cyberduck, Transmit, Filezilla, ...
- Web: Galaxy - Get Data
- Download link
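For scripted transfers, the scp route listed above can be wrapped in a small helper. This is a sketch under assumptions: the user name, host name and remote paths are placeholders, since the slides do not give the actual login or directory layout of a Galaxy instance. Only the standard `scp -P <port> <source> <target>` form is used.

```python
def scp_upload_command(local_path, user, host, remote_dir, port=22):
    """Build the scp argument list that copies a local file to the instance."""
    target = f"{user}@{host}:{remote_dir}"
    return ["scp", "-P", str(port), local_path, target]


def scp_download_command(user, host, remote_path, local_dir, port=22):
    """Build the scp argument list that copies a result file back from the instance."""
    source = f"{user}@{host}:{remote_path}"
    return ["scp", "-P", str(port), source, local_dir]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; building it as a list rather than a single string avoids shell quoting problems with unusual file names.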

20 Conclusion
Added value of cloud, e.g. NGS with Galaxy
- for scientific analyses: user-specific resources, isolated, different instances together
- for training: Oct 2012 Bordeaux, May 2013 Galaxy Lille, (next) 2014 Galaxy Jouy
- for tools integration: semantic annotation, solving software dependencies
- for development & operations (DevOps): different versions at the same time
Provide turnkey bioinformatics appliances
- Standard tools and pipelines
- New developments
- Ready to run on clouds
Public bioinformatics cloud (e.g. IDB)
- Tightly connected to existing bioinformatics resources
- Linked to public biological databases
- In collaboration with the French Institute of Bioinformatics
[Figure: IDB Cloud overview, as on slide 3.]

21 IFB - French Institute of Bioinformatics (Institut Français de Bioinformatique)
Mission: to make core bioinformatics resources available to the national and international life science research community.
- To provide support for biology programs: supporting projects, training users
- To provide an IT infrastructure devoted to the management and analysis of biological data: material resources (CPUs, disks, etc.), availability of biology data collections, deployment of bioinformatics tools
- To act as a middleman between the life science community and the bioinformatics/computer science research community

22 IFB - Infrastructure
IFB-Core resources
- Academic cloud for life science
- Will be hosted at the CNRS IDRIS supercomputing center (Paris)
- A pilot infrastructure (2014-Q1)
- Production infrastructure: +5,000 cores, 1 PB (2014-S2)
Regional resources
- 6 regional bioinformatics centers
- +6,000 cores, ~1 PB
- 2 existing clouds: PRABI-IBCP IDB cloud (Lyon) and GenOuest genocloud (Rennes)
Deploy a federation of clouds
[Figure: map of IFB sites. Labels: IFB-core IT (CNRS-IDRIS, Paris); regional centers RENABI-GO, APLIBIO, RENABI-NE, PRABI, RENABI-SO, RENABI-GS.]

23 Questions?
Acknowledgment
- Clément Gauthey (IDB)
- StratusLab members
- Co-funding by the European Community's Seventh Framework Programme (INFSO-RI ) and by the French National Research Agency's Arpege Programme (ANR-10-SEGI-001)

Cloud Ready for Bioinformatics?

Cloud Ready for Bioinformatics? IDB acknowledges co-funding by the European Community's Seventh Framework Programme (INFSO-RI-261552) and the French National Research Agency's Arpege Programme (ANR-10-SEGI-001) Cloud Ready for Bioinformatics?

More information

Cloud pour la Bioinformatique

Cloud pour la Bioinformatique Cloud pour la Bioinformatique Christophe Blanchet Institut Français de Bioinformatique - IFB French Institute of Bioinformatics - ELIXIR French Node CNRS UMS3601 - Gif-sur-Yvette - FRANCE Sequencing data

More information

Institut Français de Bioinformatique, Un Cloud pour les Sciences du Vivant

Institut Français de Bioinformatique, Un Cloud pour les Sciences du Vivant Institut Français de Bioinformatique, Un Cloud pour les Sciences du Vivant Christophe Blanchet! Institut Français de Bioinformatique - IFB French Institute of Bioinformatics - ELIXIR French Node CNRS UMS3601

More information

Sequencing data. And other experimental data. EMBL-EBI data resources growth

Sequencing data. And other experimental data. EMBL-EBI data resources growth Sequencing Institut Français de Bioinformatique, Un loud pour les Sciences du Vivant source: www.genomesonline.org source: www.politigenomics.com/next-generation- hristophe Blanchet Institut Français de

More information

Le cloud IFB et son instance Galaxy

Le cloud IFB et son instance Galaxy Le cloud IFB et son instance Galaxy Christophe BLANCHET Institut Français de Bioinformatique - IFB French Institute of Bioinformatics - ELIXIR-FR CNRS UMS3601 - Gif-sur-Yvette - FRANCE Ecole Bioinformatique

More information

Une e-infrastructure nationale en bioinformatique

Une e-infrastructure nationale en bioinformatique Une e-infrastructure nationale en bioinformatique Christophe BLANCHET Institut Français de Bioinformatique - IFB French Institute of Bioinformatics - ELIXIR-FR CNRS UMS3601 - Gif-sur-Yvette - FRANCE JDEV

More information

Le cloud IFB et son instance Galaxy

Le cloud IFB et son instance Galaxy Le cloud IFB et son instance Galaxy Christophe BLANCHET Institut Français de Bioinformatique - IFB French Institute of Bioinformatics - ELIXIR-FR CNRS UMS3601 - Gif-sur-Yvette - FRANCE Ecole Bioinformatique

More information

Deployment of BioXSDenabled services on a Cloud. christophe.blanchet@ibcp.fr

Deployment of BioXSDenabled services on a Cloud. christophe.blanchet@ibcp.fr Deployment of BioXSDenabled services on a Cloud Outline IBCP, provider of BioXSD-enabled services Cloud Computing RENABI GRISBI, French infrastructure Bioinformatics Integrated s gbio-pbil.ibcp.fr/ws GBIO

More information

IFB s e-infrastructure

IFB s e-infrastructure IFB s e-infrastructure Christophe Blanchet Institut Français de Bioinformatique - IFB French Institute of Bioinformatics - ELIXIR-FR CNRS UMS3601 - Gif-sur-Yvette - FRANCE Life Sciences Platforms in France

More information

StratusLab project. Standards, Interoperability and Asset Exploitation. Vangelis Floros, GRNET

StratusLab project. Standards, Interoperability and Asset Exploitation. Vangelis Floros, GRNET StratusLab project Standards, Interoperability and Asset Exploitation Vangelis Floros, GRNET EGI Technical Forum 2011 19-22 September 2011, Lyon, France StratusLab is co-funded by the European Community

More information

Towards a galaxy.prabi.fr

Towards a galaxy.prabi.fr Towards a galaxy.prabi.fr IFB- galaxy Day 04/12/2013 Navra5l V., PhD, UCBL navra5l@prabi.fr www.prabi.fr One among the six IFB regional nodes Region: Rhône- Alpes Director: Guy Perrière 11 Research Team,

More information

Eoulsan Analyse du séquençage à haut débit dans le cloud et sur la grille

Eoulsan Analyse du séquençage à haut débit dans le cloud et sur la grille Eoulsan Analyse du séquençage à haut débit dans le cloud et sur la grille Journées SUCCES Stéphane Le Crom (UPMC IBENS) stephane.le_crom@upmc.fr Paris November 2013 The Sanger DNA sequencing method Sequencing

More information

EGEE-2 NA4 Biomed Bioinformatics in CNRS

EGEE-2 NA4 Biomed Bioinformatics in CNRS Enabling Grids for E-sciencE EGEE-2 NA4 Biomed Bioinformatics in CNRS Christophe Blanchet Institute of Biology and Chemistry of Proteins Lyon, April 28, 2006 www.eu-egee.org Enabling Grids for E-sciencE

More information

Alternative Deployment Models for Cloud Computing in HPC Applications. Society of HPC Professionals November 9, 2011 Steve Hebert, Nimbix

Alternative Deployment Models for Cloud Computing in HPC Applications. Society of HPC Professionals November 9, 2011 Steve Hebert, Nimbix Alternative Deployment Models for Cloud Computing in HPC Applications Society of HPC Professionals November 9, 2011 Steve Hebert, Nimbix The case for Cloud in HPC Build it in house Assemble in the cloud?

More information

UGENE Quick Start Guide

UGENE Quick Start Guide Quick Start Guide This document contains a quick introduction to UGENE. For more detailed information, you can find the UGENE User Manual and other special manuals in project website: http://ugene.unipro.ru.

More information

Ins$tut Français de Bioinforma$que Current situa+on and prospect. IFB General Assembly Gif- sur- Yve=e, January 9 2015

Ins$tut Français de Bioinforma$que Current situa+on and prospect. IFB General Assembly Gif- sur- Yve=e, January 9 2015 Ins$tut Français de Bioinforma$que Current situa+on and prospect IFB General Assembly Gif- sur- Yve=e, January 9 2015 Background 2010: Na+onal Infrastructures in Biology and Health call from the Investment

More information

DATA MANAGEMENT PLAN IN THE REAL LIFE SCIENCES

DATA MANAGEMENT PLAN IN THE REAL LIFE SCIENCES DATA MANAGEMENT PLAN IN THE REAL LIFE SCIENCES Yvan Le Bras Cyril Monjeaud Olivier Collin Jacques Nicolas CNRS UMR 6074 IRISA-INRIA Context Now : Genomics : Next Generation Sequencing Now : Proteomics

More information

E-SCIENCE IN WESTERN FRANCE :

E-SCIENCE IN WESTERN FRANCE : E-SCIENCE IN WESTERN FRANCE : BEGINS Yvan Le Bras Cyril Monjeaud Olivier Collin & the GenOuest team CNRS UMR 6074 IRISA-INRIA Context Now : Genomics : Next Generation Sequencing Now : Proteomics Next :

More information

Final Report on StratusLab Adoption

Final Report on StratusLab Adoption Final Report on StratusLab Adoption Charles Loomis, Mohammed Airaj, Marc-Elian Bégin, Christophe Blanchet, Evangelos Floros, Clément Gauthey To cite this version: Charles Loomis, Mohammed Airaj, Marc-Elian

More information

SURFsara HPC Cloud Workshop

SURFsara HPC Cloud Workshop SURFsara HPC Cloud Workshop doc.hpccloud.surfsara.nl UvA workshop 2016-01-25 UvA HPC Course Jan 2016 Anatoli Danezi, Markus van Dijk cloud-support@surfsara.nl Agenda Introduction and Overview (current

More information

E-SCIENCE IN WESTERN FRANCE : THE BEGINNING

E-SCIENCE IN WESTERN FRANCE : THE BEGINNING E-SCIENCE IN WESTERN FRANCE : THE BEGINNING Yvan Le Bras Olivier Collin Jacques Nicolas CNRS UMR 6074 IRISA-INRIA Context Now : Genomics : Next Generation Sequencing Now : Proteomics Next : Bio-imaging

More information

Seed4C: A Cloud Security Infrastructure validated on Grid 5000

Seed4C: A Cloud Security Infrastructure validated on Grid 5000 Seed4C: A Cloud Security Infrastructure validated on Grid 5000 E. Caron 1, A. Lefray 1, B. Marquet 2, and J. Rouzaud-Cornabas 1 1 Université de Lyon. LIP Laboratory. UMR CNRS - ENS Lyon - INRIA - UCBL

More information

SURFsara HPC Cloud Workshop

SURFsara HPC Cloud Workshop SURFsara HPC Cloud Workshop www.cloud.sara.nl Tutorial 2014-06-11 UvA HPC and Big Data Course June 2014 Anatoli Danezi, Markus van Dijk cloud-support@surfsara.nl Agenda Introduction and Overview (current

More information

Enabling multi-cloud resources at CERN within the Helix Nebula project. D. Giordano (CERN IT-SDC) HEPiX Spring 2014 Workshop 23 May 2014

Enabling multi-cloud resources at CERN within the Helix Nebula project. D. Giordano (CERN IT-SDC) HEPiX Spring 2014 Workshop 23 May 2014 Enabling multi-cloud resources at CERN within the Helix Nebula project D. Giordano (CERN IT-) HEPiX Spring 2014 Workshop This document produced by Members of the Helix Nebula consortium is licensed under

More information

A curated Domain centric shared Docker registry linked to the Galaxy toolshed

A curated Domain centric shared Docker registry linked to the Galaxy toolshed A curated Domain centric shared Docker registry linked to the Galaxy toolshed François Moreews 1, Olivier Sallou 2, Yvan le Bras 2, Marie Grosjean 3, Cyril Monjeaud 2, Thomas Darde 4, Olivier Collin 2,

More information

Cloud BioLinux: Pre-configured and On-demand Bioinformatics Computing for the Genomics Community

Cloud BioLinux: Pre-configured and On-demand Bioinformatics Computing for the Genomics Community Cloud BioLinux: Pre-configured and On-demand Bioinformatics Computing for the Genomics Community Ntinos Krampis Asst. Professor J. Craig Venter Institute kkrampis@jcvi.org http://www.jcvi.org/cms/about/bios/kkrampis/

More information

Cloud BioLinux: Pre-configured and On-demand Bioinformatics Computing for the Genomics Community

Cloud BioLinux: Pre-configured and On-demand Bioinformatics Computing for the Genomics Community Cloud BioLinux: Pre-configured and On-demand Bioinformatics Computing for the Genomics Community Ntinos Krampis Asst. Professor J. Craig Venter Institute kkrampis@jcvi.org http://www.jcvi.org/cms/about/bios/kkrampis/

More information

Big Data in BioMedical Sciences. Steven Newhouse, Head of Technical Services, EMBL-EBI

Big Data in BioMedical Sciences. Steven Newhouse, Head of Technical Services, EMBL-EBI Big Data in BioMedical Sciences Steven Newhouse, Head of Technical Services, EMBL-EBI Big Data for BioMedical Sciences EMBL-EBI: What we do and why? Challenges & Opportunities Infrastructure Requirements

More information

Assignment # 1 (Cloud Computing Security)

Assignment # 1 (Cloud Computing Security) Assignment # 1 (Cloud Computing Security) Group Members: Abdullah Abid Zeeshan Qaiser M. Umar Hayat Table of Contents Windows Azure Introduction... 4 Windows Azure Services... 4 1. Compute... 4 a) Virtual

More information

Deploying Business Virtual Appliances on Open Source Cloud Computing

Deploying Business Virtual Appliances on Open Source Cloud Computing International Journal of Computer Science and Telecommunications [Volume 3, Issue 4, April 2012] 26 ISSN 2047-3338 Deploying Business Virtual Appliances on Open Source Cloud Computing Tran Van Lang 1 and

More information

Cloud Computing Solutions for Genomics Across Geographic, Institutional and Economic Barriers

Cloud Computing Solutions for Genomics Across Geographic, Institutional and Economic Barriers Cloud Computing Solutions for Genomics Across Geographic, Institutional and Economic Barriers Ntinos Krampis Asst. Professor J. Craig Venter Institute kkrampis@jcvi.org http://www.jcvi.org/cms/about/bios/kkrampis/

More information

SURFsara Data Services

SURFsara Data Services SURFsara Data Services SUPPORTING DATA-INTENSIVE SCIENCES Mark van de Sanden The world of the many Many different users (well organised (international) user communities, research groups, universities,

More information

Solution for private cloud computing

Solution for private cloud computing The CC1 system Solution for private cloud computing 1 Outline What is CC1? Features Technical details Use cases By scientist By HEP experiment System requirements and installation How to get it? 2 What

More information

Bioinformatics Grid - Enabled Tools For Biologists.

Bioinformatics Grid - Enabled Tools For Biologists. Bioinformatics Grid - Enabled Tools For Biologists. What is Grid-Enabled Tools (GET)? As number of data from the genomics and proteomics experiment increases. Problems arise for the current sequence analysis

More information

Open Cloud System. (Integration of Eucalyptus, Hadoop and AppScale into deployment of University Private Cloud)

Open Cloud System. (Integration of Eucalyptus, Hadoop and AppScale into deployment of University Private Cloud) Open Cloud System (Integration of Eucalyptus, Hadoop and into deployment of University Private Cloud) Thinn Thu Naing University of Computer Studies, Yangon 25 th October 2011 Open Cloud System University

More information

Big Data in BioMedical Sciences. Steven Newhouse, Head of Technical Services, EMBL-EBI

Big Data in BioMedical Sciences. Steven Newhouse, Head of Technical Services, EMBL-EBI Big Data in BioMedical Sciences Steven Newhouse, Head of Technical Services, EMBL-EBI Big Data for BioMedical Sciences EMBL-EBI: What we do and why? Challenges & Opportunities Infrastructure Requirements

More information

Microsoft Research Windows Azure for Research Training

Microsoft Research Windows Azure for Research Training Copyright 2013 Microsoft Corporation. All rights reserved. Except where otherwise noted, these materials are licensed under the terms of the Apache License, Version 2.0. You may use it according to the

More information

Dutch HPC Cloud: flexible HPC for high productivity in science & business

Dutch HPC Cloud: flexible HPC for high productivity in science & business Dutch HPC Cloud: flexible HPC for high productivity in science & business Dr. Axel Berg SARA national HPC & e-science Support Center, Amsterdam, NL April 17, 2012 4 th PRACE Executive Industrial Seminar,

More information

MapReduce, Hadoop and Amazon AWS

MapReduce, Hadoop and Amazon AWS MapReduce, Hadoop and Amazon AWS Yasser Ganjisaffar http://www.ics.uci.edu/~yganjisa February 2011 What is Hadoop? A software framework that supports data-intensive distributed applications. It enables

More information

Microsoft Research Microsoft Azure for Research Training

Microsoft Research Microsoft Azure for Research Training Copyright 2014 Microsoft Corporation. All rights reserved. Except where otherwise noted, these materials are licensed under the terms of the Apache License, Version 2.0. You may use it according to the

More information

Three data delivery cases for EMBL- EBI s Embassy. Guy Cochrane www.ebi.ac.uk

Three data delivery cases for EMBL- EBI s Embassy. Guy Cochrane www.ebi.ac.uk Three data delivery cases for EMBL- EBI s Embassy Guy Cochrane www.ebi.ac.uk EMBL European Bioinformatics Institute Genes, genomes & variation European Nucleotide Archive 1000 Genomes Ensembl Ensembl Genomes

More information

Brian Amedro CTO. Worldwide Customers

Brian Amedro CTO. Worldwide Customers Denis Caromel CEO Brian Amedro CTO Cloud Enterprise Applications (B2B) Reduce Costs (IT) + Reduce Pains (Time) Worldwide Customers 1 1 Software company born of INRIA in 2007 Software Editor, Open Source

More information

Scientific and Technical Applications as a Service in the Cloud

Scientific and Technical Applications as a Service in the Cloud Scientific and Technical Applications as a Service in the Cloud University of Bern, 28.11.2011 adapted version Wibke Sudholt CloudBroker GmbH Technoparkstrasse 1, CH-8005 Zurich, Switzerland Phone: +41

More information

OpenNebula Open Souce Solution for DC Virtualization

OpenNebula Open Souce Solution for DC Virtualization 13 th LSM 2012 7 th -12 th July, Geneva OpenNebula Open Souce Solution for DC Virtualization Constantino Vázquez Blanco OpenNebula.org What is OpenNebula? Multi-tenancy, Elasticity and Automatic Provision

More information

Hadoopizer : a cloud environment for bioinformatics data analysis

Hadoopizer : a cloud environment for bioinformatics data analysis Hadoopizer : a cloud environment for bioinformatics data analysis Anthony Bretaudeau (1), Olivier Sallou (2), Olivier Collin (3) (1) anthony.bretaudeau@irisa.fr, INRIA/Irisa, Campus de Beaulieu, 35042,

More information

VDI: What Does it Mean, Deploying challenges & Will It Save You Money?

VDI: What Does it Mean, Deploying challenges & Will It Save You Money? VDI: What Does it Mean, Deploying challenges & Will It Save You Money? Jack Watts, Senior Sales Executive & Cloud Solutions Specialist Neil Stobart, Director of Sales Engineering Distributor and Systems

More information

The Greenplum Analytics Workbench

The Greenplum Analytics Workbench The Greenplum Analytics Workbench External Overview 1 The Greenplum Analytics Workbench Definition Is a 1000-node Hadoop Cluster. Pre-configured with publicly available data sets. Contains the entire Hadoop

More information

Cloud-Based Big Data Analytics in Bioinformatics

Cloud-Based Big Data Analytics in Bioinformatics Cloud-Based Big Data Analytics in Bioinformatics Presented By Cephas Mawere Harare Institute of Technology, Zimbabwe 1 Introduction 2 Big Data Analytics Big Data are a collection of data sets so large

More information

OpenNebula Open Souce Solution for DC Virtualization

OpenNebula Open Souce Solution for DC Virtualization OSDC 2012 25 th April, Nürnberg OpenNebula Open Souce Solution for DC Virtualization Constantino Vázquez Blanco OpenNebula.org What is OpenNebula? Multi-tenancy, Elasticity and Automatic Provision on Virtualized

More information

HPC Cloud. Focus on your research. Floris Sluiter Project leader SARA

HPC Cloud. Focus on your research. Floris Sluiter Project leader SARA HPC Cloud Focus on your research Floris Sluiter Project leader SARA Why an HPC Cloud? Christophe Blanchet, IDB - Infrastructure Distributing Biology: Big task to port them all to your favorite architecture

More information

OPEN SOURCE AND BOTTOM-UP VRE APPROACH IN WESTERN FRANCE

OPEN SOURCE AND BOTTOM-UP VRE APPROACH IN WESTERN FRANCE OPEN SOURCE AND BOTTOM-UP VRE APPROACH IN WESTERN FRANCE Towards supporting accessible, reproducible, and transparent research in the life sciences Yvan Le Bras Cyril Monjeaud Olivier Collin, the GenOuest

More information

Boas Betzler. Planet. Globally Distributed IaaS Platform Examples AWS and SoftLayer. November 9, 2015. 20014 IBM Corporation

Boas Betzler. Planet. Globally Distributed IaaS Platform Examples AWS and SoftLayer. November 9, 2015. 20014 IBM Corporation Boas Betzler Cloud IBM Distinguished Computing Engineer for a Smarter Planet Globally Distributed IaaS Platform Examples AWS and SoftLayer November 9, 2015 20014 IBM Corporation Building Data Centers The

More information

Grid Computing Perspectives for IBM

Grid Computing Perspectives for IBM Grid Computing Perspectives for IBM Atelier Internet et Grilles de Calcul en Afrique Jean-Pierre Prost IBM France jpprost@fr.ibm.com Agenda Grid Computing Initiatives within IBM World Community Grid Decrypthon

More information

SOFTWARE DEFINED SOLUTIONS JEUDI 19 NOVEMBRE 2015. Nicolas EHRMAN Sr Presales SDS

SOFTWARE DEFINED SOLUTIONS JEUDI 19 NOVEMBRE 2015. Nicolas EHRMAN Sr Presales SDS SOFTWARE DEFINED SOLUTIONS JEUDI 19 NOVEMBRE 2015 Nicolas EHRMAN Sr Presales SDS Transform your Datacenter to the next level with EMC SDS EMC SOFTWARE DEFINED STORAGE, A SUCCESS STORY 5 ÈME ÉDITEUR MONDIAL

More information

Development of Bio-Cloud Service for Genomic Analysis Based on Virtual

Development of Bio-Cloud Service for Genomic Analysis Based on Virtual Development of Bio-Cloud Service for Genomic Analysis Based on Virtual Infrastructure 1 Jung-Ho Um, 2 Sang Bae Park, 3 Hoon Choi, 4 Hanmin Jung 1, First Author Korea Institute of Science and Technology

More information

The OpenNebula Cloud Platform for Data Center Virtualization

The OpenNebula Cloud Platform for Data Center Virtualization CloudOpen 2012 San Diego, USA, August 29th, 2012 The OpenNebula Cloud Platform for Data Center Virtualization Carlos Martín Project Engineer Acknowledgments The research leading to these results has received

More information

Enabling Large-Scale Testing of IaaS Cloud Platforms on the Grid 5000 Testbed

Enabling Large-Scale Testing of IaaS Cloud Platforms on the Grid 5000 Testbed Enabling Large-Scale Testing of IaaS Cloud Platforms on the Grid 5000 Testbed Sébastien Badia, Alexandra Carpen-Amarie, Adrien Lèbre, Lucas Nussbaum Grid 5000 S. Badia, A. Carpen-Amarie, A. Lèbre, L. Nussbaum

More information

OpenNebula Leading Innovation in Cloud Computing Management

OpenNebula Leading Innovation in Cloud Computing Management OW2 Annual Conference 2010 Paris, November 24th, 2010 OpenNebula Leading Innovation in Cloud Computing Management Ignacio M. Llorente DSA-Research.org Distributed Systems Architecture Research Group Universidad

More information

Agenda: 1. Background 2. Solution: ProActive 3. Live Demonstration 4. IFP EN Use Case

Agenda: 1. Background 2. Solution: ProActive 3. Live Demonstration 4. IFP EN Use Case Advances in Cloud Computing with ProActive Parallel Suite D. Caromel Accelerate and Orchestrate Enterprise Applications Hybrid Cloud Solutions (Private with Public Burst) Agenda: 1. Background 2. Solution:

More information

Managing and Conducting Biomedical Research on the Cloud Prasad Patil

Managing and Conducting Biomedical Research on the Cloud Prasad Patil Managing and Conducting Biomedical Research on the Cloud Prasad Patil Laboratory for Personalized Medicine Center for Biomedical Informatics Harvard Medical School SaaS & PaaS gmail google docs app engine

More information

Cloud Computing Architecture with OpenNebula HPC Cloud Use Cases

Cloud Computing Architecture with OpenNebula HPC Cloud Use Cases NASA Ames NASA Advanced Supercomputing (NAS) Division California, May 24th, 2012 Cloud Computing Architecture with OpenNebula HPC Cloud Use Cases Ignacio M. Llorente Project Director OpenNebula Project.

More information

Scalable Cloud Computing Solutions for Next Generation Sequencing Data

Scalable Cloud Computing Solutions for Next Generation Sequencing Data Scalable Cloud Computing Solutions for Next Generation Sequencing Data Matti Niemenmaa 1, Aleksi Kallio 2, André Schumacher 1, Petri Klemelä 2, Eija Korpelainen 2, and Keijo Heljanko 1 1 Department of

More information

Data Semantics Aware Cloud for High Performance Analytics

Data Semantics Aware Cloud for High Performance Analytics Data Semantics Aware Cloud for High Performance Analytics Microsoft Future Cloud Workshop 2011 June 2nd 2011, Prof. Jun Wang, Computer Architecture and Storage System Laboratory (CASS) Acknowledgement

More information

A Cost-Evaluation of MapReduce Applications in the Cloud

A Cost-Evaluation of MapReduce Applications in the Cloud 1/23 A Cost-Evaluation of MapReduce Applications in the Cloud Diana Moise, Alexandra Carpen-Amarie Gabriel Antoniu, Luc Bougé KerData team 2/23 1 MapReduce applications - case study 2 3 4 5 3/23 MapReduce

More information

Le Cloud Computing selon IBM : stratégie et offres, zoom sur WebSphere CloudBurst

Hervé Grange WebSphere Client Technical Expert Stéphane Woillez Senior IT Architect - Cloud Computing Champion IBM France

e-biogenouest : The Tools

Coordinator: Olivier Collin Facilitator: Yvan Le Bras CNRS UMR 6074 IRISA-INRIA / Plateforme de Bioinformatique GenOuest yvan.le_bras@irisa.fr Biogenouest federating programme

Leveraging BlobSeer to boost up the deployment and execution of Hadoop applications in Nimbus cloud environments on Grid 5000

Alexandra Carpen-Amarie Diana Moise Bogdan Nicolae KerData Team, INRIA Outline

Maquette DB2 PureScale

PureScale and Power7 technology Thierry Desbourdes thierry.desbourdes@fr.ibm.com DB2 PureScale Active/Active Cluster Automatic workload balancing On-Demand Provisioning Cluster de

BlobSeer: Towards efficient data storage management on large-scale, distributed systems

Bogdan Nicolae University of Rennes 1, France KerData Team, INRIA Rennes Bretagne-Atlantique PhD Advisors: Gabriel Antoniu

Getting Started Hacking on OpenNebula

LinuxTag 2013 Berlin, Germany, May 22nd Carlos Martín Project Engineer Acknowledgments The research leading to these results has received funding from Comunidad de

Hadoop Distributed File System Propagation Adapter for Nimbus

University of Victoria Faculty of Engineering Coop Workterm Report Department of Physics University of Victoria Victoria, BC Matthew Vliet

EMBL-EBI Web Services

Rodrigo Lopez Head of the External Services Team SME Workshop Piemonte 2011 EBI is an Outstation of the European Molecular Biology Laboratory. Summary Introduction The JDispatcher

SCC / QUANTUM Kick Off 2015 Comment gérer efficacement des workflows et archives de données non structurées?

Stéphane Estevez QUANTUM Senior Product Marketing Manager EMEA Anne Morin SCC Business Developpement

ENABLING DATA TRANSFER MANAGEMENT AND SHARING IN THE ERA OF GENOMIC MEDICINE. October 2013

Introduction As sequencing technologies continue to evolve and genomic data makes its way into clinical use and

OpenNebula Open Source Solution for DC Virtualization. C12G Labs. Online Webinar

C12G Labs Online Webinar What is OpenNebula? Multi-tenancy, Elasticity and Automatic Provision on Virtualized Environments I'm using virtualization/cloud,

ESMA REGISTERS OJ/26/06/2012-PROC/2012/004. Questions/ Answers

Question n.10 (dated 18/07/2012) In the Annex VII Financial Proposal, an estimated budget of 1,500,000 Euro is mentioned for the total duration

Energy efficiency in HPC :

A new trend? A software approach to save power but still increase the number or the size of scientific studies! 19 November 2012 The EDF Group in brief A GLOBAL LEADER IN ELECTRICITY

icer Bioinformatics Support Fall 2011

John B. Johnston HPC Programmer Institute for Cyber Enabled Research 2011 Michigan State University Board of Trustees. Institute for Cyber Enabled Research (icer)

Private Cloud with Fusion Middleware

Duško Vukmanović Principal Sales Consultant, Oracle dusko.vukmanovic@oracle.com The following is intended to outline our general product direction.

Cloud Computing Where ISR Data Will Go for Exploitation

22 September 2009 Albert Reuther, Jeremy Kepner, Peter Michaleas, William Smith This work is sponsored by the Department of the Air Force under Air

CSE-E5430 Scalable Cloud Computing. Lecture 4

Keijo Heljanko Department of Computer Science School of Science Aalto University keijo.heljanko@aalto.fi 5.10-2015 Hadoop - Linux of Big Data Hadoop = Open Source Distributed Operating System

MapReduce Détails Optimisation de la phase Reduce avec le Combiner

If present, the framework inserts the Combiner into the processing pipeline on the nodes that have just finished the Map phase.

Research Article Cloud Computing for Protein-Ligand Binding Site Comparison

BioMed Research International Volume 213, Article ID 17356, 7 pages http://dx.doi.org/1.1155/213/17356 Che-Lun Hung 1 and Guan-Jie

Similarity Search in a Very Large Scale Using Hadoop and HBase

Stanislav Barton, Vlastislav Dohnal, Philippe Rigaux LAMSADE - Universite Paris Dauphine, France Internet Memory Foundation, Paris, France

Solution for private cloud computing

The CC1 system Outline What is CC1? Features Technical details System requirements and installation How to get it? What is CC1? The CC1 system is a complete solution

Cloud Services. May 28 th, 2014 Athens, Greece

Cloud Services? Cloud services and PT PT is Virtualization technology and delivery leader Well known as storage & data protection integrator Chosen by RedHat

Cloud Computing. What Are We Handing Over? Ganesh Shankar Advanced IT Core Pervasive Technology Institute

Why is the Cloud Relevant to Medical Research? In the current research workflow. Data volumes are

Building Bioinformatics Capacity in Africa. Nicky Mulder CBIO Group, UCT

Outline What is bioinformatics? Why do we need IT infrastructure? What e-infrastructure does it require? How we are developing this

Savanna Hadoop on. OpenStack. Savanna Technical Lead

Savanna Hadoop on OpenStack Sergey Lukjanov Savanna Technical Lead Mirantis, 2013 Agenda Savanna Overview Savanna Use Cases Roadmap & Current Status Architecture & Features Overview Hadoop vs. Virtualization

Open source Google-style large scale data analysis with Hadoop

Ioannis Konstantinou Email: ikons@cslab.ece.ntua.gr Web: http://www.cslab.ntua.gr/~ikons Computing Systems Laboratory School of Electrical

Vincent Rullier Technology specialist Microsoft Suisse Romande

Why virtualize Different types of virtualization Presentation Applications Workstations Servers Benefits Conclusion Q&A Technology

Spécification et analyse formelle des politiques de sécurité dans le cloud computing

Asma GUESMI, Patrice CLEMENTE, Frédéric LOULERGUE, Pascal BERTHOMÉ Laboratoire d'Informatique Fondamentale d'Orléans (LIFO)

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 7

Oracle Virtual Machine Server pre x86 Marián Kuna Technology Sales

Course 20533: Implementing Microsoft Azure Infrastructure Solutions

Overview About this course This course is aimed at experienced IT Professionals who currently administer their on-premises infrastructure.

Chapter 9 PUBLIC CLOUD LABORATORY. Sucha Smanchat, PhD. Faculty of Information Technology. King Mongkut's University of Technology North Bangkok

CLOUD COMPUTING PRACTICE Chapter 9 PUBLIC CLOUD LABORATORY Hands-on laboratory based on AWS Sucha Smanchat, PhD Faculty of Information Technology King Mongkut's University of Technology North Bangkok

Cloud Computing. Adam Barker

Overview Introduction to Cloud computing Enabling technologies Different types of cloud: IaaS, PaaS and SaaS Cloud terminology Interacting with a cloud: management consoles

Hadoop & SAS Data Loader for Hadoop

Turning Data into Value Sebastiaan Schaap Frederik Vandenberghe Agenda What's Hadoop SAS Data management: Traditional In-Database In-Memory The Hadoop analytics lifecycle

OpenNebula The Open Source Solution for Data Center Virtualization

LinuxTag April 23rd 2012, Berlin Hector Sanjuan OpenNebula.org What is OpenNebula? Multi-tenancy, Elasticity and Automatic Provision

SeqPig: simple and scalable scripting for large sequencing data sets in Hadoop

André Schumacher, Luca Pireddu, Matti Niemenmaa, Aleksi Kallio, Eija Korpelainen, Gianluigi Zanetti and Keijo Heljanko Abstract

Personalized Medicine and IT

Data-driven Medicine in the Age of Genomics www.intel.com/healthcare/bigdata Ketan Paranjape General Manager, Life Sciences Intel Corp. @Portlandketan The Central Dogma of