Mobyle @ RPBS. A web portal for structural bioinformatics and chemoinformatics




Mobyle @ RPBS
A web portal for structural bioinformatics and chemoinformatics

Julien Maupetit, Pierre Tufféry
RPBS, Université Paris Diderot (Paris 7)
Bâtiment Lamarck, 36, rue Hélène Brion, 75013 Paris, France
2009/11/30

Outline
1 The RPBS Platform
    RPBS Project
    Location & computing resources
2 The Mobyle Project
3 RPBS Mobyle Portal

RPBS Project
Ressource Parisienne en Bioinformatique Structurale (Paris Resource for Structural Bioinformatics)
    Collaborative research in the field of structural bioinformatics
    Services related to protein structure
Participating research groups
    1 MTi - INSERM U973 - Université Paris 7
    2 DSIMB - INSERM U665 - Université Paris 7
    3 ABI - Université Paris 6
    4 IMPMC - CNRS UMR 7590 - Université Paris 6 et 7
    5 IBBMC-MIP - CNRS UMR 8619 - Université Paris-Sud 11
    6 CNAM-STIC - Conservatoire National des Arts et Métiers, Paris
    7 MAP5 - CNRS UMR 8145 - Université Paris Descartes
Who?
    Project coordination: Dr Pierre Tufféry
    Technical director: Dr Julien Maupetit

Location & computing resources
Where?
    Paris Rive Gauche: Bâtiment Lamarck, 5th floor
What?
    Programs: more than 200 online programs available
    Storage: 8 TB + 15 TB (2010)
    Computing: 84 CPU cores (64-bit/32-bit: 66/18) + 160 CPU cores (2010)
    Services: structural bioinformatics, 3D printing (new in 2010!)

Outline
1 The RPBS Platform
2 The Mobyle Project
    Motivation
    Participative design
    Functionalities
    Architecture overview
    XML program description
    Quick tour: workspace
    Quick tour: form submission
    Quick tour: results
    Upcoming features
    Related projects

Motivation
Key problem
    Ease access to bio/chemoinformatics tools for scientists.
    Bioinformatics tools are often command-line tools.
    Command-line tools have a steep learning curve.
    A web interface is more familiar to biologists.
    Developing custom CGI scripts is both time-consuming and error-prone.
Based on former projects
    PISE system (1999, C. Letondal) and P-Serveur (2004, P. Tufféry et al.)

Participative design
User interviews, participatory workshops
    Need for a stable and integrated set of tools
    Synthetic view of results and analysis
    Re-usability features
    User-defined and ready-to-use "pipelines"
    Skepticism towards complex products

Functionalities
Users
    Service search/discovery
    Service usage and documentation
    Data and services integration
    Workspace navigation
Admins
    User assistance
    Job management, traceability
Developers
    Easy integration of new tools

Architecture overview (diagram; component labels in the order shown)
    Web server
    Web Portal (user interface)
    Network Remote Access
    Core Server
    Jobs management
    Users management
    Programs publication
    Administration Tools
    Execution environment
    Submission System (e.g., SGE, Torque)
    Bio-Programs (e.g., BLAST, EMBOSS, Phylip)
    Jobs XML
    User accounts XML
    Program definitions XML
    Bio-Banks (e.g., SWISSPROT, PDB)

XML program description
Information in Mobyle is stored in XML format (program definitions, job status).
What is a Mobyle program description?
    a network service definition
    a program wrapper
    a UI (user interface) definition
    a "semantic" description

XML program description

<?xml version="1.0" encoding="iso-8859-1"?>
...
<program>
  <head>
    <name>blast2</name>
    <version>2.2.17</version>
    <doc>
      <title>BLAST2</title>
      <description>
        <text lang="en">NCBI BLAST, with gaps</text>
      </description>
      <authors>Altschul, Madden, Schaeffer, Zhang, Miller, Lipman</authors>
      <reference>Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaeffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res. 25:3389-3402.</reference>
      <doclink>http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=handbook.chapter.ch16</doclink>
      <doclink>http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/tut1.html</doclink>
    </doc>
    <category>database:search:homology</category>
    <env name="BLASTDB">/path/to/db</env>
    <env name="BLASTMAT">/path/to/mat</env>
  </head>
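As an illustration of how the `<head>` metadata above could be read by a client, here is a minimal Python sketch using the standard library. The XML literal is a trimmed copy of the slide's excerpt, closed so that it parses on its own; the function name `read_head` and the returned dictionary layout are assumptions made for this example, not part of Mobyle.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the slide's <head>, with closing tags added so it parses.
PROGRAM_XML = b"""<?xml version="1.0" encoding="iso-8859-1"?>
<program>
  <head>
    <name>blast2</name>
    <version>2.2.17</version>
    <doc>
      <title>BLAST2</title>
      <description><text lang="en">NCBI BLAST, with gaps</text></description>
    </doc>
    <category>database:search:homology</category>
    <env name="BLASTDB">/path/to/db</env>
    <env name="BLASTMAT">/path/to/mat</env>
  </head>
</program>
"""


def read_head(xml_bytes):
    """Return the main <head> fields as a plain dict (hypothetical helper)."""
    head = ET.fromstring(xml_bytes).find("head")
    return {
        "name": head.findtext("name"),
        "version": head.findtext("version"),
        "title": head.findtext("doc/title"),
        # The category is a colon-separated path in the excerpt.
        "category": head.findtext("category").split(":"),
        # Environment variables to set before running the program.
        "env": {e.get("name"): e.text for e in head.findall("env")},
    }


info = read_head(PROGRAM_XML)
print(info["name"], info["version"], info["category"])
```

Splitting the category on ":" is one possible reading of the `database:search:homology` value as a classification path.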

XML program description (continued)

  <paragraph>
    <name>query</name>
    <prompt lang="en">Query Sequence</prompt>
    <argpos>4</argpos>
    <parameters>
      <parameter ismandatory="1" issimple="1" ismaininput="1">
        <name>query_seq</name>
        <prompt lang="en">Query (-i)</prompt>
        <type>
          <datatype>
            <class>Sequence</class>
          </datatype>
          <accepteddataformats>
            <dataformat>fasta</dataformat>
          </accepteddataformats>
          <card>1,n</card>
        </type>
        <format>
          <code proglang="perl">" -i $query"</code>
          <code proglang="python">" -i " + str(query_seq)</code>
        </format>
        <comment>
          <text lang="en">Read (first, query) sequence or set from file</text>
        </comment>
      </parameter>
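The `<format>` element pairs each parameter with a small expression that yields its command-line fragment. The Python sketch below shows one way such an expression could be evaluated; it is an illustration under stated assumptions, not Mobyle's actual implementation. The helper name `command_fragment`, the use of `eval` with a reduced namespace, and the shell quoting of values are all choices made for this example, and only the `python` variant of the code is included in the sample XML.

```python
import shlex
import xml.etree.ElementTree as ET

# Reduced copy of the slide's parameter, keeping only what the sketch needs.
PARAMETER_XML = """
<parameter ismandatory="1" issimple="1" ismaininput="1">
  <name>query_seq</name>
  <format>
    <code proglang="python">" -i " + str(query_seq)</code>
  </format>
</parameter>
"""


def command_fragment(parameter_xml, values):
    """Build the command-line fragment for one parameter (hypothetical helper)."""
    param = ET.fromstring(parameter_xml)
    name = param.findtext("name")
    if name not in values:
        if param.get("ismandatory") == "1":
            raise ValueError(f"missing mandatory parameter: {name}")
        return ""
    code = param.find("format/code[@proglang='python']").text
    # Quote user-supplied values before they reach a shell command line.
    quoted = {key: shlex.quote(str(value)) for key, value in values.items()}
    # Assumption: descriptions come from a trusted administrator. Emptying
    # __builtins__ narrows what the expression can call, but it is not a sandbox.
    return eval(code, {"__builtins__": {}, "str": str}, quoted)


print(command_fragment(PARAMETER_XML, {"query_seq": "query.fasta"}))
```

A full command line would presumably be assembled by ordering such fragments, for instance by the `<argpos>` value shown in the excerpt; that step is left out here.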

Quick tour: workspace (screenshot)

Quick tour: form submission (screenshot)

Quick tour: results (screenshot)

Upcoming features
    Visualization plugins
    Improved program classification and search
    Mobyle as a web services client (BioMOBY)
    Workflows: editing and execution from the portal
    Program "bookmarks"
    Tutorials
    "Dynamic" forms
    ...

Related projects
    LIPM: PlayMOBY / Mobyle
    NIAID: online XML program description generator - workflow execution
    SDSC: New Generation Biology Workbench
    NCSU's SNAP: workbench management tool for evolutionary population genetic analysis
    SIDGrid Portal: social sciences portal http://sidgrid.ci.uchicago.edu

Outline
1 The RPBS Platform
2 The Mobyle Project
3 RPBS Mobyle Portal
    Architecture
    3D bioinformatics resource
    Programs XML
    Datatypes
    Java Applets
    Usage

Architecture
Mobyle RPBS server: http://mobyle.rpbs.univ-paris-diderot.fr

3D bioinformatics resource
Drug
    2D/3D, ADMETox, LigandSearch, various tools (OpenBabel, DeSalt, JME, LogP)
Sequence
    Alignment (ProbCons, clustalw), sequence formatter (squizz), EMBOSS, Phylip, ...
Structure
    Relevant and RPBS-specific tools for protein structure analysis (Stride, ASA, H-Bonds, PCE-pKa, PCE-pot, TEF, ...), editing (side-chain substitution, hydrogen addition, ...), prediction (HCA, PSIPRED, SSpro, PEP-FOLD, MIR), quality assessment (QMean), ...
MobyleNet
    Homology modeling pipeline

Datatypes
Parameter datatypes are the core of the Mobyle (MobyleNet) pipelining capability.
MobyleNet uses a common scheme for typing arguments. See: http://mobylenet.rpbs.univ-paris-diderot.fr/doc/types.html

Class        SuperClass      DataFormat
Sequence     -               FASTA, CLUSTAL, PIR, GDE, EMBL, GENBANK, SWISSPROT, PIR_3D
Alignment    -
Structure    AbstractText    PDB, xyz, Mol2, smiles, sdf...

There is no relevant converter for structural data; each parameter needs an accurate <DataFormat>.
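To make the role of datatypes in pipelining concrete, the following Python sketch checks whether the output of one program could feed an input parameter of another. It is a simplified model, not the MobyleNet typing scheme itself: the `Port` type, the `can_chain` function and the `CONVERTIBLE_CLASSES` set are hypothetical, and the assumption that sequence and alignment formats can be converted automatically is inferred only from the slide's remark that no converter exists for structural data.

```python
from typing import NamedTuple, Tuple


class Port(NamedTuple):
    """One typed end of a pipe: a data class plus the formats it handles."""
    data_class: str
    formats: Tuple[str, ...]


# Assumption: classes for which an automatic format converter is available.
# Structure is deliberately absent, following the slide's remark.
CONVERTIBLE_CLASSES = {"Sequence", "Alignment"}


def can_chain(output: Port, input_: Port) -> bool:
    """Return True if `output` can be piped into `input_` in this model."""
    if output.data_class != input_.data_class:
        return False
    if set(output.formats) & set(input_.formats):
        return True  # a shared format, no conversion needed
    # No shared format: chaining works only if a converter can bridge the gap.
    return output.data_class in CONVERTIBLE_CLASSES


fasta_out = Port("Sequence", ("FASTA",))
embl_in = Port("Sequence", ("EMBL",))
pdb_out = Port("Structure", ("PDB",))
mol2_in = Port("Structure", ("Mol2",))

print(can_chain(fasta_out, embl_in))  # sequence formats assumed convertible
print(can_chain(pdb_out, mol2_in))    # structural data needs an exact format
```

In this model a structural parameter that declares the wrong format simply cannot be chained, which is why each one needs an accurate format declaration.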


Java Applets
Jmol applet example:

<interface>
  <table xmlns="http://www.w3.org/1999/xhtml" width="100%">
    <tr>
      <td width="50%">
        <applet code="JmolApplet" archive="/portal/applets/jmol/JmolApplet.jar" width="100%" height="450px">
          <param name="progressbar" value="true"/>
          <param name="load" value="$resultfile"/>
        </applet>
      </td>
      <td width="50%">
        <object xmlns="http://www.w3.org/1999/xhtml" type="text/plain" data="$resultfile" height="250px"/>
      </td>
    </tr>
  </table>
</interface>
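The interface fragment refers to the job output through a `resultfile` placeholder. The Python sketch below shows one way such a placeholder could be filled in when a results page is rendered. How Mobyle actually performs this substitution is not stated on the slide; the use of `string.Template`, the function `render_interface` and the example file name are assumptions for illustration.

```python
from html import escape
from string import Template

# Shortened copy of the applet markup, with the placeholder written as $resultfile.
INTERFACE_TEMPLATE = Template(
    '<applet code="JmolApplet" archive="/portal/applets/jmol/JmolApplet.jar" '
    'width="100%" height="450px">'
    '<param name="progressbar" value="true"/>'
    '<param name="load" value="$resultfile"/>'
    '</applet>'
)


def render_interface(result_url):
    """Fill the placeholder with an attribute-safe result URL (hypothetical helper)."""
    # Escape quotes and angle brackets so the value stays inside the attribute.
    return INTERFACE_TEMPLATE.substitute(resultfile=escape(result_url, quote=True))


print(render_interface("model.pdb"))
```

Escaping the value before substitution keeps a file name containing quotes from breaking out of the `value` attribute.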


Usage
Jobs location (2009/09 to 2009/11) (map)
Stats
    Around 20,000 jobs launched since September 2008, from more than 5,000 different locations
    More than 5,000 jobs since September 2009
    France represents only 25% of the jobs

References
C. Letondal. A Web interface generator for molecular biology programs in Unix. Bioinformatics, 17:73-82, Jan 2001.
C. Alland, F. Moreews, D. Boens, M. Carpentier, S. Chiusa, M. Lonquety, N. Renault, Y. Wong, H. Cantalloube, J. Chomilier, J. Hochez, J. Pothier, B. O. Villoutreix, J. F. Zagury, and P. Tufféry. RPBS: a web resource for structural bioinformatics. Nucleic Acids Res., 33:W44-49, Jul 2005.
B. Néron, H. Ménager, C. Maufrais, N. Joly, J. Maupetit, S. Letort, S. Carrere, P. Tufféry, and C. Letondal. Mobyle: a new full web bioinformatics framework. Bioinformatics, 25:3005-3011, Nov 2009.
J. Maupetit, P. Derreumaux, and P. Tufféry. PEP-FOLD: an online resource for de novo peptide structure prediction. Nucleic Acids Res., 37:W498-503, Jul 2009.
O. Sperandio, M. Petitjean, and P. Tufféry. wwLigCSRre: a 3D ligand-based server for hit identification and optimization. Nucleic Acids Res., 37:W504-509, Jul 2009.