Mobyle @ RPBS
A web portal for structural bioinformatics and chemoinformatics
  Julien Maupetit, Pierre Tufféry
  RPBS, Université Paris Diderot - Paris 7
  Bâtiment Lamarck, 36 rue Hélène Brion, 75013 Paris, France
  2009/11/30
1 The RPBS Platform
    RPBS Project
    Location & computing resources
2 The Mobyle Project
3 RPBS Mobyle Portal
RPBS Project
  Ressource Parisienne en Bioinformatique Structurale
  - Collaborative research in the field of structural bioinformatics
  - Services related to protein structure
  Research groups involved
    1 MTi - INSERM U973 - Université Paris 7
    2 DSIMB - INSERM U665 - Université Paris 7
    3 ABI - Université Paris 6
    4 IMPMC - CNRS UMR 7590 - Université Paris 6 and 7
    5 IBBMC-MIP - CNRS UMR 8619 - Université Paris-Sud 11
    6 CNAM-STIC - Conservatoire National des Arts et Métiers, Paris
    7 MAP5 - CNRS UMR 8145 - Université Paris Descartes
  Who?
  - Project coordination: Dr Pierre Tufféry
  - Technical director: Dr Julien Maupetit
Location & computing resources
  Where?
  - Paris Rive Gauche: Lamarck building, 5th floor
  What?
  - Programs: more than 200 online programs available
  - Storage: 8 TB + 15 TB (2010)
  - Computing: 84 CPU cores (66 64-bit, 18 32-bit) + 160 CPU cores (2010)
  - Services: structural bioinformatics, 3D printing (new in 2010!)
1 The RPBS Platform
2 The Mobyle Project
    Motivation
    Participative design
    Functionalities
    Architecture overview
    XML program description
    Quick tour: workspace
    Quick tour: form submission
    Quick tour: results
    Incoming features
    Related projects
Motivation
  Key problem
  - Ease access to bio- and chemoinformatics tools for scientists.
  - Bioinformatics tools are often command-line tools.
  - Command-line tools have a steep learning curve.
  - A web interface is something biologists are more familiar with.
  - Developing custom CGI scripts is both time-consuming and error-prone.
  Based on former projects
  - PISE system (1999, C. Letondal) and P-Serveur (2004, P. Tufféry et al.)
Participative design
  - User interviews, participatory workshops
  - Need for a stable and integrated set of tools
  - Synthetic view of results and analysis
  - Re-usability features
  - User-defined and ready-to-use "pipelines"
  - Skepticism towards complex products
Functionalities
  Users
  - Service search/discovery
  - Service usage and documentation
  - Data and services integration
  - Workspace navigation
  Admins
  - User assistance
  - Job management, traceability
  Developers
  - Easy integration of new tools
Architecture overview
  (diagram; labeled components)
  - Web server: Web Portal (user interface)
  - Network: remote access
  - Core server: job management, user management, program publication, administration tools
  - Execution environment: submission system (e.g., SGE, Torque), bio-programs (e.g., BLAST, EMBOSS, Phylip)
  - Jobs XML
  - User accounts XML
  - Program definitions XML
  - Bio-banks (e.g., SWISSPROT, PDB)
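As a rough illustration of the hand-off from the core server to the execution environment, the sketch below builds a batch submission command. It assumes a `qsub`-style submission system (as in SGE or Torque) with the `-N`, `-o` and `-e` options. The function name `build_submit_command`, the job directory layout and the file names are hypothetical and are not taken from Mobyle.

```python
from pathlib import PurePosixPath


def build_submit_command(job_id, job_dir, script="run.sh"):
    """Return the argv list for submitting one job to a qsub-style batch system.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    job_dir = PurePosixPath(job_dir)
    return [
        "qsub",
        "-N", f"job_{job_id}",              # job name shown by the scheduler
        "-o", str(job_dir / "stdout.txt"),  # file receiving the job's stdout
        "-e", str(job_dir / "stderr.txt"),  # file receiving the job's stderr
        str(job_dir / script),              # shell script wrapping the program
    ]


# Hypothetical job identifier and directory, for illustration only.
cmd = build_submit_command("A123", "/data/jobs/blast2/A123")
print(" ".join(cmd))
```

Returning a list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting.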
XML program description
  Information in Mobyle is stored in XML format (program definitions, job status).
  What is a Mobyle program description?
  - a network service definition
  - a program wrapper
  - a UI (user interface) definition
  - a "semantic" description
XML program description

    <?xml version="1.0" encoding="iso-8859-1"?>
    ...
    <program>
      <head>
        <name>blast2</name>
        <version>2.2.17</version>
        <doc>
          <title>BLAST2</title>
          <description>
            <text lang="en">NCBI BLAST, with gaps</text>
          </description>
          <authors>Altschul, Madden, Schaeffer, Zhang, Miller, Lipman</authors>
          <reference>Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaeffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res. 25:3389-3402.</reference>
          <doclink>http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=handbook.chapter.ch16</doclink>
          <doclink>http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/tut1.html</doclink>
        </doc>
        <category>database:search:homology</category>
        <env name="BLASTDB">/path/to/db</env>
        <env name="BLASTMAT">/path/to/mat</env>
      </head>
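Because the header above is plain XML, a portal can read it with any XML parser to build a service listing. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`. The helper name `read_head` and the trimmed, closed copy of the header are assumptions made for the example; this is not Mobyle's own parsing code.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the header above, closed so that it is well-formed.
PROGRAM_XML = """
<program>
  <head>
    <name>blast2</name>
    <version>2.2.17</version>
    <doc>
      <title>BLAST2</title>
    </doc>
    <category>database:search:homology</category>
    <env name="BLASTDB">/path/to/db</env>
    <env name="BLASTMAT">/path/to/mat</env>
  </head>
</program>
"""


def read_head(xml_text):
    """Collect the identifying fields and environment variables of a program."""
    head = ET.fromstring(xml_text).find("head")
    return {
        "name": head.findtext("name").strip(),
        "version": head.findtext("version").strip(),
        "title": head.findtext("doc/title").strip(),
        # "database:search:homology" becomes ["database", "search", "homology"]
        "category": head.findtext("category").strip().split(":"),
        # One entry per <env> element, keyed by its name attribute.
        "env": {e.get("name"): e.text.strip() for e in head.findall("env")},
    }


info = read_head(PROGRAM_XML)
print(info["name"], info["version"], info["env"]["BLASTDB"])
```

Splitting the colon-separated category gives a path that could drive a hierarchical service menu.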
XML program description

    <paragraph>
      <name>query</name>
      <prompt lang="en">Query Sequence</prompt>
      <argpos>4</argpos>
      <parameters>
        <parameter ismandatory="1" issimple="1" ismaininput="1">
          <name>query_seq</name>
          <prompt lang="en">Query (-i)</prompt>
          <type>
            <datatype>
              <class>Sequence</class>
            </datatype>
            <accepteddataformats>
              <dataformat>fasta</dataformat>
            </accepteddataformats>
            <card>1,n</card>
          </type>
          <format>
            <code proglang="perl">" -i \$query"</code>
            <code proglang="python">" -i " + str(query_seq)</code>
          </format>
          <comment>
            <text lang="en">Read (first, query) sequence or set from file</text>
          </comment>
        </parameter>
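The `python` expression inside `<format>` suggests how a parameter value becomes part of the command line. The sketch below evaluates that expression with the value bound to the parameter's name. This binding rule, the helper name `command_fragment` and the use of `eval` are assumptions made for illustration; the slides do not describe how Mobyle evaluates these expressions. `blastall` is used only as an example executable name.

```python
import xml.etree.ElementTree as ET

# Reduced copy of the parameter above, keeping only what the sketch needs.
PARAMETER_XML = """
<parameter ismandatory="1" issimple="1" ismaininput="1">
  <name>query_seq</name>
  <format>
    <code proglang="python">" -i " + str(query_seq)</code>
  </format>
</parameter>
"""


def command_fragment(parameter_xml, value):
    """Evaluate the parameter's python <code> expression for one value."""
    param = ET.fromstring(parameter_xml)
    name = param.findtext("name").strip()
    expression = param.find("format/code[@proglang='python']").text.strip()
    # Assumption: the expression sees the value under the parameter's name.
    # Builtins are reduced to str(); eval is only acceptable here because
    # program descriptions are written by trusted administrators.
    return eval(expression, {"__builtins__": {"str": str}}, {name: value})


fragment = command_fragment(PARAMETER_XML, "query.fasta")
print("blastall" + fragment)
```

Keeping the formatting rule in the description, rather than in portal code, is what lets a new tool be wrapped without writing a dedicated CGI script.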
Quick tour: workspace
Quick tour: form submission
Quick tour: results
Incoming features
  - Visualization plugins
  - Improved program classification and search
  - Mobyle as a web services client (BioMOBY)
  - Workflows: editing and execution from the portal
  - Program "bookmarks"
  - Tutorials
  - "Dynamic" forms
  - ...
Related projects
  - LIPM: PlayMOBY / Mobyle
  - NIAID: online XML program description generator, workflow execution
  - SDSC: New Generation Biology Workbench
  - NCSU's SNAP: workbench management tool for evolutionary population genetic analysis
  - SIDGrid Portal: social sciences portal, http://sidgrid.ci.uchicago.edu
1 The RPBS Platform
2 The Mobyle Project
3 RPBS Mobyle Portal
    Architecture
    3D bioinformatics resource
    Programs XML
    Datatypes
    Java Applets
    Usage
Architecture Mobyle RPBS server http://mobyle.rpbs.univ-paris-diderot.fr
3D bioinformatics resource
  Drug
  - 2D/3D, ADMETox, LigandSearch, various tools (OpenBabel, DeSalt, JME, LogP)
  Sequence
  - Alignment (ProbCons, ClustalW), sequence formatter (squizz), EMBOSS, Phylip, ...
  Structure
  - Relevant and RPBS-specific tools for protein structure analysis (Stride, ASA, H-Bonds, PCE-pKa, PCE-pot, TEF, ...), editing (side-chain substitution, adding hydrogens, ...), prediction (HCA, PSIPRED, SSpro, PEP-FOLD, MIR), quality assessment (QMean), ...
  MobyleNet
  - Homology modeling pipeline
Datatypes
  - Parameter datatypes are the core of the Mobyle (MobyleNet) pipelining capability.
  - MobyleNet uses a common scheme for typing arguments. See: http://mobylenet.rpbs.univ-paris-diderot.fr/doc/types.html

  Class      SuperClass    DataFormat
  Sequence   -             FASTA, CLUSTAL, PIR, GDE, EMBL, GENBANK, SWISSPROT, PIR_3D
  Alignment  -
  Structure  AbstractText  PDB, xyz, Mol2, smiles, sdf...

  - There is no relevant converter for structural data, so each parameter needs an accurate <DataFormat>.
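To make the role of typing in pipelining concrete, here is a minimal sketch of a compatibility check between one program's output and another program's input. The matching rule (same class and superclass, and an output format listed among the accepted formats, with no conversion) is an assumption for illustration, as are the names `Output`, `Input` and `can_pipe`. Mobyle's actual logic may differ, for instance by converting between sequence formats.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Output:
    """Typed result produced by a program."""
    klass: str
    data_format: str
    superclass: Optional[str] = None


@dataclass(frozen=True)
class Input:
    """Typed parameter expected by a program."""
    klass: str
    accepted_formats: Tuple[str, ...]
    superclass: Optional[str] = None


def can_pipe(output, target):
    """True if `output` can feed `target` without any format conversion."""
    same_class = (output.klass, output.superclass) == (target.klass, target.superclass)
    return same_class and output.data_format in target.accepted_formats


# Hypothetical parameters built from the classes and formats in the table.
pdb_model = Output("Structure", "PDB", "AbstractText")
mol2_ligand = Output("Structure", "Mol2", "AbstractText")
structure_input = Input("Structure", ("PDB",), "AbstractText")
sequence_input = Input("Sequence", ("FASTA",))

print(can_pipe(pdb_model, structure_input))
print(can_pipe(mol2_ligand, structure_input))
print(can_pipe(pdb_model, sequence_input))
```

Under this rule the Mol2 output is rejected by a PDB-only input, which is why an accurate data format matters when no converter is available.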
Java Applets
  Jmol applet example:

    <interface>
      <table xmlns="http://www.w3.org/1999/xhtml" width="100%">
        <tr>
          <td width="50%">
            <applet code="JmolApplet" archive="/portal/applets/jmol/JmolApplet.jar" width="100%" height="450px">
              <param name="progressbar" value="true"/>
              <param name="load" value="$resultfile"/>
            </applet>
          </td>
          <td width="50%">
            <object xmlns="http://www.w3.org/1999/xhtml" type="text/plain" data="$resultfile" height="250px"/>
          </td>
        </tr>
      </table>
    </interface>
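The `$resultfile` placeholder in the interface above has to be replaced by the location of the actual result before the page is served. A minimal sketch of such a substitution, using Python's standard `string.Template`, is shown below. Whether Mobyle performs the substitution this way is an assumption; the function name `render_interface` and the result path are hypothetical.

```python
from string import Template

# Two lines taken from the interface above; $resultfile is the placeholder.
INTERFACE_SNIPPET = Template(
    '<param name="load" value="$resultfile"/>\n'
    '<object type="text/plain" data="$resultfile" height="250px"/>'
)


def render_interface(result_url):
    """Fill every $resultfile placeholder with the given result location."""
    return INTERFACE_SNIPPET.substitute(resultfile=result_url)


# Hypothetical result path, for illustration only.
html = render_interface("results/model.pdb")
print(html)
```

`substitute` raises `KeyError` when a placeholder has no value, so a missing result is caught before a broken page is produced.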
Usage
  Job locations (2009/09 to 2009/11)
  Stats
  - Around 20,000 jobs launched since September 2008, from more than 5,000 different locations
  - More than 5,000 jobs since September 2009
  - France represents only 25% of the jobs
C. Letondal. A Web interface generator for molecular biology programs in Unix. Bioinformatics, 17:73-82, Jan 2001.
C. Alland, F. Moreews, D. Boens, M. Carpentier, S. Chiusa, M. Lonquety, N. Renault, Y. Wong, H. Cantalloube, J. Chomilier, J. Hochez, J. Pothier, B. O. Villoutreix, J. F. Zagury, and P. Tufféry. RPBS: a web resource for structural bioinformatics. Nucleic Acids Res., 33:W44-49, Jul 2005.
B. Néron, H. Ménager, C. Maufrais, N. Joly, J. Maupetit, S. Letort, S. Carrere, P. Tufféry, and C. Letondal. Mobyle: a new full web bioinformatics framework. Bioinformatics, 25:3005-3011, Nov 2009.
J. Maupetit, P. Derreumaux, and P. Tufféry. PEP-FOLD: an online resource for de novo peptide structure prediction. Nucleic Acids Res., 37:W498-503, Jul 2009.
O. Sperandio, M. Petitjean, and P. Tufféry. wwLigCSRre: a 3D ligand-based server for hit identification and optimization. Nucleic Acids Res., 37:W504-509, Jul 2009.