Running a Bioinformatics Help Desk. Solved and Unsolved Problems




Running a Bioinformatics Help Desk: from drawing colorful plasmid maps to working with HiSeq data
Solved and Unsolved Problems
Hans-Rudolf Hotz (hrh@fmi.ch)
Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
2012/07/16

Friedrich Miescher Institute
- part of the Novartis Research Foundation
- affiliated institute of Basel University
- 316 employees (incl. 96 PhD students, 95 postdocs)
Research areas:
- Epigenetics (7 research groups)
- Growth Control (8 research groups)
- Neurobiology (8 research groups)
Technology Platforms:
- Computational Biology
- Cell Sorting
- Imaging and Microscopy
- C. elegans
- Functional Genomics
- Histology
- Mass Spectrometry
- Protein Structure

Computational Biology / Bioinformatics
- member of the Swiss Institute of Bioinformatics
- 3 core-funded and 2 third-party-funded FTE
- many interactions with Functional Genomics
- hardware is maintained by IT
- providing support for ~250 scientists
- all services are free
- collaborations and papers
- helpdesk

Bioinformatics Helpdesk
providing support for:
the average lab scientist, who wants to:
- draw plasmids
- do BLAST searches
- use Excel
the modern lab scientist, who wants to:
- analyze NGS data
- work genome-wide
- write (Perl) scripts
...and how to bridge the gap?

the average lab scientist struggles with:
- drawing plasmids
- doing BLAST searches
- using Excel

drawing plasmids
the actual problem: there is no good desktop bioinformatics package
our situation:
package A:
- 10 perpetual licenses bought in 2006
- Windows only
- stuck on version X (does not run on Windows 7)
package B:
- 20 perpetual licenses bought in 2008
- 3 years of support and upgrades
- stuck on version Y
- Windows/Mac/Linux
both packages are ridiculously expensive

drawing plasmids: open source/free alternatives we have been looking at:
- GENtle: http://gentle.magnusmanske.de/
- Serial Cloner: http://serialbasics.free.fr/serial_cloner.html
- pdraw32: http://www.acaclone.com/
- BioEdit: http://www.mbio.ncsu.edu/bioedit/bioedit.html
- GeneCoder: http://www.algosome.com
- Workbench: http://www.ncbi.nlm.nih.gov/tools/gbench/
- Ape: http://biologylabs.utah.edu/jorgensen/wayned/ape/
- UGene: http://ugene.unipro.ru/
Does anybody have experience with these or other open source/free packages?

drawing plasmids: what about EMBOSS?
(we offer most EMBOSS tools in our Galaxy server)
The tools cirdna and lindna produce reasonable maps of DNA constructs...
...but the data needs to be in 'cirp' and 'linp' format, respectively.
How do I transform a GenBank file to 'cirp' format?
we have no satisfying solution
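As a partial illustration of the GenBank conversion question above, here is a minimal Python sketch of the first half of such a conversion: pulling feature names and coordinates out of a GenBank feature table. It is not a solution to the 'cirp' problem; writing the cirdna input itself is deliberately left out, because its exact layout should be taken from the EMBOSS documentation rather than guessed. The function name `simple_features`, the `Feature` tuple and the sample record `pDEMO` are hypothetical, and only simple `start..end` and `complement(start..end)` locations are handled.

```python
import re
from typing import List, NamedTuple


class Feature(NamedTuple):
    kind: str    # feature key, e.g. "CDS"
    start: int   # 1-based start, as written in the GenBank file
    end: int     # 1-based end, inclusive
    strand: str  # "+" or "-"


# Feature lines start with 5 spaces, then the key, then the location.
# Qualifier lines (21 spaces, then "/...") do not match this pattern.
_FEATURE = re.compile(r"^ {5}(\S+)\s+(complement\()?<?(\d+)\.\.>?(\d+)\)?\s*$")


def simple_features(genbank_text: str) -> List[Feature]:
    """Collect features with simple locations; joins and other forms are skipped."""
    features = []
    in_table = False
    for line in genbank_text.splitlines():
        if line.startswith("FEATURES"):
            in_table = True
            continue
        if line.startswith("ORIGIN"):
            break  # sequence section begins, feature table is over
        if not in_table:
            continue
        match = _FEATURE.match(line)
        if match:
            kind, comp, start, end = match.groups()
            features.append(Feature(kind, int(start), int(end), "-" if comp else "+"))
    return features


# Made-up record, for illustration only.
sample = """LOCUS       pDEMO        1000 bp    DNA     circular
FEATURES             Location/Qualifiers
     source          1..1000
     CDS             100..400
                     /gene="bla"
     rep_origin      complement(600..800)
ORIGIN
        1 acgt
//
"""

features = simple_features(sample)
for f in features:
    print(f.kind, f.start, f.end, f.strand)
```

The resulting list of coordinates is the information a map-drawing tool needs; turning it into the input expected by cirdna or lindna remains the open step.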

doing BLAST searches
the actual problem: people are struggling to use web resources
our approach:
- running training courses
- internal wiki pages / FAQ

using Excel
the actual problems:
- people have no understanding of statistics
- Excel: no help provided
our approach:
- running training courses
- promoting the use of R

the modern lab scientist struggles with:
- writing (Perl) scripts
- analyzing NGS data
- working genome-wide
simple solution: running introductory and advanced training courses in R

running R training courses
- R is a decent scripting language
- we can teach them statistics on the side
- they can start using Bioconductor
we are currently re-implementing our (Perl-based) NGS pipeline in a new Bioconductor package: QuasR
...but one problem remains: people want to display their data in a genome browser

genome browsers
we use a combination of:
- R/Bioconductor: GenomeGraphs, new package Gviz
- web resources:
  - Ensembl: too slow
  - UCSC: S. pombe is missing (we don't have the resources to run a local mirror)
- local on desktop: IGV and IGB
- Galaxy Trackster
Does anybody have a perfect solution?

Bridging the Gap
the average lab scientist
?
the modern lab scientist
http://galaxyproject.org/

[Figure: Galaxy interface. Labels: History, Tools, GUI, Display]

why are we using Galaxy
- open source
- we can modify the tools
- we can add our own tools (we offer our own NGS pipeline tools, and have disabled the provided Galaxy NGS tools/wrappers)
- the Galaxy community is big and part of a wider community: GenomeSpace, GMOD
- it is simple to install and maintain
- you can adjust the set-up according to your needs
- it is easy to track what people are doing

we are using Galaxy for:
- microarray analysis (wrapped R/Bioconductor scripts)
- NGS analysis (wrapped Perl scripts)
- EMBOSS
- file format conversion
- genomic interval operations
- providing a GUI for helpdesk scripts
Galaxy is a stepping stone
- people learn how to build workflows instead of pressing red buttons
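For readers unfamiliar with the term, a "genomic interval operation" as mentioned above can be as simple as intersecting two lists of regions. The following Python sketch is only an illustration of the idea, not the code Galaxy runs; the function name `intersect` and the example coordinates are made up, and intervals are assumed to be 0-based and half-open, as in BED files.

```python
from typing import List, Tuple

# (chromosome, start, end), 0-based and half-open as in BED (an assumption)
Interval = Tuple[str, int, int]


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the overlapping part of every pair of intervals from a and b."""
    hits = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            if start < end:  # empty or negative span means no overlap
                hits.append((chrom_a, start, end))
    return sorted(hits)


# Made-up example data.
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
genes = [("chr1", 150, 550), ("chr2", 90, 120)]

overlaps = intersect(peaks, genes)
print(overlaps)
```

The nested loop compares every pair and is fine for small lists; genome-wide data sets call for sorted or indexed approaches, which is one reason to rely on established tools for this.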

Galaxy does not solve all your problems
- there are no plasmid drawing tools
- the built-in genome browser (Trackster) is beta
- it does not replace the bioinformatician: do not offer tools you don't understand
- it does not replace the sys-admin: if your tool does not run on the command line, it won't run in Galaxy
- big data needs big toys
- it is simple to install and maintain...but it does need maintenance!
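The command-line requirement above can be made concrete with a small sketch. This hypothetical helpdesk script (the FASTA-counting task and the names `count_fasta_records` and `main` are invented for illustration) takes its input and output as command-line arguments, which is the kind of interface a script needs before it can be wrapped for Galaxy. The Galaxy tool definition itself is not shown.

```python
import argparse
import sys
from typing import Iterable, List, Optional


def count_fasta_records(lines: Iterable[str]) -> int:
    """Count FASTA records by counting header lines (those starting with '>')."""
    return sum(1 for line in lines if line.startswith(">"))


def main(argv: Optional[List[str]] = None) -> int:
    parser = argparse.ArgumentParser(description="Count records in a FASTA file.")
    parser.add_argument("fasta", help="path to the input FASTA file")
    parser.add_argument("-o", "--output", help="write the count here instead of stdout")
    args = parser.parse_args(argv)

    with open(args.fasta) as handle:
        n = count_fasta_records(handle)

    if args.output:
        with open(args.output, "w") as out:
            out.write(f"{n}\n")
    else:
        print(n)
    return 0


# Run as a command-line tool only when arguments are given, so the file
# can also be imported or pasted into an interactive session.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main())
```

Saved under a name of your choice, it would be called as `python count_fasta.py input.fa -o count.txt` (file names are placeholders). Everything the script needs arrives through its arguments, with no prompts and no hard-coded paths.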

Acknowledgment
Michael Stadler, Anita Lerch, Tim Roloff, Lukas Burger, Dimos Gaidatzis, Stefan Grzybek