A data management framework for the Fungal Tree of Life
1 Web Accessible Sequence Analysis for Biological Inference: a data management framework for the Fungal Tree of Life. Reference: Kauff F, Cox CJ, Lutzoni F. WASABI: An automated sequence processing system for multi-gene phylogenies. Syst. Biol. 56(3):
2 Source: Tree Thinking
3 Cladistics. From: Assembling the Tree of Life, Oxford University Press, 2004
5 ATOL: Assembling the Tree of Life. "... Along with comparative data on morphology, fossils, development, behavior, and interactions of all forms of life on earth, these new data streams make even more critical the need for an organizing framework for information retrieval, analysis, and prediction.... Currently, single investigators or small teams of researchers are studying the evolutionary pathways of heredity usually concentrating on phylogenetic groups of modest size and lower taxonomic rank. Assembly of a framework phylogeny, or Tree of Life, for all 1.7 million described species requires a greatly magnified effort by large teams working across institutions and disciplines.... Teams of investigators also will be supported for projects in data acquisition, analysis, algorithm development and dissemination in computational phylogenetics and phyloinformatics." (NSF website at
6 AFTOL: the Fungal Tree of Life. Part of the NSF-financed ATOL project. Cooperation: Clark University, Duke University, Oregon State University, University of Minnesota. Goals: sequencing of 8 genetic loci for a total of 1500 taxa; TEM / ultrastructural data of selected specimens.
7 AFTOL Bioinformatics: Web Accessible Sequence Analysis for Biological Inference
- Central storage for all project data
- Participant and public interface to the project data
- Automated analyses of raw sequence data: Phred, Phrap, local BLAST, ...
- Automated analyses of gene sequence data: alignment, test for topological congruence
- Provide conflict-free datasets of single and combined loci for further analysis (e.g. CIPRES) and individual download
- Interface to GenBank
[Diagram labels: Taxon information; Voucher & sample plate submission; WASABI; GenBank; DNA, analyses, & results]
8 WASABI: components [diagram labels]: PostgreSQL database; Zope Application Server; User (Internet)
9 WASABI: components [diagram labels]: Duke Sequencing lab; Phred; Blast; Phrap; Blast; PostgreSQL database; Zope Application Server; User (Internet)
10 WASABI: components [diagram labels]: Blast database; Duke Sequencing lab; Phred; Blast; Verification; Phrap; Blast; PostgreSQL database; Zope Application Server; User (Internet)
11 WASABI: components [diagram labels]: Blast database; GenBank; Duke Sequencing lab; Phred; Blast; Phrap; Blast; PostgreSQL database; Zope Application Server; Alignment; Congruence; Phylogenetic analysis (MrBayes, Paup, p4); User (Internet)
12 [Diagram labels]: Blast database; GenBank; EUtils; Server; Sequencing facility; MOA; Phred; Blast; Phrap; Blast; PostgreSQL database; Zope Application Server; alignment; congruence (compat & tct); phylogenetic analyses (MrBayes, Paup, p4); Users (Internet); Python
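Slide 12 lists GenBank, EUtils, and Python among the components, but the transcript does not show how WASABI talks to GenBank. As a rough illustration only, the sketch below builds a request URL for NCBI's public E-utilities `efetch` endpoint. The function name and the accession numbers are placeholders, and nothing here is taken from the WASABI code.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities endpoint for fetching records.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accessions):
    """Build an efetch URL that requests FASTA text for the given accessions.

    `accessions` is a list of GenBank accession strings. The function only
    builds the URL; it does not contact the server.
    """
    params = {
        "db": "nuccore",              # nucleotide database
        "id": ",".join(accessions),   # comma-separated identifiers
        "rettype": "fasta",
        "retmode": "text",
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder accessions, for illustration only.
    print(efetch_fasta_url(["AB000001", "AB000002"]))
```

The resulting URL can be passed to any HTTP client. Keeping URL construction separate from the network call makes the request easy to inspect and test.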
13 Data analysis [diagram labels]: New (LSU, SSU, RPB1); AFTOL DB (LSU core, SSU core, RPB1 core); Alignment (LSU with LSU core, SSU with SSU core, RPB1 with RPB1 core)
14 Alignment [figure: multiple sequence alignment; sequence rows omitted. Row labels: atrich_hirs, atrype_unkn, Auric_auri, Aurip_aure, Auris_vulg, Auxar_zuff, averpa_coni, axanth_cons, axylar_acut, axylar_hypo, Backu_circ, Backu_cten, BAEPLAx, Banke_fuli, Basid_hapt, Basid_rana, Benja_poit, Bimur_nova, Blake_tris, followed by one unaligned sequence labelled Esto_nia]
15 Alignment [same figure as slide 14, with regions annotated: ambiguous, intron, indel]
16 Data analysis [diagram labels]: New (LSU, SSU, RPB1); AFTOL DB (LSU core, SSU core, RPB1 core); Alignment (LSU with LSU core, SSU with SSU core, RPB1 with RPB1 core); result: new LSU core, new SSU core, new RPB1 core
17 Data set combination: data set 1 + data set 2 -> combined data set -> phylogenetic estimate
18 Data set combination: data set 1, data set 2 -> test for congruence; yes: combine; no: eliminate conflicting
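The combination step on slides 17 and 18 can be sketched in Python. This is a minimal illustration, not WASABI's implementation. The congruence test itself is not implemented here: the set of conflicting taxa is taken as an input. It also assumes that what gets eliminated is whole taxa and that a missing locus is padded with `?`; neither detail is stated on the slides.

```python
def combine_alignments(alignments, conflicting=frozenset()):
    """Concatenate per-locus alignments into one combined data set.

    `alignments` is a list of dicts mapping taxon name -> aligned sequence;
    all sequences within one dict are assumed to have the same length.
    Taxa listed in `conflicting` are dropped (assumed pruning rule).
    Taxa that lack a locus are padded with '?' for that locus.
    """
    taxa = sorted(set().union(*alignments) - set(conflicting))
    combined = {taxon: "" for taxon in taxa}
    for alignment in alignments:
        # Alignment length of this locus, taken from any one sequence.
        length = len(next(iter(alignment.values())))
        for taxon in taxa:
            combined[taxon] += alignment.get(taxon, "?" * length)
    return combined


if __name__ == "__main__":
    # Toy data; taxon names and sequences are made up.
    lsu = {"taxA": "ACGT", "taxB": "ACGA", "taxC": "AC-T"}
    ssu = {"taxA": "GG", "taxB": "GC"}
    print(combine_alignments([lsu, ssu], conflicting={"taxB"}))
```

In this toy run, `taxB` is removed as conflicting and `taxC`, which has no SSU sequence, is padded with `?` characters in the combined matrix.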
19 Data analysis [diagram labels]: New sequences (LSU, SSU, RPB1); AFTOL DB (LSU core, SSU core, RPB1 core); Alignment (LSU with LSU core, SSU with SSU core, RPB1 with RPB1 core); Test for topological congruence; Taxon pruning; Multiprocessor Cluster
20 Data analysis [same diagram as slide 19, plus combined data sets]: LSU + SSU; SSU + RPB1; LSU + RPB1; LSU + SSU + RPB1
21 Data analysis [diagram labels]: combined data sets (LSU + SSU; SSU + RPB1; LSU + RPB1; LSU + SSU + RPB1); very sophisticated phylogenetic analysis; Multiprocessor Cluster
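The combined data sets on slide 20 are every combination of two or more of the three loci. A short Python sketch of that enumeration follows; the function name is hypothetical and the transcript does not show how WASABI builds these combinations.

```python
from itertools import combinations


def locus_combinations(loci):
    """Return every combination of two or more loci, smallest sets first."""
    combos = []
    for size in range(2, len(loci) + 1):
        combos.extend(combinations(loci, size))
    return combos


if __name__ == "__main__":
    for combo in locus_combinations(["LSU", "SSU", "RPB1"]):
        print(" + ".join(combo))
```

For three loci this gives three pairs and one triple. For the eight loci targeted by AFTOL the same function would return 247 combinations, which is one reason a multiprocessor cluster is useful here.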
22 Data flow overview [diagram labels]
A. WASABI Database: DNA sequence chromatograms; Single read; Contig; BLAST results; Finalized gene; Core alignments; Deleted; Single locus trees; Combined loci trees
B. WASABI Pipeline: GenBank; Final analysis; Publication; CLUSTALW; PHRED; PHRAP; Local BLAST; WASALIGN; Conflict detection
C. WASABI Data Interface: External editing and visualization (e.g. Sequencher); ZOPE WWW interface; MESQUITE interface; Direct data access, editing, and visualization (future development)
Other labels: Automated sequencer; Mandatory user verification
23 Data flow: automated data processing pipeline [same diagram as slide 22]
24 Provenance in WASABI: keep track of user interactions [same diagram as slide 22, with PHRED, PHRAP, and Local BLAST also shown at the data interface]
25 Provenance in WASABI: keep track of user interactions
- Current implementation gives access only to owners of the data.
- Other data access only by admins (direct SQL).
- Authors are supposed to keep track of their changes.
- WASABI only keeps the most recent version.
- Future data access with third-party software and access by multiple users will need more sophisticated access control.
- Access to different versions of the data and a 'roll-back' feature are desirable.
[Text overlaid on the data flow diagram of slide 22.]
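Slide 25 names version access and a 'roll-back' feature as desirable but not yet available. The sketch below shows one simple way such a feature could look in Python: a record that keeps every version together with the user who made the change. The class and method names are hypothetical and do not describe WASABI's database design.

```python
class VersionedRecord:
    """Keep every version of a value together with the user who wrote it."""

    def __init__(self, value, user):
        # History is a list of (user, value) pairs, oldest first.
        self._history = [(user, value)]

    @property
    def current(self):
        """Most recent value."""
        return self._history[-1][1]

    def update(self, value, user):
        """Store a new version instead of overwriting the old one."""
        self._history.append((user, value))

    def rollback(self, steps=1):
        """Discard the last `steps` versions and return the restored value."""
        if steps < 1 or steps >= len(self._history):
            raise ValueError("cannot roll back past the first version")
        del self._history[-steps:]
        return self.current

    def history(self):
        """Return a copy of the (user, value) history, oldest first."""
        return list(self._history)


if __name__ == "__main__":
    # User names and sequences are made up.
    record = VersionedRecord("ACGT", user="alice")
    record.update("ACGA", user="bob")
    print(record.current)     # value after bob's edit
    print(record.rollback())  # value restored to alice's version
```

Recording the user alongside each version also addresses the slide's point that authors currently have to keep track of their own changes.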
26 Provenance in WASABI: keep track of user interactions [same diagram as slide 22, with markers A (at Final analysis), B (at CLUSTALW), C (at PHRED / PHRAP / Local BLAST), and D (at Automated sequencer)]
27 Tracing back final results to original data [same diagram as slide 26, with the chain]: A (Final analysis) is based on multiple B (Core alignments), each consisting of many C (Finalized gene), each created from many D (Single read, DNA sequence chromatograms)
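The chain on slide 27 (final analysis, core alignments, finalized genes, single reads) is a walk through a "derived from" graph. The Python sketch below illustrates that walk under the assumption that each item records the items it was built from. The item names and the dictionary layout are invented for the example.

```python
def trace_to_sources(item, derived_from):
    """Return the original inputs that `item` ultimately depends on.

    `derived_from` maps an item to the list of items it was built from.
    Items with no recorded parents are treated as original data.
    """
    sources = []
    seen = set()
    stack = [item]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # already visited via another path
        seen.add(node)
        parents = derived_from.get(node, [])
        if parents:
            stack.extend(parents)
        else:
            sources.append(node)
    return sorted(sources)


if __name__ == "__main__":
    # Hypothetical provenance records mirroring the A -> B -> C -> D chain.
    derived_from = {
        "final_analysis": ["core_alignment_LSU", "core_alignment_SSU"],
        "core_alignment_LSU": ["gene_LSU_taxonA"],
        "core_alignment_SSU": ["gene_SSU_taxonA"],
        "gene_LSU_taxonA": ["read_1.ab1", "read_2.ab1"],
        "gene_SSU_taxonA": ["read_3.ab1"],
    }
    print(trace_to_sources("final_analysis", derived_from))
```

The same function works from any intermediate level, for example from a single core alignment back to its reads.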
28 Thanks to: Cymon Cox (Natural History Museum, London); Francois Lutzoni and all lab members in the Duke Biology Department; AFTOL and its participants; NSF (DEB )
More informationA demonstration of the use of Datagrid testbed and services for the biomedical community
A demonstration of the use of Datagrid testbed and services for the biomedical community Biomedical applications work package V. Breton, Y Legré (CNRS/IN2P3) R. Météry (CS) Credits : C. Blanchet, T. Contamine,
More informationData for phylogenetic analysis
Data for phylogenetic analysis The data that are used to estimate the phylogeny of a set of tips are the characteristics of those tips. Therefore the success of phylogenetic inference depends in large
More informationicer Bioinformatics Support Fall 2011
icer Bioinformatics Support Fall 2011 John B. Johnston HPC Programmer Institute for Cyber Enabled Research 2011 Michigan State University Board of Trustees. Institute for Cyber Enabled Research (icer)
More informationTeaching Bioinformatics to Undergraduates
Teaching Bioinformatics to Undergraduates http://www.med.nyu.edu/rcr/asm Stuart M. Brown Research Computing, NYU School of Medicine I. What is Bioinformatics? II. Challenges of teaching bioinformatics
More informationVIBE. Visual Integrated Bioinformatics Environment. Enter the Visual Age of Computational Genomics. Whitepaper
VIBE Visual Integrated Bioinformatics Environment Whitepaper Enter the Visual Age of Computational Genomics INCOGEN, Inc. 104 George Perry Williamsburg, VA 23185 www.incogen.com Phone: 757-221-0550 info@incogen.com
More informationWJEC AS Biology Biodiversity & Classification (2.1 All Organisms are related through their Evolutionary History)
Name:.. Set:. Specification Points: WJEC AS Biology Biodiversity & Classification (2.1 All Organisms are related through their Evolutionary History) (a) Biodiversity is the number of different organisms
More informationBIOLOMICS SOFTWARE & SERVICES GENERAL INFORMATION DOCUMENT
BIOLOMICS SOFTWARE & SERVICES GENERAL INFORMATION DOCUMENT BIOAWARE SA NV - VERSION 2.0 - AUGUST 2013 BIOLOMICS SOFTWARE DYNAMIC CREATION AND MODIFICATION OF DATABASES Create simple or complex databases
More informationSample policy of Naturalis Biodiversity Center
Sample policy of Naturalis Biodiversity Center INTRODUCTION Naturalis Biodiversity Center (hereafter Naturalis) has the mission to use its collections in as many ways as possible for the furtherance of
More informationNext generation sequencing (NGS)
Next generation sequencing (NGS) Vijayachitra Modhukur BIIT modhukur@ut.ee 1 Bioinformatics course 11/13/12 Sequencing 2 Bioinformatics course 11/13/12 Microarrays vs NGS Sequences do not need to be known
More informationSoftware review. Pise: Software for building bioinformatics webs
Pise: Software for building bioinformatics webs Keywords: bioinformatics web, Perl, sequence analysis, interface builder Abstract Pise is interface construction software for bioinformatics applications
More informationAn example of bioinformatics application on plant breeding projects in Rijk Zwaan
An example of bioinformatics application on plant breeding projects in Rijk Zwaan Xiangyu Rao 17-08-2012 Introduction of RZ Rijk Zwaan is active worldwide as a vegetable breeding company that focuses on
More information11, Olomouc, 783 71, Czech Republic. Version of record first published: 24 Sep 2012.
This article was downloaded by: [Knihovna Univerzity Palackeho], [Vladan Ondrej] On: 24 September 2012, At: 05:24 Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954
More informationData Registry Workshop Report
Data Registry Workshop Report Background A Joint Working Group on Data Sharing and Archiving (JWG), representing major professional societies that publish ecology, evolution, and organismal biology journals,
More informationEvaluating the Performance of a Successive-Approximations Approach to Parameter Optimization in Maximum-Likelihood Phylogeny Estimation
Evaluating the Performance of a Successive-Approximations Approach to Parameter Optimization in Maximum-Likelihood Phylogeny Estimation Jack Sullivan,* Zaid Abdo, à Paul Joyce, à and David L. Swofford
More informationinvestigation 3 Comparing DNA Sequences to
Big Idea 1 Evolution investigation 3 Comparing DNA Sequences to Understand Evolutionary Relationships with BLAST How can bioinformatics be used as a tool to determine evolutionary relationships and to
More informationIntroduction to Phylogenetic Analysis
Subjects of this lecture Introduction to Phylogenetic nalysis Irit Orr 1 Introducing some of the terminology of phylogenetics. 2 Introducing some of the most commonly used methods for phylogenetic analysis.
More informationCore Bioinformatics. Titulació Tipus Curs Semestre. 4313473 Bioinformàtica/Bioinformatics OB 0 1
Core Bioinformatics 2014/2015 Codi: 42397 Crèdits: 12 Titulació Tipus Curs Semestre 4313473 Bioinformàtica/Bioinformatics OB 0 1 Professor de contacte Nom: Sònia Casillas Viladerrams Correu electrònic:
More informationLifeScope Genomic Analysis Software 2.5
USER GUIDE LifeScope Genomic Analysis Software 2.5 Graphical User Interface DATA ANALYSIS METHODS AND INTERPRETATION Publication Part Number 4471877 Rev. A Revision Date November 2011 For Research Use
More informationPrimetime for KNIME:
Primetime for KNIME: Towards an Integrated Analysis and Visualization Environment for RNAi Screening Data F. Oliver Gathmann, Ph. D. Director IT, Cenix BioScience Presentation for: KNIME User Group Meeting
More informationAutomated Plausibility Analysis of Large Phylogenies
Automated Plausibility Analysis of Large Phylogenies Bachelor Thesis of David Dao At the Department of Informatics Institute of Theoretical Computer Science Reviewers: Advisors: Prof. Dr. Alexandros Stamatakis
More informationMissing data and the accuracy of Bayesian phylogenetics
Journal of Systematics and Evolution 46 (3): 307 314 (2008) (formerly Acta Phytotaxonomica Sinica) doi: 10.3724/SP.J.1002.2008.08040 http://www.plantsystematics.com Missing data and the accuracy of Bayesian
More informationVersion 5.0 Release Notes
Version 5.0 Release Notes 2011 Gene Codes Corporation Gene Codes Corporation 775 Technology Drive, Ann Arbor, MI 48108 USA 1.800.497.4939 (USA) +1.734.769.7249 (elsewhere) +1.734.769.7074 (fax) www.genecodes.com
More information