Constructing Phylogenetic Trees. Gloria Rendon SC 11 - Education June, 2011
2 Phylogenetic Tree Reconstruction (PTR) In the past, much of this work was done by observing anatomy and physiology and by comparing fossil records. Now, techniques developed in molecular biology make it possible to perform such evolutionary comparisons at the molecular level using computational tools.
3 Phylogenetic Tree Reconstruction is an (Inference) Problem
Given: n species and m characters; for each species, the values of all characters are known.
Goal: a fully labeled phylogenetic tree that best explains the given data (i.e., maximizes a target function, or score).
Assumptions: characters are mutually independent; after two species diverge, their further evolution is independent of each other.
Solution: an exhaustive search of the tree space for the best possible tree is infeasible, so a heuristic approach is used to find an approximate solution that is close enough to the best one.
4 Desired Properties of the data used in (Species) Phylogenetic Tree Reconstruction
An ideal choice is a genomic region that:
appears exactly once in every species;
has an evolutionary history identical to that of the species;
exhibits a rate of change that is both fast enough to distinguish between closely related species and slow enough that distantly related species still resemble each other.
Small ribosomal subunit rRNA, called 16S ribosomal RNA in prokaryotes and 18S ribosomal RNA in eukaryotes, has been found to be the best genomic segment for this type of analysis.
5 Many Possible Phylotrees The number of possible rooted phylogenetic trees that can be constructed from a set of sequences grows exponentially with the size of the tree: (2n)! / (n! (n+1)!), where n is the number of nodes (internal and leaf nodes). For example, with five sequences and four internal nodes (so n = 9), we have 4,862 possibilities, 98 of which are structurally different; seven of them are illustrated here.
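The count quoted on this slide can be checked with a few lines of Python. This is only a sketch of the formula as written above; the function name tree_count is an illustrative choice, not something from the slides.

```python
from math import comb

def tree_count(n: int) -> int:
    """Evaluate (2n)! / (n! (n+1)!) exactly, using integer arithmetic."""
    # comb(2n, n) equals (2n)! / (n! n!), so dividing by (n + 1) gives the formula.
    return comb(2 * n, n) // (n + 1)

# Five sequences plus four internal nodes gives n = 9, as in the slide.
print(tree_count(9))

# The count grows quickly as nodes are added.
for n in (3, 5, 9, 15, 19):
    print(n, tree_count(n))
```

Even for modest n the number of candidate trees is far too large to enumerate, which is why the methods that follow rely on heuristics.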
6 Many Possible Phylotrees Several computational tools can produce more than one phylotree for a given set of sequences. Human expertise is usually necessary to make a judgment call on the most likely phylogeny for a given set of sequences. Lacking that, we can use bootstrapping as a second-best choice.
7 Is the phylotree correct? Bootstrapping techniques have been developed to test, if not the correctness, at least the reliability of the phylogeny calculated by a program. Bootstrap quantifies the degree of support within the data for a particular branch, given the evolutionary model and tree reconstruction method.
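The resampling step behind a bootstrap analysis can be sketched as follows. This is a minimal illustration, assuming the alignment is held as a dictionary of equal-length strings; the function name and the toy alignment are hypothetical, and the tree-building step that would be run on each replicate is deliberately left out.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: alignment columns sampled with replacement."""
    length = len(next(iter(alignment.values())))
    # Draw as many column indices as the alignment has columns, with replacement.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}

# Hypothetical toy alignment, used only to show the call.
toy = {"seq1": "ACGTAC", "seq2": "ACGTTC", "seq3": "ATGTTC"}
rng = random.Random(0)  # fixed seed so the run is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(100)]
print(replicates[0])
```

In a full analysis, a tree would be built from every replicate with the same method and model, and the support for a branch would be the percentage of replicate trees that contain it.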
8 Basic Procedure for building biological trees: ONE TWIG AT A TIME
1. Start with any TWO sequences and add the rest of the sequences one at a time.
2. Each new sequence becomes a leaf of the tree (meaning, nothing further can be attached to this point).
3. Use a particular model of evolution and method to choose the place where the new sequence ought to go. It should be closer to the sequence in the tree that it is most similar to than to any other sequence already in the tree.
4. Repeat steps 2 and 3 until all sequences have been inserted into the tree.
5. Stop.
10 Choice of a PTR Method Two broad categories exist: distance-based methods and sequence-based methods. Distance-based methods first compute pairwise distances from the sequences and then use those distances to calculate the phylotree. Sequence-based methods use the MSA of all the sequences and search for the best tree according to an optimality criterion defined by a model.
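The first step of a distance-based method, computing pairwise distances from aligned sequences, might look like the sketch below. It uses the simple proportion of differing sites (often called the p-distance) and skips any column where either sequence has a gap; both choices are assumptions made for illustration, and the toy alignment is hypothetical.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences, ignoring gapped columns."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)

def distance_matrix(alignment):
    """Map every unordered pair of sequence names to its p-distance."""
    return {
        (m, n): p_distance(alignment[m], alignment[n])
        for m, n in combinations(sorted(alignment), 2)
    }

# Hypothetical aligned sequences of equal length.
toy = {"seq1": "ACGTACGT", "seq2": "ACGTACGA", "seq3": "AC-TTCGA"}
for pair, d in distance_matrix(toy).items():
    print(pair, round(d, 3))
```

A distance-based method such as UPGMA or neighbor joining would take this matrix as its only input, whereas a sequence-based method would keep working with the alignment itself.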
11 Properties of the PTR Methods
Method              Type of method  Tree type    Single tree?  Tree score?  Tree test?
UPGMA               distance        ultrametric  Yes           No           No
Neighbor joining    distance        additive     Yes           No           No
Fitch-Margoliash    distance        additive     Yes           No           No
Minimum evolution   distance        additive     No            Yes          Yes
Maximum parsimony   sequence        additive     No            Yes          Yes
Maximum likelihood  sequence        additive     No            Yes          Yes
Bayesian            sequence        additive     No            Yes          Yes
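To make the "distance" and "ultrametric" entries for UPGMA concrete, here is a compact sketch of that method. It is an illustrative implementation, not the code of any tool used in the exercises; the input format (a dictionary keyed by name pairs), the internal cluster labels, and the two-decimal branch lengths are all assumptions.

```python
from itertools import combinations

def upgma(names, dist):
    """Build a rooted, ultrametric tree in Newick format from pairwise distances."""
    # Each cluster label maps to (newick_text, number_of_leaves, node_height).
    clusters = {n: (n, 1, 0.0) for n in names}
    d = {frozenset(k): v for k, v in dist.items()}
    next_id = 0
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(sorted(clusters), 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))] / 2.0
        (ta, na, ha), (tb, nb, hb) = clusters.pop(a), clusters.pop(b)
        new = f"_c{next_id}"
        next_id += 1
        # Distance to the merged cluster is the size-weighted average.
        for c in clusters:
            d[frozenset((new, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        text = f"({ta}:{height - ha:.2f},{tb}:{height - hb:.2f})"
        clusters[new] = (text, na + nb, height)
    (text, _, _), = clusters.values()
    return text + ";"

# Hypothetical distances for three sequences.
print(upgma(["A", "B", "C"], {("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6}))
```

Because every merge places the new node at half the distance between the two clusters, all leaves end up at the same distance from the root, which is what "ultrametric" means in the table. The procedure returns exactly one tree and no score, matching the "Single tree?" and "Tree score?" columns.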
12 Choice of a Model of Evolution
Model  Base composition  R=1?  Identical transition rates?  Identical transversion rates?  Reference
JC     1:1:1:1           No    Yes                          Yes                            Jukes and Cantor (1969)
F81    Variable          No    Yes                          Yes                            Felsenstein (1981)
K2P    1:1:1:1           Yes   Yes                          Yes                            Kimura (1980)
HKY85  Variable          Yes   No                           No                             Hasegawa et al. (1985)
TN     Variable          Yes   No                           Yes                            Tamura and Nei (1993)
K3P    Variable          Yes   No                           Yes                            Kimura (1981)
SYM    1:1:1:1           Yes   No                           No                             Zharkikh (1994)
GTR    Variable          Yes   No                           No                             Rodriguez et al. (1990)
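As an illustration of what a model of evolution does in practice, the sketch below applies the standard Jukes-Cantor (JC) correction, d = -(3/4) ln(1 - (4/3) p), to an observed proportion p of differing sites. The function name is an illustrative choice; the other models in the table use more parameters and are not covered here.

```python
import math

def jc69_distance(p):
    """Jukes-Cantor corrected distance for an observed proportion p of differing sites."""
    # The correction is undefined once p reaches 3/4 (saturation).
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

for p in (0.05, 0.10, 0.30, 0.60):
    print(p, round(jc69_distance(p), 4))
```

The corrected distance is always at least as large as the observed proportion, because the model accounts for multiple substitutions at the same site that a direct comparison cannot see.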
13 Which Model to Use?
14 Illustrating the procedure manually with a toy example
1. Start with any TWO sequences and add the rest of the sequences one at a time.
2. Each new sequence becomes a leaf of the tree (meaning, nothing further can be attached to this point).
3. Use a particular model of evolution and method to choose the place where the new sequence ought to go. It should be closer to the sequence in the tree that it is most similar to than to any other sequence already in the tree.
4. Repeat steps 2 and 3 until all sequences have been inserted into the tree.
5. Stop.
15 Illustrating the procedure manually with a toy example 1. Calculate a multiple sequence alignment with all the sequences that you want in your tree. This step is not done manually. [Table: sequence name, sequence, alignment length] A sequence alignment is a way of arranging the sequences of DNA, RNA, or proteins to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences.
16 Illustrating the procedure manually with a toy example 1. Calculate a multiple sequence alignment with all the sequences that you want in your tree. This step is not done manually. Once the alignment is calculated, the similarity between any pair of sequences is established and phylogenetic relationships can be predicted. For example, here, the smaller the score in a cell, the higher the similarity is for the pair of sequences.
17 For this toy example we start with the alignment results.
Start with 2 sequences, for instance seq3 and seq4.
Add seq1: if next to seq3, consider Score(seq1,seq3)=51; if next to seq4, consider Score(seq1,seq4)=51. So seq1 goes in another branch.
Add seq2: if next to seq1, consider Score(seq2,seq1)=48; if next to seq3, consider Score(seq2,seq3)=66. So seq2 goes in seq1's branch.
Add seq5: if next to seq1, consider Score(seq5,seq1)=70; if next to seq2, consider Score(seq5,seq2)=85. So seq5 goes in another branch.
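The placement decisions in this toy example can be replayed in code. The sketch below uses only the scores quoted on the slide. The slide does not state the rule for when a sequence starts a new branch, so the cut-off of 50 is a purely hypothetical value chosen so that the code reproduces the three decisions; treating seq3 and seq4 as one starting branch is likewise an assumption.

```python
# Scores quoted in the toy example (smaller score = more similar).
SCORES = {
    ("seq1", "seq3"): 51, ("seq1", "seq4"): 51,
    ("seq2", "seq1"): 48, ("seq2", "seq3"): 66,
    ("seq5", "seq1"): 70, ("seq5", "seq2"): 85,
}
THRESHOLD = 50  # hypothetical cut-off, not given in the slides

def score(a, b):
    """Look up a score in either order; None if the slide does not give it."""
    return SCORES.get((a, b), SCORES.get((b, a)))

def add_sequence(new, branches):
    """Attach `new` to its most similar sequence's branch, or open a new branch."""
    known = [(score(new, s), s) for branch in branches for s in branch
             if score(new, s) is not None]
    best_score, nearest = min(known)
    if best_score < THRESHOLD:
        for branch in branches:
            if nearest in branch:
                branch.append(new)
                return
    branches.append([new])

branches = [["seq3", "seq4"]]  # assumed starting pair
for name in ("seq1", "seq2", "seq5"):
    add_sequence(name, branches)
print(branches)
```

Real tools replace the fixed cut-off with a model of evolution and an optimality criterion, but the one-sequence-at-a-time structure is the same.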
18 Using tools to reconstruct a Phylogenetic Tree
19 Example 2: Phylogeny of Proteobacteria
16S ribosomal RNA sequences from 38 species in these families: alphaproteobacteria, betaproteobacteria, gammaproteobacteria, deltaproteobacteria, and epsilonproteobacteria.
Tree 1: generated using ML and GTR
Tree 2: generated using ML
Tree 3: generated using UPGMA and JR correction, removing gaps
Tree 4: generated using UPGMA and JR correction, no gaps were removed
Tree 5: condensed Tree 3 obtained by a bootstrap analysis; branches with a bootstrap value below 75% have been contracted
20 Tree1: Delta and Epsilon branched off early from the rest of the family Tree2: Gamma and Beta branched off early from the rest of the family
21 Trees 3 and 4 calculated with UPGMA, a distance-based method, have the branching off of Epsilon happening earlier than in Trees 1 and 2, which were calculated using maximum likelihood.
22 As a result of bootstrapping, we may end up with a nonbinary tree like in this case.
23 Where are the Phylogeny Tools in the Mobyle Web Server?
24 Exercise 1 Phylotree of the eight imaginary species We are going to revisit the example we used in the previous lesson. Up until this point, we have aligned the sequences of the eight imaginary species of the solar system with a multiple sequence alignment tool: ClustalW. Let us go back to the results of ClustalW for the set containing the sequences of the eight imaginary species and use those results to reconstruct a likely phylogeny of the species. Since it is possible to end up with different phylotrees, we are going to tweak the parameters of the tool and see what phylotree it produces each time. In real life we would need to be guided by the expert opinion of a taxonomist as to which tree is the most likely phylogeny for the species.
25 Exercise 1 Phylotree of the eight imaginary species Open the browser again and go back to the results of ClustalW for the set containing the sequences of the eight imaginary species. Click on the pull-down menu next to the "further analysis" button located in the alignment frame. Select PUZZLE from the pull-down menu first, then click on "further analysis" [the alignment is loaded into the input frame of the PUZZLE tool]. Note: PUZZLE is a phylogenetic tool that uses ML and NJ; suitable for large trees.
26 Exercise 1 Phylotree of the eight imaginary species [the alignment is loaded into the input frame of the PUZZLE tool] How can you check? The name of the current tool should read Tree-Puzzle. The frame for the alignment file should contain the result that ClustalW produced. Leave all the other parameters unchanged at their default values and click on RUN.
27 Exercise 1 Phylotree of the eight imaginary species The output page of PUZZLE consists of several frames, indicated in this figure with numbers: 1. is the output file with everything; 2. is the output tree in Newick format; 3. is the output distance file; 4. is the standard output report. We are just interested in the tree. Click on "view with archaeopteryx" to see the tree in graphical form.
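Newick is a plain-text format in which nested parentheses describe the branching and an optional ":length" follows each node. The sketch below lists the leaf names of a tree written this way. The tree string is hypothetical (it is not the PUZZLE output for the exercise), and the regular expression is a simplification that does not handle quoted labels or comments.

```python
import re

def leaf_names(newick):
    """Return leaf labels from a simple Newick string, in the order they appear."""
    # A leaf label directly follows "(" or ","; internal-node labels follow ")".
    return re.findall(r"[(,]\s*([^():,;\s]+)", newick)

# Hypothetical tree with branch lengths, for illustration only.
tree = "((seq1:0.10,seq2:0.20):0.05,seq3:0.30,(seq4:0.10,seq5:0.10):0.20);"
print(leaf_names(tree))
```

A viewer such as Archaeopteryx reads the same text and draws the nesting as a tree, so a quick check like this is a convenient way to confirm that every input sequence made it into the output.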
28 Exercise 1 Phylotree of the eight imaginary species Close the window that shows the tree. Now, we are going to repeat the same steps, changing only the parameters of the PUZZLE tool. Go back to the ClustalW results page and start all over. When you get to the PUZZLE page, scroll down to the Quartet puzzling options and change AT LEAST the value of the last entry, which reads "Display as outgroup? N". N should be a number in [1-8]; the default value is 1. Then press RUN and check the tree.
29 Exercise 1 Phylotree of the eight imaginary species Now, let's try lvb, a phylogeny tool that uses parsimony to calculate trees from DNA sequences. Go back to the ClustalW results page and start all over. From the pull-down menu next to the "further analysis" button, choose lvb. Then click on "further analysis" [wait for the results to be loaded into the input box of the lvb page]. Then press RUN and check the tree.
30 Exercise 1 Phylotree of the eight imaginary species Now, let's try quicktree, a phylogeny tool that uses least-squares distances to calculate trees. Go back to the ClustalW results page and start all over. From the pull-down menu next to the "further analysis" button, choose quicktree. Then click on "further analysis" [wait for the results to be loaded into the input box of the quicktree page]. Then press RUN and check the tree.
31 Exercise 1 Phylotree of the eight imaginary species All three PTR programs produced ONE unrooted phylogenetic tree. The lvb tree has NO branch length estimates.
32 Exercise 2: Produce the phylogeny of the three kingdoms of Carl Woese The first kingdom, Eukaryotes, is made up of sequences 1-3. The second kingdom, Bacteria, is made up of sequences 4-9. The third kingdom, Archaea, is made up of sequences 10-13.
33 Exercise 2: Produce the phylogeny of the three kingdoms of Carl Woese
1. Open the browser, select align/multiple/mafft, and upload the file called woese.fasta located in the exercise folder.
2. Run the mafft program to obtain the multiple sequence alignment, then click on the pull-down menu next to the "further analysis" button located in the alignment frame.
3. Select PUZZLE from the pull-down menu first, then click on "further analysis" [the alignment is loaded into the input frame of the PUZZLE tool].
4. Click on RUN [wait until the result is available].
5. To view the resulting phylogenetic tree, scroll down to the frame named "output tree" and then click on "view with archaeopteryx".
34 Exercise 2: Produce the phylogeny of the three kingdoms of Carl Woese
6. Repeat steps 3-5 using lvb instead of PUZZLE to obtain a tree using parsimony.
7. Repeat steps 3-5 using quicktree instead of PUZZLE to obtain a tree using distances.
8. Compare the trees.
Q: The dataset contains a sequence that proves to be a challenge for computational PTR tools. Which sequence is that?
Q: Are the trees identical? If not, which tree seems to be more accurate?
35 Exercise 2: Produce the phylogeny of the three kingdoms of Carl Woese
36 Exercise 3: Produce the phylogeny of the Eukaryotic species
1. Open the browser, select align/multiple/mafft, and upload the file called 18s.rRNA.seqs.fasta located in the exercise folder.
2. Run the mafft program to obtain the multiple sequence alignment, then click on the pull-down menu next to the "further analysis" button located in the alignment frame.
3. Select PUZZLE from the pull-down menu first, then click on "further analysis" [the alignment is loaded into the input frame of the PUZZLE tool].
4. Click on RUN [wait until the result is available].
5. To view the resulting phylogenetic tree, scroll down to the frame named "output tree" and then click on "view with archaeopteryx".
37 Additional Readings
Enumerating binary trees:
Nei, M. and Kumar, S. Molecular Evolution and Phylogenetics. Oxford University Press, Chapter 5
Borodovsky, M. and Ekisheva, S. Problems and Solutions in Biological Sequence Analysis. Cambridge University Press, Chapter 7
Gusfield, D. Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, Chapter 17
Pevsner, J. Bioinformatics and Functional Genomics. Hoboken, N.J.: Wiley-Blackwell, pp
Tateno, Y., Nei, M., and Tajima, F. Accuracy of estimated phylogenetic trees from molecular data. J Mol Evol. 1982;18(6):
Inference of Large Phylogenetic Trees on Parallel Architectures Michael Ott TECHNISCHE UNIVERSITÄT MÜNCHEN Lehrstuhl für Rechnertechnik und Rechnerorganisation / Parallelrechnerarchitektur Inference of
More informationKEY CONCEPT Organisms can be classified based on physical similarities. binomial nomenclature
Section 17.1: The Linnaean System of Classification Unit 9 Study Guide KEY CONCEPT Organisms can be classified based on physical similarities. VOCABULARY taxonomy taxon binomial nomenclature genus MAIN
More informationProtein & DNA Sequence Analysis. Bobbie-Jo Webb-Robertson May 3, 2004
Protein & DNA Sequence Analysis Bobbie-Jo Webb-Robertson May 3, 2004 Sequence Analysis Anything connected to identifying higher biological meaning out of raw sequence data. 2 Genomic & Proteomic Data Sequence
More informationGuide for Bioinformatics Project Module 3
Structure- Based Evidence and Multiple Sequence Alignment In this module we will revisit some topics we started to look at while performing our BLAST search and looking at the CDD database in the first
More information(A GUIDE for the Graphical User Interface (GUI) GDE)
The Genetic Data Environment: A User Modifiable and Expandable Multiple Sequence Analysis Package (A GUIDE for the Graphical User Interface (GUI) GDE) Jonathan A. Eisen Department of Biological Sciences
More informationSimilarity Searches on Sequence Databases: BLAST, FASTA. Lorenza Bordoli Swiss Institute of Bioinformatics EMBnet Course, Basel, October 2003
Similarity Searches on Sequence Databases: BLAST, FASTA Lorenza Bordoli Swiss Institute of Bioinformatics EMBnet Course, Basel, October 2003 Outline Importance of Similarity Heuristic Sequence Alignment:
More informationAmino Acids and Their Properties
Amino Acids and Their Properties Recap: ss-rrna and mutations Ribosomal RNA (rrna) evolves very slowly Much slower than proteins ss-rrna is typically used So by aligning ss-rrna of one organism with that
More informationGuidelines for Establishment of Contract Areas Computer Science Department
Guidelines for Establishment of Contract Areas Computer Science Department Current 07/01/07 Statement: The Contract Area is designed to allow a student, in cooperation with a member of the Computer Science
More informationEvaluating the Performance of a Successive-Approximations Approach to Parameter Optimization in Maximum-Likelihood Phylogeny Estimation
Evaluating the Performance of a Successive-Approximations Approach to Parameter Optimization in Maximum-Likelihood Phylogeny Estimation Jack Sullivan,* Zaid Abdo, à Paul Joyce, à and David L. Swofford
More informationUSER S MANUAL. ArboWebForest
USER S MANUAL ArboWebForest i USER'S MANUAL TABLE OF CONTENTS Page # 1.0 GENERAL INFORMATION... 1-1 1.1 System Overview... 1-1 1.2 Organization of the Manual... 1-1 2.0 SYSTEM SUMMARY... 2-1 2.1 System
More informationUCINET Quick Start Guide
UCINET Quick Start Guide This guide provides a quick introduction to UCINET. It assumes that the software has been installed with the data in the folder C:\Program Files\Analytic Technologies\Ucinet 6\DataFiles
More informationCore Bioinformatics. Titulació Tipus Curs Semestre. 4313473 Bioinformàtica/Bioinformatics OB 0 1
Core Bioinformatics 2014/2015 Codi: 42397 Crèdits: 12 Titulació Tipus Curs Semestre 4313473 Bioinformàtica/Bioinformatics OB 0 1 Professor de contacte Nom: Sònia Casillas Viladerrams Correu electrònic:
More informationDistributed Bioinformatics Computing System for DNA Sequence Analysis
Global Journal of Computer Science and Technology: A Hardware & Computation Volume 14 Issue 1 Version 1.0 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals Inc.
More informationRNA Structure and folding
RNA Structure and folding Overview: The main functional biomolecules in cells are polymers DNA, RNA and proteins For RNA and Proteins, the specific sequence of the polymer dictates its final structure
More informationUsing Impatica for Power Point
Using Impatica for Power Point What is Impatica? Impatica is a tool that will help you to compress PowerPoint presentations and convert them into a more efficient format for web delivery. Impatica for
More informationGIS I Business Exr02 (av 9-10) - Expand Market Share (v3b, Jul 2013)
GIS I Business Exr02 (av 9-10) - Expand Market Share (v3b, Jul 2013) Learning Objectives: Reinforce information literacy skills Reinforce database manipulation / querying skills Reinforce joining and mapping
More informationTutorial for proteome data analysis using the Perseus software platform
Tutorial for proteome data analysis using the Perseus software platform Laboratory of Mass Spectrometry, LNBio, CNPEM Tutorial version 1.0, January 2014. Note: This tutorial was written based on the information
More informationProtein Phylogenies and Signature Sequences: A Reappraisal of Evolutionary Relationships among Archaebacteria, Eubacteria, and Eukaryotes
MICROBIOLOGY AND MOLECULAR BIOLOGY REVIEWS, Dec. 1998, p. 1435 1491 Vol. 62, No. 4 1092-2172/98/$04.00 0 Copyright 1998, American Society for Microbiology. All Rights Reserved. Protein Phylogenies and
More informationAmphoraNet: Taxonomic Composition Analysis of Metagenomic Shotgun Sequencing Data
Csaba Kerepesi, Dániel Bánky, Vince Grolmusz: AmphoraNet: Taxonomic Composition Analysis of Metagenomic Shotgun Sequencing Data http://pitgroup.org/amphoranet/ PIT Bioinformatics Group, Department of Computer
More informationUCHIME in practice Single-region sequencing Reference database mode
UCHIME in practice Single-region sequencing UCHIME is designed for experiments that perform community sequencing of a single region such as the 16S rrna gene or fungal ITS region. While UCHIME may prove
More informationContent Author's Reference and Cookbook
Sitecore CMS 6.5 Content Author's Reference and Cookbook Rev. 110621 Sitecore CMS 6.5 Content Author's Reference and Cookbook A Conceptual Overview and Practical Guide to Using Sitecore Table of Contents
More informationBut what about the prokaryotic cells?
Chapter 32: Page 318 In the past two chapters, you have explored the organelles that can be found in both plant and animal s. You have also learned that plant s contain an organelle that is not found in
More informationCreating a Web Site with Publisher 2010
Creating a Web Site with Publisher 2010 Information Technology Services Outreach and Distance Learning Technologies Copyright 2012 KSU Department of Information Technology Services This document may be
More informationCore Bioinformatics. Degree Type Year Semester
Core Bioinformatics 2015/2016 Code: 42397 ECTS Credits: 12 Degree Type Year Semester 4313473 Bioinformatics OB 0 1 Contact Name: Sònia Casillas Viladerrams Email: Sonia.Casillas@uab.cat Teachers Use of
More informationAn Introduction to Phylogenetics
An Introduction to Phylogenetics Bret Larget larget@stat.wisc.edu Departments of Botany and of Statistics University of Wisconsin Madison February 4, 2008 1 / 70 Phylogenetics and Darwin A phylogeny is
More informationPHYLOGENETIC ANALYSIS
Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, Second Edition Andreas D. Baxevanis, B.F. Francis Ouellette Copyright 2001 John Wiley & Sons, Inc. ISBNs: 0-471-38390-2 (Hardback);
More informationClone Manager. Getting Started
Clone Manager for Windows Professional Edition Volume 2 Alignment, Primer Operations Version 9.5 Getting Started Copyright 1994-2015 Scientific & Educational Software. All rights reserved. The software
More informationUpdating KP Learner Manager Enterprise X On Your Server
Updating KP Learner Manager Enterprise Edition X on Your Server Third Party Software KP Learner Manager Enterprise provides links to some third party products, like Skype (www.skype.com) and PayPal (www.paypal.com).
More informationroot node level: internal node edge leaf node CS@VT Data Structures & Algorithms 2000-2009 McQuain
inary Trees 1 A binary tree is either empty, or it consists of a node called the root together with two binary trees called the left subtree and the right subtree of the root, which are disjoint from each
More informationGMAT SYLLABI. Types of Assignments - 1 -
GMAT SYLLABI The syllabi on the following pages list the math and verbal assignments for each class. Your homework assignments depend on your current math and verbal scores. Be sure to read How to Use
More information