# How to Build a Phylogenetic Tree


## Transcription

A phylogenetic tree is a structure in which species are arranged on branches that link them according to their relationship and/or evolutionary descent. A typical rooted tree with scaled branches is illustrated in Fig 1 [1]. This short article briefly reviews different methods of tree building and compares their validity, reliability and efficiency. The calculation of genetic distance and methods of rooting trees are also discussed. For other issues, such as tree evaluation and search algorithms, see reference [2].

### Genetic distance

Genetic distance is the number of mutation (evolutionary) events that have occurred between species since their divergence. The simplest way to estimate it is to count the number of differences between two sequences. However, this count, often referred to as the Hamming distance, does not reflect the full evolutionary history of the sequences, because not all mutation events are recorded in the current sequences. Fig 2 shows an example in which only 3 of 12 mutation events are detectable between homologous sequences [3].

The Jukes-Cantor model, proposed in 1969, was the first attempt to cope with this problem. It gives the corrective formula

$$K = -\frac{3}{4}\ln\left(1 - \frac{4}{3}p\right)$$

(Fig 3), where $K$ is the genetic distance (the mean number of substitutions per site) and $p$ is the percent difference, expressed as a fraction of sites. This relation shows that the correction grows with the genetic distance, and that once the percent difference saturates (at 75%), the true genetic distance can no longer be recovered [4]. More recent models take into account different transition/transversion rates, unequal base frequencies and non-uniform substitution rates among sites. All these models give similar results at low divergence. At a high percent difference, however, choosing a proper substitution model becomes crucial.
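
The two distance measures described above can be compared directly. The following is a minimal sketch, assuming two already-aligned, gap-free DNA sequences of equal length; the function names and the example sequences are illustrative and do not come from the cited references.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned sites at which the two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


def jukes_cantor_distance(p: float) -> float:
    """Jukes-Cantor corrected distance K = -(3/4) ln(1 - (4/3) p)."""
    if not 0.0 <= p < 0.75:
        # At p = 0.75 the logarithm diverges: the sequences are saturated.
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical example: 2 mismatches over 10 aligned sites.
seq_a = "ACGTACGTAC"
seq_b = "ACGTTCGTAA"
p = p_distance(seq_a, seq_b)      # 0.2
k = jukes_cantor_distance(p)      # about 0.2326, larger than p
print(f"p = {p:.4f}, K = {k:.4f}")
```

The corrected distance is always at least as large as the observed fraction of differences, and the gap widens as $p$ approaches 0.75, which is the saturation effect described in the text.
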

### Tree-Building Methods

The most popular and frequently used methods of tree building can be classified into two major categories: phenetic methods based on distances and cladistic methods based on characters. The former measure the pairwise distance (dissimilarity) between genes, whose actual value depends on the definition of distance used, and construct the tree entirely from the resulting distance matrix. The latter evaluate all possible trees and seek the one that is optimal under a given evolutionary criterion.

#### Distance-Based Methods

The most popular distance-based methods are the unweighted pair group method with arithmetic mean (UPGMA), neighbor joining (NJ) and those that optimize the additivity of a distance tree (FM and ME) [2].

##### UPGMA Method

This method follows a clustering procedure:

1. Assume that initially each species is a cluster on its own.
2. Join the two closest clusters and recalculate the distances from the joined pair to the other clusters by taking the average.
3. Repeat this process until all species are connected in a single cluster.

Strictly speaking, this algorithm is phenetic and does not aim to reflect evolutionary descent. It assigns equal weight to all distances and assumes a molecular clock, that is, a constant rate of evolution along all lineages. WPGMA is a similar algorithm but assigns different weights to the distances. UPGMA is simple and fast and has been used extensively in the literature. However, it performs poorly in the many cases where these assumptions are not met.

##### Neighbor Joining Method (NJ)

This algorithm does not assume a molecular clock and adjusts for rate variation among branches. It begins with an unresolved star-like tree (Fig 4(a)). Each pair of terminals is evaluated for being joined, and the sum of all branch lengths of the resulting tree is calculated. The pair that yields the smallest sum is considered the closest neighbors and is joined. A new branch is inserted between the joined pair and the rest of the tree (Fig 4(b)), and the branch lengths are recalculated. This process is repeated until only one terminal remains.

The NJ method is comparatively rapid and generally gives better results than UPGMA. However, it produces only one tree and neglects other possible trees, which might be as good as the NJ tree, if not significantly better. Moreover, since errors in distance estimates are exponentially larger for longer distances, under some conditions this method will yield a biased tree [5].

##### Weighted Neighbor-Joining (Weighbor)

This method was proposed recently [5]. The Weighbor criterion consists of two terms, an additivity term (of external branches) and a positivity term (of internal branches), which quantify the implications of joining a pair. Weighbor gives less weight to the longer distances in the distance matrix, and the resulting trees are less sensitive to specific biases than NJ trees and relatively immune to the "long-branch attraction/distraction" drawbacks observed with other methods.

##### Fitch-Margoliash (FM) and Minimum Evolution (ME) Methods

In 1967 Fitch and Margoliash proposed a criterion (the FM method) for fitting trees to distance matrices [2]. This method seeks the least-squares fit of all observed pairwise distances to the expected distances of a tree. The ME method seeks the tree with the minimum sum of branch lengths. Instead of using all the pairwise distances as FM does, it fixes the internal nodes by using the distances to external nodes and then optimizes the internal branch lengths. The FM and ME methods perform best among the distance-based methods, but they are much slower than NJ, which generally yields a tree very close to theirs.
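
The UPGMA clustering procedure described earlier in this section can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not a reference implementation: the function name `upgma`, the `frozenset`-keyed distance dictionary and the four-taxon example matrix are all hypothetical, and node heights are taken as half the distance between the joined clusters, which presupposes the molecular clock.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa with UPGMA.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns (tree, merges), where tree is a nested tuple of labels and
    merges lists (left, right, height) for every join, in order.
    """
    # Step 1: every taxon starts as its own cluster (cluster -> taxon count).
    sizes = {name: 1 for name in names}
    d = dict(dist)
    merges = []
    while len(sizes) > 1:
        # Step 2: pick the closest pair of clusters.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        height = d[frozenset((a, b))] / 2.0
        merged = (a, b)
        # The distance from the new cluster to any other cluster is the
        # size-weighted average, so each original taxon counts equally.
        for c in sizes:
            if c not in (a, b):
                d[frozenset((merged, c))] = (
                    sizes[a] * d[frozenset((a, c))]
                    + sizes[b] * d[frozenset((b, c))]
                ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
        merges.append((a, b, height))
        # Step 3: the loop repeats until a single cluster remains.
    (tree,) = sizes
    return tree, merges


# Hypothetical clock-like distance matrix for four taxa.
names = ["A", "B", "C", "D"]
dist = {
    frozenset("AB"): 2.0, frozenset("AC"): 6.0, frozenset("AD"): 10.0,
    frozenset("BC"): 6.0, frozenset("BD"): 10.0, frozenset("CD"): 10.0,
}
tree, merges = upgma(names, dist)
print(tree)
print([height for _, _, height in merges])
```

Replacing the size-weighted average with a plain mean of the two distances would give WPGMA, the weighted variant mentioned in the text.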

#### Character-Based Methods

Distance-based methods are faster and less computationally intensive than character-based methods, but the actual characters are discarded once the distance matrix is derived. Character-based methods, on the other hand, make use of all known evolutionary information, i.e. the individual substitutions among the sequences, to determine the most likely ancestral relationships.

##### Maximum Parsimony (MP)

The criterion of the MP method is that the simplest explanation of the data is preferred, because it requires the fewest conjectures. By this criterion, the MP tree is the one that requires the fewest substitutions (evolutionary changes) for all sequences to derive from a common ancestor. For each site in the alignment, all possible trees are evaluated and given a score based on the number of evolutionary changes needed to produce the observed sequence changes. The best tree is thus the one that minimizes the overall number of mutations across all sites.

MP is faster than ML, and weighted parsimony schemes can deal with most of the different models used by ML. However, this method yields little information about branch lengths and suffers badly from long-branch attraction: long branches become artificially connected because of accumulated non-homologous similarities, even if they are not phylogenetically related at all. MP may also yield more than one tree with the same score.

##### Maximum Likelihood (ML)

Like the MP method, the ML method uses each position in an alignment and evaluates all possible trees. It calculates the likelihood of each tree and seeks the one with the maximum likelihood. For a given tree, the likelihood at each site is the probability that a given evolutionary model has generated the observed data. The likelihoods for the individual sites are then multiplied to give the likelihood of the tree.
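
The multiplication of per-site likelihoods can be illustrated on the smallest possible case. The sketch below assumes only two aligned sequences, so there is a single tree (one branch) and the only free parameter is its length $d$; it also assumes the Jukes-Cantor model with equal base frequencies. The function names and the ternary search used to find the maximum are illustrative choices, not part of the cited methods. Real ML programs repeat this kind of calculation over many candidate trees, which is why the method is slow.

```python
import math


def jc69_log_likelihood(n_sites: int, n_diff: int, d: float) -> float:
    """Log-likelihood of two aligned sequences separated by a branch of
    length d (expected substitutions per site) under Jukes-Cantor."""
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay   # base unchanged along the branch
    p_diff = 0.25 - 0.25 * decay   # base changed to one specific other base
    n_same = n_sites - n_diff
    # Each site contributes (1/4) * transition probability. Per-site
    # likelihoods are multiplied, so their logarithms are summed.
    return n_same * math.log(0.25 * p_same) + n_diff * math.log(0.25 * p_diff)


def ml_distance(n_sites: int, n_diff: int,
                lo: float = 1e-6, hi: float = 5.0, steps: int = 100) -> float:
    """Branch length maximizing the likelihood, found by ternary search.

    Assumes the fraction of differing sites is below 0.75, so that the
    log-likelihood has a single maximum inside [lo, hi].
    """
    for _ in range(steps):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if (jc69_log_likelihood(n_sites, n_diff, m1)
                < jc69_log_likelihood(n_sites, n_diff, m2)):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0


# Hypothetical data: 20 differing sites out of 100.
d_hat = ml_distance(100, 20)
jc_formula = -0.75 * math.log(1.0 - 4.0 * 0.2 / 3.0)
print(f"ML estimate: {d_hat:.4f}, Jukes-Cantor formula: {jc_formula:.4f}")
```

In this two-sequence case the maximum-likelihood branch length coincides with the Jukes-Cantor corrected distance, which shows how a substitution model enters both the distance-based and the likelihood-based approaches.
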
The ML method is the slowest and most computationally intensive method, though it seems to give the best results and the most informative tree.

### Rooting trees

The root is the common ancestor of the species under study. Most phylogenetic methods do not locate the root of a tree, and unrooted trees reflect only the relationships among species, not the evolutionary path. Fig 5(a) shows an unrooted tree of species A, B, C and D.

If one accepts the molecular clock hypothesis, which asserts that informational macromolecules evolve at rates that are constant through time and across lineages, then the root is simply the midpoint of the longest span across the tree. For example, for the unrooted tree in Fig 5(a), the root is the midpoint of the longest span A-B (Fig 5(b)). Whether the molecular clock assumption is valid is still controversial [6]. Evidence from many studies has shown that such a model is oversimplified and sometimes behaves erratically.

A more commonly used method is to root the tree with an outgroup, i.e. a distantly related species. For instance, mouse can be used as an outgroup to root a tree of primates [3] (see Fig 5(c)). However, this method gives rise to a dilemma. If the outgroup species is very closely related to the ingroup, it may simply be a mistakenly excluded member of the ingroup. (In our example the mouse is obviously not a primate, but this is not always apparent in other cases.) On the other hand, a very divergent outgroup may have accumulated so many non-homologous similarities that it becomes artificially connected with the ingroup (see long-branch attraction under the MP method).

### References

[1] Phylogenetics (1999)

[2] Andreas D. Baxevanis & B.F. Francis Ouellette, Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins (2001)

[3] P. Higgs & Manchester, Introduction to Phylogenetics Methods (ITP series on-line seminars) (2001)

[4] How to make a phylogenetic tree (1999)

[5] Bruno WJ, Socci ND, Halpern AL, Mol Biol Evol, 17(1) (2000)

[6] Wen-Hsiung Li & Dan Graur, Fundamentals of Molecular Evolution (1991)

Fig 1. A typical rooted tree with scaled branches [1]. For definitions of the terminology, visit the website:

Fig 2. Two homologous DNA sequences (on the right) that descended from an ancestral sequence (on the left). Although 12 mutations have accumulated since their divergence from each other, only 3 of them are recorded in the current sequences [3].

Fig 3. Jukes-Cantor model of the relation between the percent difference $p$ (expressed as a fraction of sites) and the genetic distance $K$ [4]:

$$K = -\frac{3}{4}\ln\left(1 - \frac{4}{3}p\right)$$

Fig 4. NJ method: (a) an unresolved star-like tree; (b) the closest terminals 1 and 2 (for the definition, see text) are joined, a branch is inserted between them and the remainder of the tree, and the value of the branch is taken to be the mean of the two original values. This process is continued until only one terminal is left.

Fig 5. (a) An unrooted tree; (b) the same tree rooted under the molecular clock assumption; (c) rooting a tree by using mouse as the outgroup [3].
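
The midpoint rooting of Fig 5(b) can be sketched as follows. This is a minimal illustration, assuming the unrooted tree is stored as a dictionary of neighbors with branch lengths; the internal node names `X` and `Y`, the branch lengths and the function names are hypothetical and were chosen so that A-B is the longest span, as in the text.

```python
def paths_from(tree, start):
    """Return {node: (distance, path)} for every node reachable from start."""
    found = {start: (0.0, [start])}
    stack = [start]
    while stack:
        node = stack.pop()
        dist, path = found[node]
        for nbr, length in tree[node].items():
            if nbr not in found:
                found[nbr] = (dist + length, path + [nbr])
                stack.append(nbr)
    return found


def midpoint_root(tree):
    """Locate the midpoint of the longest tip-to-tip path.

    Returns (u, v, offset): the root lies on edge u-v, `offset` away from u.
    """
    tips = [node for node, nbrs in tree.items() if len(nbrs) == 1]
    best = None
    for tip in tips:
        for other, (dist, path) in paths_from(tree, tip).items():
            if other in tips and (best is None or dist > best[0]):
                best = (dist, path)
    span, path = best
    half = span / 2.0
    walked = 0.0
    # Walk along the longest path until the halfway point is reached.
    for u, v in zip(path, path[1:]):
        length = tree[u][v]
        if walked + length >= half:
            return u, v, half - walked
        walked += length
    raise ValueError("tree needs at least two tips")


# Hypothetical unrooted tree: tips A, B, C, D and internal nodes X, Y.
tree = {
    "A": {"X": 5.0},
    "B": {"Y": 3.0},
    "C": {"X": 1.0},
    "D": {"Y": 1.0},
    "X": {"A": 5.0, "C": 1.0, "Y": 1.0},
    "Y": {"B": 3.0, "D": 1.0, "X": 1.0},
}
print(midpoint_root(tree))
```

Here the longest span is A-B with length 9, so the root is placed 4.5 units from A on the branch leading to A. If the molecular clock does not hold, this position can be misleading, which is why outgroup rooting is usually preferred.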

### Likelihood Ratio Tests for Detecting Positive Selection and Application to Primate Lysozyme Evolution

Likelihood Ratio Tests for Detecting Positive Selection and Application to Primate Lysozyme Evolution Ziheng Yang Department of Biology, University College, London An excess of nonsynonymous substitutions

### The Central Dogma of Molecular Biology

Vierstraete Andy (version 1.01) 1/02/2000 -Page 1 - The Central Dogma of Molecular Biology Figure 1 : The Central Dogma of molecular biology. DNA contains the complete genetic information that defines

### Arbres formels et Arbre(s) de la Vie

Arbres formels et Arbre(s) de la Vie A bit of history and biology Definitions Numbers Topological distances Consensus Random models Algorithms to build trees Basic principles DATA sequence alignment distance

### Protein Sequence Analysis - Overview -

Protein Sequence Analysis - Overview - UDEL Workshop Raja Mazumder Research Associate Professor, Department of Biochemistry and Molecular Biology Georgetown University Medical Center Topics Why do protein

### Introduction to Phylogenetic Analysis

Subjects of this lecture Introduction to Phylogenetic nalysis Irit Orr 1 Introducing some of the terminology of phylogenetics. 2 Introducing some of the most commonly used methods for phylogenetic analysis.

### PHYLOGENETIC ANALYSIS

Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, Second Edition Andreas D. Baxevanis, B.F. Francis Ouellette Copyright 2001 John Wiley & Sons, Inc. ISBNs: 0-471-38390-2 (Hardback);

Phylogenetic Trees Made Easy A How-To Manual Fourth Edition Barry G. Hall University of Rochester, Emeritus and Bellingham Research Institute Sinauer Associates, Inc. Publishers Sunderland, Massachusetts

### Network Protocol Analysis using Bioinformatics Algorithms

Network Protocol Analysis using Bioinformatics Algorithms Marshall A. Beddoe Marshall_Beddoe@McAfee.com ABSTRACT Network protocol analysis is currently performed by hand using only intuition and a protocol

### Introduction to Bioinformatics AS 250.265 Laboratory Assignment 6

Introduction to Bioinformatics AS 250.265 Laboratory Assignment 6 In the last lab, you learned how to perform basic multiple sequence alignments. While useful in themselves for determining conserved residues

### Sequence Analysis 15: lecture 5. Substitution matrices Multiple sequence alignment

Sequence Analysis 15: lecture 5 Substitution matrices Multiple sequence alignment A teacher's dilemma To understand... Multiple sequence alignment Substitution matrices Phylogenetic trees You first need

### Maximum-Likelihood Estimation of Phylogeny from DNA Sequences When Substitution Rates Differ over Sites1

Maximum-Likelihood Estimation of Phylogeny from DNA Sequences When Substitution Rates Differ over Sites1 Ziheng Yang Department of Animal Science, Beijing Agricultural University Felsenstein s maximum-likelihood

### Gene Trees in Species Trees

Computational Biology Class Presentation Gene Trees in Species Trees Wayne P. Maddison Published in Systematic Biology, Vol. 46, No.3 (Sept 1997), pp. 523-536 Overview Gene Trees and Species Trees Processes

### Bio-Informatics Lectures. A Short Introduction

Bio-Informatics Lectures A Short Introduction The History of Bioinformatics Sanger Sequencing PCR in presence of fluorescent, chain-terminating dideoxynucleotides Massively Parallel Sequencing Massively

### Taxonomy Five Kingdoms

Taxonomy Five Kingdoms R.H Whittaker 1969 Three Domains Evolutionary Trees Cladistics: Cladograms and molecular data What are the tools used by scientists to observe and understand evolutionary relationships?

### Student Guide for Mesquite

MESQUITE Student User Guide 1 Student Guide for Mesquite This guide describes how to 1. create a project file, 2. construct phylogenetic trees, and 3. map trait evolution on branches (e.g. morphological

### FROM PROTEIN SEQUENCES TO PHYLOGENETIC TREES

FROM PROTEIN SEQUENCES TO PHYLOGENETIC TREES Robert Hirt Department of Zoology, The Natural History Museum, London Agenda Remind you that molecular phylogenetics is complex the more you know about the

### A comparison of methods for estimating the transition:transversion ratio from DNA sequences

Molecular Phylogenetics and Evolution 32 (2004) 495 503 MOLECULAR PHYLOGENETICS AND EVOLUTION www.elsevier.com/locate/ympev A comparison of methods for estimating the transition:transversion ratio from

### Final Project Report

CPSC545 by Introduction to Data Mining Prof. Martin Schultz & Prof. Mark Gerstein Student Name: Yu Kor Hugo Lam Student ID : 904907866 Due Date : May 7, 2007 Introduction Final Project Report Pseudogenes

### NSilico Life Science Introductory Bioinformatics Course

NSilico Life Science Introductory Bioinformatics Course INTRODUCTORY BIOINFORMATICS COURSE A public course delivered over three days on the fundamentals of bioinformatics and illustrated with lectures,

### PROC. CAIRO INTERNATIONAL BIOMEDICAL ENGINEERING CONFERENCE 2006 1. E-mail: msm_eng@k-space.org

BIOINFTool: Bioinformatics and sequence data analysis in molecular biology using Matlab Mai S. Mabrouk 1, Marwa Hamdy 2, Marwa Mamdouh 2, Marwa Aboelfotoh 2,Yasser M. Kadah 2 1 Biomedical Engineering Department,

### Phylogenetic Analysis using MapReduce Programming Model

2015 IEEE International Parallel and Distributed Processing Symposium Workshops Phylogenetic Analysis using MapReduce Programming Model Siddesh G M, K G Srinivasa*, Ishank Mishra, Abhinav Anurag, Eklavya

### Principles of Evolution - Origin of Species

Theories of Organic Evolution X Multiple Centers of Creation (de Buffon) developed the concept of "centers of creation throughout the world organisms had arisen, which other species had evolved from X

### Biology 164 Laboratory PHYLOGENETIC SYSTEMATICS

Biology 164 Laboratory PHYLOGENETIC SYSTEMATICS Objectives 1. To become familiar with the cladistic approach to reconstruction of phylogenies. 2. To construct a character matrix and phylogeny for a group

### Name Class Date. binomial nomenclature. MAIN IDEA: Linnaeus developed the scientific naming system still used today.

Section 1: The Linnaean System of Classification 17.1 Reading Guide KEY CONCEPT Organisms can be classified based on physical similarities. VOCABULARY taxonomy taxon binomial nomenclature genus MAIN IDEA:

### Overview. bayesian coalescent analysis (of viruses) The coalescent. Coalescent inference

Overview bayesian coalescent analysis (of viruses) Introduction to the Coalescent Phylodynamics and the Bayesian skyline plot Molecular population genetics for viruses Testing model assumptions Alexei

### An experimental study comparing linguistic phylogenetic reconstruction methods *

An experimental study comparing linguistic phylogenetic reconstruction methods * François Barbançon, a Steven N. Evans, b Luay Nakhleh c, Don Ringe, d and Tandy Warnow, e, a Palantir Technologies, 100

### Multiple Sequence Alignment

Multiple Sequence Alignment Definition Given N sequences x 1, x 2,, x N : Insert gaps (-) in each sequence x i, such that All sequences have the same length L Score of the global map is maximum Applications

### A branch-and-bound algorithm for the inference of ancestral. amino-acid sequences when the replacement rate varies among

A branch-and-bound algorithm for the inference of ancestral amino-acid sequences when the replacement rate varies among sites Tal Pupko 1,*, Itsik Pe er 2, Masami Hasegawa 1, Dan Graur 3, and Nir Friedman

### Unit 6: Classification and Diversity. KEY CONCEPT Organisms can be classified based on physical similarities.

KEY CONCEPT Organisms can be classified based on physical similarities. Linnaeus developed the scientific naming system still used today. Taxonomy is the science of naming and classifying organisms. White

### Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.

Name: Class: Date: Chapter 17 Practice Multiple Choice Identify the choice that best completes the statement or answers the question. 1. The correct order for the levels of Linnaeus's classification system,

### Accuracies of Ancestral Amino Acid Sequences Inferred by the Parsimony, Likelihood, and Distance Methods

J Mol Evol (1997) 44(Suppl 1):S139 S146 Springer-Verlag New York Inc. 1997 Accuracies of Ancestral Amino Acid Sequences Inferred by the Parsimony, Likelihood, and Distance Methods Jianzhi Zhang, Masatoshi

### 1 Phylogenetic History: The Evolution of Marine Mammals

1 Phylogenetic History: The Evolution of Marine Mammals Think for a moment about marine mammals: seals, walruses, dugongs and whales. Seals and walruses are primarily cold-water species that eat mostly

### Performance study of supertree methods

Q&A How did you become involved in doing research? I applied for an REU (Research for Undergraduates) at KU last summer and worked with Dr. Mark Holder for ten weeks. It was an amazing experience! How

### Schools of Systematics. Schools of Systematics. What is Evolutionary Systematics? What is Evolutionary Systematics? What is Evolutionary Systematics?

Topic 3: Systematics II What are the schools of thought of systematics? How does one make a cladogram? What are higher taxa & ranks? How should they be used in this course? Applying phylogenies Evolution

### Simon Tavarg Statistics Department Colorado State University Fort Collins, CO ABSTRACT

Simon Tavarg Statistics Department Colorado State University Fort Collins, CO 80523 ABSTRACT Some models for the estimation of substitution rates from pairs of functionally homologous DNA sequences are

### Mammalian Housekeeping Genes Evolve more Slowly. than Tissue-specific Genes

MBE Advance Access published October 31, 2003 Mammalian Housekeeping Genes Evolve more Slowly than Tissue-specific Genes Liqing Zhang and Wen-Hsiung Li Department of Ecology and Evolution, University of

### A Step-by-Step Tutorial: Divergence Time Estimation with Approximate Likelihood Calculation Using MCMCTREE in PAML

9 June 2011 A Step-by-Step Tutorial: Divergence Time Estimation with Approximate Likelihood Calculation Using MCMCTREE in PAML by Jun Inoue, Mario dos Reis, and Ziheng Yang In this tutorial we will analyze

### Ch. 13 How Populations Evolve Period. 4. Describe Lamarck s proposed theory of evolution, The Theory of Acquired Traits.

Ch. 13 How Populations Evolve Name Period California State Standards covered by this chapter: Evolution 7. The frequency of an allele in a gene pool of a population depends on many factors and may be stable

### I. Use BLAST to Find DNA Sequences in Databases (Electronic PCR)

Using DNA Barcodes to Identify and Classify Living Things: Bioinformatics I. Use BLAST to Find DNA Sequences in Databases (Electronic PCR) 1. Perform a BLAST search as follows: a) Do an Internet search

### How to report the percentage of explained common variance in exploratory factor analysis

UNIVERSITAT ROVIRA I VIRGILI How to report the percentage of explained common variance in exploratory factor analysis Tarragona 2013 Please reference this document as: Lorenzo-Seva, U. (2013). How to report

### Chapter 7. Hierarchical cluster analysis. Contents 7-1

7-1 Chapter 7 Hierarchical cluster analysis In Part 2 (Chapters 4 to 6) we defined several different ways of measuring distance (or dissimilarity as the case may be) between the rows or between the columns

### A short guide to phylogeny reconstruction

A short guide to phylogeny reconstruction E. Michu Institute of Biophysics, Academy of Sciences of the Czech Republic, Brno, Czech Republic ABSTRACT This review is a short introduction to phylogenetic

### Bayesian Phylogeny and Measures of Branch Support

Bayesian Phylogeny and Measures of Branch Support Bayesian Statistics Imagine we have a bag containing 100 dice of which we know that 90 are fair and 10 are biased. The

### RETRIEVING SEQUENCE INFORMATION. Nucleotide sequence databases. Database search. Sequence alignment and comparison

RETRIEVING SEQUENCE INFORMATION Nucleotide sequence databases Database search Sequence alignment and comparison Biological sequence databases Originally just a storage place for sequences. Currently the

### UCHIME in practice Single-region sequencing Reference database mode

UCHIME in practice Single-region sequencing UCHIME is designed for experiments that perform community sequencing of a single region such as the 16S rrna gene or fungal ITS region. While UCHIME may prove

### Minimum Spanning Trees

Minimum Spanning Trees Algorithms and 18.304 Presentation Outline 1 Graph Terminology Minimum Spanning Trees 2 3 Outline Graph Terminology Minimum Spanning Trees 1 Graph Terminology Minimum Spanning Trees

### of interrelatedness and decent with modification. However, technological advances are opening another powerful avenue of comparison: DNA sequences.

SCI115SC Module 3 AVP Transcript Title: Genetic Similarity Title Slide Narrator: Welcome to this presentation on genetic similarity. Slide 2 Title: Underlying the Outwardly Visible Similarities, We Find

### Distributions of Statistics Used for the Comparison of Models of Sequence Evolution in Phylogenetics

Distributions of Statistics Used for the Comparison of Models of Sequence Evolution in Phylogenetics Simon Whelan and Nick Goldman Department of Genetics, University of Cambridge Asymptotic statistical

### Parallel Adaptations to High Temperatures in the Archean Eon

Parallel Adaptations to High Temperatures in the Archean Eon Samuel Blanquart a1 Bastien Boussau b1 Anamaria Necşulea b Nicolas Lartillot a Manolo Gouy b June 9, 2008 a LIRMM, CNRS. b BBE, CNRS, Université

### 6 Creating the Animation

6 Creating the Animation Now that the animation can be represented, stored, and played back, all that is left to do is understand how it is created. This is where we will use genetic algorithms, and this

### DnaSP, DNA polymorphism analyses by the coalescent and other methods.

DnaSP, DNA polymorphism analyses by the coalescent and other methods. Author affiliation: Julio Rozas 1, *, Juan C. Sánchez-DelBarrio 2,3, Xavier Messeguer 2 and Ricardo Rozas 1 1 Departament de Genètica,

### Pairwise Sequence Alignment

Pairwise Sequence Alignment carolin.kosiol@vetmeduni.ac.at SS 2013 Outline Pairwise sequence alignment global - Needleman Wunsch Gotoh algorithm local - Smith Waterman algorithm BLAST - heuristics What

### Consensus trees and tree support

Newsletter 64 28 Consensus trees and tree support In this article I will look at two separate issues; consensus trees and support for the nodes on your tree. There is a tenuous link between these as we

### THREE DIMENSIONAL REPRESENTATION OF AMINO ACID CHARAC- TERISTICS

THREE DIMENSIONAL REPRESENTATION OF AMINO ACID CHARAC- TERISTICS O.U. Sezerman 1, R. Islamaj 2, E. Alpaydin 2 1 Laborotory of Computational Biology, Sabancı University, Istanbul, Turkey. 2 Computer Engineering

### Bayesian coalescent inference of population size history

Bayesian coalescent inference of population size history Alexei Drummond University of Auckland Workshop on Population and Speciation Genomics, 2016 1st February 2016 1 / 39 BEAST tutorials Population

### LAB 21 Using Bioinformatics to Investigate Evolutionary Relationships; Have a BLAST!

LAB 21 Using Bioinformatics to Investigate Evolutionary Relationships; Have a BLAST! Introduction: Between 1990-2003, scientists working on an international research project known as the Human Genome Project,

### Inference of Large Phylogenetic Trees on Parallel Architectures. Michael Ott

Inference of Large Phylogenetic Trees on Parallel Architectures Michael Ott TECHNISCHE UNIVERSITÄT MÜNCHEN Lehrstuhl für Rechnertechnik und Rechnerorganisation / Parallelrechnerarchitektur Inference of

### Hierarchical Bayesian Modeling of the HIV Response to Therapy

Hierarchical Bayesian Modeling of the HIV Response to Therapy Shane T. Jensen Department of Statistics, The Wharton School, University of Pennsylvania March 23, 2010 Joint Work with Alex Braunstein and

### Amino Acids and Their Properties

Amino Acids and Their Properties Recap: ss-rrna and mutations Ribosomal RNA (rrna) evolves very slowly Much slower than proteins ss-rrna is typically used So by aligning ss-rrna of one organism with that

### Molecular Clocks and Tree Dating with r8s and BEAST

Integrative Biology 200B University of California, Berkeley Principals of Phylogenetics: Ecology and Evolution Spring 2011 Updated by Nick Matzke Molecular Clocks and Tree Dating with r8s and BEAST Today

### Evolutionary Background Key Concepts

Evolutionary Background Key Concepts Adaptation Descent with modification (Evolution as history) Natural Selection (Evolution as process) Some Special Problems What do we mean by adaptation? Altruism Group

### BIO 3350: ELEMENTS OF BIOINFORMATICS PARTIALLY ONLINE SYLLABUS

BIO 3350: ELEMENTS OF BIOINFORMATICS PARTIALLY ONLINE SYLLABUS NEW YORK CITY COLLEGE OF TECHNOLOGY The City University Of New York School of Arts and Sciences Biological Sciences Department Course title:

### A response to charges of error in Biology by Miller & Levine

A response to charges of error in Biology by Miller & Levine According to TEA, a citizen disputes two sentences on page 767 of our textbook, Biology, by Miller & Levine. These sentences are: SE 767, par.

### A Randomized Algorithm for Distance Matrix Calculations in Multiple Sequence Alignment *

33 A Randomized Algorithm for Distance Matrix Calculations in Multiple Sequence Alignment * Sanguthevar Rajasekaran, Vishal Thapar, Hardik Dave, and Chun-Hsi Huang School of Computer Science and Engineering,

### MAKING AN EVOLUTIONARY TREE

Student manual MAKING AN EVOLUTIONARY TREE THEORY The relationship between different species can be derived from different information sources. The connection between species may turn out by similarities

### A Non-Linear Schema Theorem for Genetic Algorithms

A Non-Linear Schema Theorem for Genetic Algorithms William A Greene Computer Science Department University of New Orleans New Orleans, LA 70148 bill@csunoedu 504-280-6755 Abstract We generalize Holland

### Activity IT S ALL RELATIVES The Role of DNA Evidence in Forensic Investigations

Activity IT S ALL RELATIVES The Role of DNA Evidence in Forensic Investigations SCENARIO You have responded, as a result of a call from the police to the Coroner s Office, to the scene of the death of

### Teaching Animal Molecular Phylogenetics By C. Leon Harris

Teaching Animal Molecular Phylogenetics By C. Leon Harris PREFACE To many zoology instructors, the phylogeny of animals is merely a convenience for organizing textbooks and courses, and not to be taken

### How Populations Evolve

How Populations Evolve Darwin and the Origin of the Species Charles Darwin published On the Origin of Species by Means of Natural Selection, November 24, 1859. Darwin presented two main concepts: Life

### Pairwise sequence alignments

Pairwise sequence alignments Volker Flegel Vassilios Ioannidis VI - 2004 Page 1 Outline Introduction Definitions Biological context of pairwise alignments Computing of pairwise alignments Some programs

### From Sequence Analysis to Simulations:

From Sequence Analysis to Simulations: Applications of HPC in Modern Biology R. Sankararamakrishnan Department of Biological Sciences & Bioengineering IIT-Kanpur IIT-K REACH Symposium 2010 Oct 9 th 2010

### S = {1, 2,..., n}. P (1, 1) P (1, 2)... P (1, n) P (2, 1) P (2, 2)... P (2, n) P = . P (n, 1) P (n, 2)... P (n, n)

last revised: 26 January 2009 1 Markov Chains A Markov chain process is a simple type of stochastic process with many social science applications. We ll start with an abstract description before moving

### Biology Performance Level Descriptors

Limited A student performing at the Limited Level demonstrates a minimal command of Ohio s Learning Standards for Biology. A student at this level has an emerging ability to describe genetic patterns of
