Reconstructing phylogenies: how? how well? why?

Size: px
Start display at page:

Download "Reconstructing phylogenies: how? how well? why?"

Transcription

1 Joe Felsenstein Department of Genome Sciences and Department of Biology University of Washington, Seattle p.1/29

2 A review that asks these questions What are some of the strengths and weaknesses of different ways of reconstructing evolutionary trees (phylogenies)? p.2/29

3 A review that asks these questions What are some of the strengths and weaknesses of different ways of reconstructing evolutionary trees (phylogenies)? How can we find out how accurate we may have been in reconstructing the phylogeny? p.2/29

4 A review that asks these questions What are some of the strengths and weaknesses of different ways of reconstructing evolutionary trees (phylogenies)? How can we find out how accurate we may have been in reconstructing the phylogeny? Why do we want to reconstruct it? What are phylogenies used for? p.2/29

5 What does tree space (with branch lengths) look like? an example: three species with a clock A B C trifurcation t 1 t 2 t 1 not possible OK etc. t 2 when we consider all three possible topologies, the space looks like: t 1 t 2 p.3/29

6 For one tree topology The space of trees varying all 2n 3 branch lengths, each a nonegative number, defines an orthant" (open corner) of a 2n 3-dimensional real space: B A v 1 F v 6 v 2 3 C wall 8v 9 v v v 7 v 5 v 4 D floor wall v 9 E p.4/29

7 Through the looking-glass Shrinking one of the n 1 interior branches to 0, we arrive at a trifurcation: B A v 1 F v 6 v 2 3 v v v 8 7 v 9 v 5 v 4 C D A v 1 F v 6 v 7 B v 2 v3 v C 8 v 4 D v 9 v 5 E A v 1 F v 6 B v 2 3 v v v 8 7 E v 5 v 4 D C A v 1 F v 6 v 7 v 9 B v 2 v3 v8 v v 4 5 C E E Here, as we pass through the looking glass" we are also touch the space for two other tree topologies, and we could decide to enter either. D p.5/29

8 The graph of all trees of 5 species The space of all these orthants, one for each topology, connecting ones that share faces (looking glasses): A C D B E A D B E C B C E D A A E B C D B A C D E B D C A E A B C D E A B C E D A B D C E A D B C E A C B D E C A B D E A E B D C A E C B D D A B E C The Schoenberg graph (all 15 trees of size 5 connected by NNI s) p.6/29

9 There are very large numbers of trees For 21 species, the number of possible unrooted tree topologies exceeds Avogadro s Number: it is = 8, 200, 794, 532, 637, 891, 559, and that s not even asking about how hard it is to optimize the 39 branch lengths for each of these trees. What this goes with is that most methods of finding the best tree are NP-hard, and not easy to approximate either. p.7/29

10 Parsimony methods Alpha Delta Gamma Beta Epsilon Sites Species Alpha T A G C A T Beta C A A G C T Gamma T C G G C T Delta T C G C A A Epsilon C A A C A T p.8/29

11 Advantages and disadvantages of parsimony methods Disadvantage: not model-based so people think it makes no assumptions. p.9/29

12 Advantages and disadvantages of parsimony methods Disadvantage: not model-based so people think it makes no assumptions. Advantage: reasonably fast, no search of branch lengths needed and quick to compute the criterion. p.9/29

13 Advantages and disadvantages of parsimony methods Disadvantage: not model-based so people think it makes no assumptions. Advantage: reasonably fast, no search of branch lengths needed and quick to compute the criterion. Advantage: good statistical properties when amounts of change are small. p.9/29

14 Advantages and disadvantages of parsimony methods Disadvantage: not model-based so people think it makes no assumptions. Advantage: reasonably fast, no search of branch lengths needed and quick to compute the criterion. Advantage: good statistical properties when amounts of change are small. Disadvantage: statistical misbehavior (inconsistency) when some nearby branches on the tree are long (Long Branch Attraction). p.9/29

15 Advantages and disadvantages of parsimony methods Disadvantage: not model-based so people think it makes no assumptions. Advantage: reasonably fast, no search of branch lengths needed and quick to compute the criterion. Advantage: good statistical properties when amounts of change are small. Disadvantage: statistical misbehavior (inconsistency) when some nearby branches on the tree are long (Long Branch Attraction). Disadvantage: likely to make you think you have William of Ockham s endorsement. p.9/29

16 Advantages and disadvantages of parsimony methods Disadvantage: not model-based so people think it makes no assumptions. Advantage: reasonably fast, no search of branch lengths needed and quick to compute the criterion. Advantage: good statistical properties when amounts of change are small. Disadvantage: statistical misbehavior (inconsistency) when some nearby branches on the tree are long (Long Branch Attraction). Disadvantage: likely to make you think you have William of Ockham s endorsement. Disadvantage: may lead to the delusion that you know exactly what happened in evolution, in detail. p.9/29

17 The sequences: Distance matrix methods yield distances: A B C D E A B C D E CCTAACCTCTGACCC... CGTAACCTCCGGCCC... CGTAACCTCTGGCCC... CGCAACCTCTGGCTC... CCTAACCTCTGGCCC... A B C D alter tree until predictions match observed distances as closely as possible E A suggested tree: A C D B E predicts: A B C D E compare: A B C D E p.10/29

18 Advantages and disadvantages of distance methods Advantage: model-based so assumptions are clearer. p.11/29

19 Advantages and disadvantages of distance methods Advantage: model-based so assumptions are clearer. Advantage: it s geometry so mathematical scientists love it. p.11/29

20 Advantages and disadvantages of distance methods Advantage: model-based so assumptions are clearer. Advantage: it s geometry so mathematical scientists love it. Advantage: often fast (especially Neighbor-Joining method), can handle large numbers of sequences. p.11/29

21 Advantages and disadvantages of distance methods Advantage: model-based so assumptions are clearer. Advantage: it s geometry so mathematical scientists love it. Advantage: often fast (especially Neighbor-Joining method), can handle large numbers of sequences. Disadvantage: not using data fully statistically efficiently. p.11/29

22 Advantages and disadvantages of distance methods Advantage: model-based so assumptions are clearer. Advantage: it s geometry so mathematical scientists love it. Advantage: often fast (especially Neighbor-Joining method), can handle large numbers of sequences. Disadvantage: not using data fully statistically efficiently. Advantage: when tested by simulation, found to be surprisingly efficient anyway. p.11/29

23 Advantages and disadvantages of distance methods Advantage: model-based so assumptions are clearer. Advantage: it s geometry so mathematical scientists love it. Advantage: often fast (especially Neighbor-Joining method), can handle large numbers of sequences. Disadvantage: not using data fully statistically efficiently. Advantage: when tested by simulation, found to be surprisingly efficient anyway. Disadvantage: cannot easily propagate some information about local features in the sequences from one distance calculation to another. p.11/29

24 Advantages and disadvantages of distance methods Advantage: model-based so assumptions are clearer. Advantage: it s geometry so mathematical scientists love it. Advantage: often fast (especially Neighbor-Joining method), can handle large numbers of sequences. Disadvantage: not using data fully statistically efficiently. Advantage: when tested by simulation, found to be surprisingly efficient anyway. Disadvantage: cannot easily propagate some information about local features in the sequences from one distance calculation to another. Disadvantage: it s geometry so mathematical scientists hang onto it beyond the point of reason. p.11/29

25 Maximum likelihood A C C C G t i are t 1 "branch lengths", x t2 t 7 t 3 z t 4 t 5 y t 6 (rate X time) w t 8 To compute the likelihood for one site, sum over all possible states (bases) at interior nodes: L (i) = x y z w Prob (w) Prob (x w, t 7 ) Prob (A x, t 1 ) Prob (C x, t 2 ) Prob (z w, t 8 ) Prob (C z, t 3 ) Prob (y z, t 6 ) Prob (C y, t 4 ) Prob (G y, t 5 ) p.12/29

26 Advantages and disadvantages of likelihood Advantage: uses a model, so assumptions are clear. p.13/29

27 Advantages and disadvantages of likelihood Advantage: uses a model, so assumptions are clear. Advantage: fully statistically efficient. p.13/29

28 Advantages and disadvantages of likelihood Advantage: uses a model, so assumptions are clear. Advantage: fully statistically efficient. Disadvantage: computationally slower. p.13/29

29 Advantages and disadvantages of likelihood Advantage: uses a model, so assumptions are clear. Advantage: fully statistically efficient. Disadvantage: computationally slower. Advantage: statistical testing by likelihood ratio tests available p.13/29

30 Advantages and disadvantages of likelihood Advantage: uses a model, so assumptions are clear. Advantage: fully statistically efficient. Disadvantage: computationally slower. Advantage: statistical testing by likelihood ratio tests available Disadvantage: can t use the LRT test to test tree topologies. p.13/29

31 Bayesian inference methods Basically uses the likelihood machinery, and adds priors on parameters and on trees. Implemented by Markov chain Monte Carlo methods to sample from the posterior on trees (or parameters, or both). Very popular right now. Advantage: interpretation is straightforward, once the assumptions are met. p.14/29

32 Bayesian inference methods Basically uses the likelihood machinery, and adds priors on parameters and on trees. Implemented by Markov chain Monte Carlo methods to sample from the posterior on trees (or parameters, or both). Very popular right now. Advantage: interpretation is straightforward, once the assumptions are met. Advantage: gives you what you want, the probability of the result. p.14/29

33 Bayesian inference methods Basically uses the likelihood machinery, and adds priors on parameters and on trees. Implemented by Markov chain Monte Carlo methods to sample from the posterior on trees (or parameters, or both). Very popular right now. Advantage: interpretation is straightforward, once the assumptions are met. Advantage: gives you what you want, the probability of the result. Disadvantage: how long is long enough to run the MCMC? p.14/29

34 Bayesian inference methods Basically uses the likelihood machinery, and adds priors on parameters and on trees. Implemented by Markov chain Monte Carlo methods to sample from the posterior on trees (or parameters, or both). Very popular right now. Advantage: interpretation is straightforward, once the assumptions are met. Advantage: gives you what you want, the probability of the result. Disadvantage: how long is long enough to run the MCMC? Disadvantage: where do we get priors from, what effect do they have? p.14/29

35 Bayesian inference methods Basically uses the likelihood machinery, and adds priors on parameters and on trees. Implemented by Markov chain Monte Carlo methods to sample from the posterior on trees (or parameters, or both). Very popular right now. Advantage: interpretation is straightforward, once the assumptions are met. Advantage: gives you what you want, the probability of the result. Disadvantage: how long is long enough to run the MCMC? Disadvantage: where do we get priors from, what effect do they have? Disadvantage: they keep chanting in unison We are the statisticians of Bayes you will be assimilated. p.14/29

36 Aren t these graphical models? x x x x x x x x x x x x x x x v 1 x x x x x v 5 v 6 x x x x x v v 3 x x x x x x x x x x v 8 v 7 x x x x x v 9 x x x x x (You have to imaging it going back 500 layers or so). The problem is to use the data, which is at the tips but not available for the interior nodes, to infer the topology and branch lengths of the tree that is shared by all sites. p.15/29

37 Could we use graphical model machinery here? Like Moliére s character who is delighted to discover that he s been speaking prose all his life, we found we had already been using the relevant Graphical Model machinery since about So alas there was nothing to gain. The same thing is true for statistical genetics, where the graphical model machinery reinvents the standard peeling algorithms for computing likelihoods on pedigrees, in use since p.16/29

38 Bootstrap sampling of phylogenies Original Data sites sequences p.17/29

39 Draw columns randomly with replacement Original Data sites sequences Bootstrap sample #1 sites Estimate of the tree sequences sample same number of sites, with replacement p.18/29

40 Make a tree from that resampled data set Original Data sites sequences Bootstrap sample #1 sites Estimate of the tree sequences sample same number of sites, with replacement Bootstrap estimate of the tree, #1 p.19/29

41 Draw another bootstrap sample Original Data sites sequences Bootstrap sample #1 sites Estimate of the tree sequences sample same number of sites, with replacement Bootstrap sample #2 sequences sites sample same number of sites, with replacement Bootstrap estimate of the tree, #1 (and so on) p.20/29

42 ... and get a tree for it too. And so on. Original Data sites sequences Bootstrap sample #1 sites Estimate of the tree sequences sample same number of sites, with replacement Bootstrap sample #2 sequences sites sample same number of sites, with replacement Bootstrap estimate of the tree, #1 (and so on) Bootstrap estimate of the tree, #2 p.21/29

43 Summarizing the cloud of trees by support for branches Bovine Mouse Squir Monk Chimp Human Gorilla Orang Gibbon Rhesus Mac Jpn Macaq Crab E.Mac BarbMacaq Tarsier Lemur p.22/29

44 Some alternatives to bootstrapping Parametric bootstrapping same, but simulate data sets from our best estimate of the tree instead of sampling sites. p.23/29

45 Some alternatives to bootstrapping Parametric bootstrapping same, but simulate data sets from our best estimate of the tree instead of sampling sites. Bayesian inference of course gets statistical support information from the posterior. p.23/29

46 Some alternatives to bootstrapping Parametric bootstrapping same, but simulate data sets from our best estimate of the tree instead of sampling sites. Bayesian inference of course gets statistical support information from the posterior. The Kishino-Hasegawa-Templeton test (KHT test) which compares prespecified trees to each other by paired sites tests. p.23/29

47 Why want to know the tree? It affects all parts of the genomes it is the essential part of propagating information about the evolution of one part of the genome to inquiries about another part. The standard method for finding functional regions of the genome is now using PhyloHMMs which use Hidden Markov Model machinery together with phylogenies to find regions that have unusually low rates of evolution. p.24/29

48 Another kind of tree: the coalescent Coalescent trees are trees of ancestry of copies of a single gene locus within a species. They are weakly inferrable as most have only a few sites (SNPs) varying among individuals. Since each coalescent tree applies to a very short region of genome, maybe as little as one gene, there is less interest in the tree. But they do illuminate the values of parameters such as population size, migration rates, recombination rates etc. This allows us to accumulate information across different loci (genes). To don this we have to sum over our uncertainty about the tree by using MCMC methods, accumulating the information (as log likelihood or using Bayesian machinery) to make inferences about the parameters. This is the interface between within-species population genetics and between-species work on phylogenies. It is also the statistical foundation of inferences from mitochondrial genealogies ( mitochondrial Eve ) and Y chromosome genealogies, and of the samples from the rest of the genome that are now being added to this. p.25/29

49 A coalescent Time p.26/29

50 Yet another kind of tree: trees of gene families Gene duplications in evolution create new genes. Both the new gene and the original one then evolve. Frog Human Monkey Squirrel a b a b a b species boundary gene duplication tree of genes Some forks are gene duplications, leading to subtrees that are all supposed to have the same phylogeny as they are in the same set of species. Example: Hemoglobin proteins. p.27/29

51 Yet another kind of tree: trees of gene families Gene duplications in evolution create new genes. Both the new gene and the original one then evolve. Frog Human Monkey Squirrel a b a b a b Frog Human Monkey Squirrel Human Monkey Squirrel a a a b b b species boundary gene duplication tree of genes These two trees should be identical Some forks are gene duplications, leading to subtrees that are all supposed to have the same phylogeny as they are in the same set of species. Example: Hemoglobin proteins. p.28/29

52 References Felsenstein, J Inferring Phylogenies. Sinauer Associates, Sunderland, Massachusetts. [Book you and all your friends must rush out and buy] Semple, C. and M. Steel Phylogenetics. Oxford Lecture Series in Mathematics and Its Applications, 24. Oxford University Press. [More rigorous mathematical treatment] Yang, Z Computational Molecular Evolution. Oxford Series in Ecology and Evolution. Oxford University Press, Oxford. [Careful survey of molecular phylogeny methods, from a leader] For a list of 348 phylogeny programs, many available free, see p.29/29

Phylogenetic Trees Made Easy

Phylogenetic Trees Made Easy Phylogenetic Trees Made Easy A How-To Manual Fourth Edition Barry G. Hall University of Rochester, Emeritus and Bellingham Research Institute Sinauer Associates, Inc. Publishers Sunderland, Massachusetts

More information

Bayesian Phylogeny and Measures of Branch Support

Bayesian Phylogeny and Measures of Branch Support Bayesian Phylogeny and Measures of Branch Support Bayesian Statistics Imagine we have a bag containing 100 dice of which we know that 90 are fair and 10 are biased. The

More information

Bayesian coalescent inference of population size history

Bayesian coalescent inference of population size history Bayesian coalescent inference of population size history Alexei Drummond University of Auckland Workshop on Population and Speciation Genomics, 2016 1st February 2016 1 / 39 BEAST tutorials Population

More information

Model-based Synthesis. Tony O Hagan

Model-based Synthesis. Tony O Hagan Model-based Synthesis Tony O Hagan Stochastic models Synthesising evidence through a statistical model 2 Evidence Synthesis (Session 3), Helsinki, 28/10/11 Graphical modelling The kinds of models that

More information

Introduction to Phylogenetic Analysis

Introduction to Phylogenetic Analysis Subjects of this lecture Introduction to Phylogenetic nalysis Irit Orr 1 Introducing some of the terminology of phylogenetics. 2 Introducing some of the most commonly used methods for phylogenetic analysis.

More information

PHYML Online: A Web Server for Fast Maximum Likelihood-Based Phylogenetic Inference

PHYML Online: A Web Server for Fast Maximum Likelihood-Based Phylogenetic Inference PHYML Online: A Web Server for Fast Maximum Likelihood-Based Phylogenetic Inference Stephane Guindon, F. Le Thiec, Patrice Duroux, Olivier Gascuel To cite this version: Stephane Guindon, F. Le Thiec, Patrice

More information

A comparison of methods for estimating the transition:transversion ratio from DNA sequences

A comparison of methods for estimating the transition:transversion ratio from DNA sequences Molecular Phylogenetics and Evolution 32 (2004) 495 503 MOLECULAR PHYLOGENETICS AND EVOLUTION www.elsevier.com/locate/ympev A comparison of methods for estimating the transition:transversion ratio from

More information

Bio-Informatics Lectures. A Short Introduction

Bio-Informatics Lectures. A Short Introduction Bio-Informatics Lectures A Short Introduction The History of Bioinformatics Sanger Sequencing PCR in presence of fluorescent, chain-terminating dideoxynucleotides Massively Parallel Sequencing Massively

More information

Algorithms in Computational Biology (236522) spring 2007 Lecture #1

Algorithms in Computational Biology (236522) spring 2007 Lecture #1 Algorithms in Computational Biology (236522) spring 2007 Lecture #1 Lecturer: Shlomo Moran, Taub 639, tel 4363 Office hours: Tuesday 11:00-12:00/by appointment TA: Ilan Gronau, Taub 700, tel 4894 Office

More information

Hierarchical Bayesian Modeling of the HIV Response to Therapy

Hierarchical Bayesian Modeling of the HIV Response to Therapy Hierarchical Bayesian Modeling of the HIV Response to Therapy Shane T. Jensen Department of Statistics, The Wharton School, University of Pennsylvania March 23, 2010 Joint Work with Alex Braunstein and

More information

Arbres formels et Arbre(s) de la Vie

Arbres formels et Arbre(s) de la Vie Arbres formels et Arbre(s) de la Vie A bit of history and biology Definitions Numbers Topological distances Consensus Random models Algorithms to build trees Basic principles DATA sequence alignment distance

More information

Introduction to Bioinformatics AS 250.265 Laboratory Assignment 6

Introduction to Bioinformatics AS 250.265 Laboratory Assignment 6 Introduction to Bioinformatics AS 250.265 Laboratory Assignment 6 In the last lab, you learned how to perform basic multiple sequence alignments. While useful in themselves for determining conserved residues

More information

What mathematical optimization can, and cannot, do for biologists. Steven Kelk Department of Knowledge Engineering (DKE) Maastricht University, NL

What mathematical optimization can, and cannot, do for biologists. Steven Kelk Department of Knowledge Engineering (DKE) Maastricht University, NL What mathematical optimization can, and cannot, do for biologists Steven Kelk Department of Knowledge Engineering (DKE) Maastricht University, NL Introduction There is no shortage of literature about the

More information

Visualization of Phylogenetic Trees and Metadata

Visualization of Phylogenetic Trees and Metadata Visualization of Phylogenetic Trees and Metadata November 27, 2015 Sample to Insight CLC bio, a QIAGEN Company Silkeborgvej 2 Prismet 8000 Aarhus C Denmark Telephone: +45 70 22 32 44 www.clcbio.com support-clcbio@qiagen.com

More information

Sequence Analysis 15: lecture 5. Substitution matrices Multiple sequence alignment

Sequence Analysis 15: lecture 5. Substitution matrices Multiple sequence alignment Sequence Analysis 15: lecture 5 Substitution matrices Multiple sequence alignment A teacher's dilemma To understand... Multiple sequence alignment Substitution matrices Phylogenetic trees You first need

More information

Statistics Graduate Courses

Statistics Graduate Courses Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.

More information

Maximum-Likelihood Estimation of Phylogeny from DNA Sequences When Substitution Rates Differ over Sites1

Maximum-Likelihood Estimation of Phylogeny from DNA Sequences When Substitution Rates Differ over Sites1 Maximum-Likelihood Estimation of Phylogeny from DNA Sequences When Substitution Rates Differ over Sites1 Ziheng Yang Department of Animal Science, Beijing Agricultural University Felsenstein s maximum-likelihood

More information

Missing data and the accuracy of Bayesian phylogenetics

Missing data and the accuracy of Bayesian phylogenetics Journal of Systematics and Evolution 46 (3): 307 314 (2008) (formerly Acta Phytotaxonomica Sinica) doi: 10.3724/SP.J.1002.2008.08040 http://www.plantsystematics.com Missing data and the accuracy of Bayesian

More information

Name Class Date. binomial nomenclature. MAIN IDEA: Linnaeus developed the scientific naming system still used today.

Name Class Date. binomial nomenclature. MAIN IDEA: Linnaeus developed the scientific naming system still used today. Section 1: The Linnaean System of Classification 17.1 Reading Guide KEY CONCEPT Organisms can be classified based on physical similarities. VOCABULARY taxonomy taxon binomial nomenclature genus MAIN IDEA:

More information

A Non-Linear Schema Theorem for Genetic Algorithms

A Non-Linear Schema Theorem for Genetic Algorithms A Non-Linear Schema Theorem for Genetic Algorithms William A Greene Computer Science Department University of New Orleans New Orleans, LA 70148 bill@csunoedu 504-280-6755 Abstract We generalize Holland

More information

Molecular Clocks and Tree Dating with r8s and BEAST

Molecular Clocks and Tree Dating with r8s and BEAST Integrative Biology 200B University of California, Berkeley Principals of Phylogenetics: Ecology and Evolution Spring 2011 Updated by Nick Matzke Molecular Clocks and Tree Dating with r8s and BEAST Today

More information

Genome Explorer For Comparative Genome Analysis

Genome Explorer For Comparative Genome Analysis Genome Explorer For Comparative Genome Analysis Jenn Conn 1, Jo L. Dicks 1 and Ian N. Roberts 2 Abstract Genome Explorer brings together the tools required to build and compare phylogenies from both sequence

More information

A Step-by-Step Tutorial: Divergence Time Estimation with Approximate Likelihood Calculation Using MCMCTREE in PAML

A Step-by-Step Tutorial: Divergence Time Estimation with Approximate Likelihood Calculation Using MCMCTREE in PAML 9 June 2011 A Step-by-Step Tutorial: Divergence Time Estimation with Approximate Likelihood Calculation Using MCMCTREE in PAML by Jun Inoue, Mario dos Reis, and Ziheng Yang In this tutorial we will analyze

More information

Towards running complex models on big data

Towards running complex models on big data Towards running complex models on big data Working with all the genomes in the world without changing the model (too much) Daniel Lawson Heilbronn Institute, University of Bristol 2013 1 / 17 Motivation

More information

Simulation and Risk Analysis

Simulation and Risk Analysis Simulation and Risk Analysis Using Analytic Solver Platform REVIEW BASED ON MANAGEMENT SCIENCE What We ll Cover Today Introduction Frontline Systems Session Ι Beta Training Program Goals Overview of Analytic

More information

DnaSP, DNA polymorphism analyses by the coalescent and other methods.

DnaSP, DNA polymorphism analyses by the coalescent and other methods. DnaSP, DNA polymorphism analyses by the coalescent and other methods. Author affiliation: Julio Rozas 1, *, Juan C. Sánchez-DelBarrio 2,3, Xavier Messeguer 2 and Ricardo Rozas 1 1 Departament de Genètica,

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

More information

Approximating the Coalescent with Recombination. Niall Cardin Corpus Christi College, University of Oxford April 2, 2007

Approximating the Coalescent with Recombination. Niall Cardin Corpus Christi College, University of Oxford April 2, 2007 Approximating the Coalescent with Recombination A Thesis submitted for the Degree of Doctor of Philosophy Niall Cardin Corpus Christi College, University of Oxford April 2, 2007 Approximating the Coalescent

More information

Protein Sequence Analysis - Overview -

Protein Sequence Analysis - Overview - Protein Sequence Analysis - Overview - UDEL Workshop Raja Mazumder Research Associate Professor, Department of Biochemistry and Molecular Biology Georgetown University Medical Center Topics Why do protein

More information

Likelihood: Frequentist vs Bayesian Reasoning

Likelihood: Frequentist vs Bayesian Reasoning "PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B University of California, Berkeley Spring 2009 N Hallinan Likelihood: Frequentist vs Bayesian Reasoning Stochastic odels and

More information

Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni

Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni 1 Web-based Supplementary Materials for Bayesian Effect Estimation Accounting for Adjustment Uncertainty by Chi Wang, Giovanni Parmigiani, and Francesca Dominici In Web Appendix A, we provide detailed

More information

Lab 2/Phylogenetics/September 16, 2002 1 PHYLOGENETICS

Lab 2/Phylogenetics/September 16, 2002 1 PHYLOGENETICS Lab 2/Phylogenetics/September 16, 2002 1 Read: Tudge Chapter 2 PHYLOGENETICS Objective of the Lab: To understand how DNA and protein sequence information can be used to make comparisons and assess evolutionary

More information

Methods for big data in medical genomics

Methods for big data in medical genomics Methods for big data in medical genomics Parallel Hidden Markov Models in Population Genetics Chris Holmes, (Peter Kecskemethy & Chris Gamble) Department of Statistics and, Nuffield Department of Medicine

More information

Activity IT S ALL RELATIVES The Role of DNA Evidence in Forensic Investigations

Activity IT S ALL RELATIVES The Role of DNA Evidence in Forensic Investigations Activity IT S ALL RELATIVES The Role of DNA Evidence in Forensic Investigations SCENARIO You have responded, as a result of a call from the police to the Coroner s Office, to the scene of the death of

More information

Sample Size Designs to Assess Controls

Sample Size Designs to Assess Controls Sample Size Designs to Assess Controls B. Ricky Rambharat, PhD, PStat Lead Statistician Office of the Comptroller of the Currency U.S. Department of the Treasury Washington, DC FCSM Research Conference

More information

PROC. CAIRO INTERNATIONAL BIOMEDICAL ENGINEERING CONFERENCE 2006 1. E-mail: msm_eng@k-space.org

PROC. CAIRO INTERNATIONAL BIOMEDICAL ENGINEERING CONFERENCE 2006 1. E-mail: msm_eng@k-space.org BIOINFTool: Bioinformatics and sequence data analysis in molecular biology using Matlab Mai S. Mabrouk 1, Marwa Hamdy 2, Marwa Mamdouh 2, Marwa Aboelfotoh 2,Yasser M. Kadah 2 1 Biomedical Engineering Department,

More information

Inference on Phase-type Models via MCMC

Inference on Phase-type Models via MCMC Inference on Phase-type Models via MCMC with application to networks of repairable redundant systems Louis JM Aslett and Simon P Wilson Trinity College Dublin 28 th June 202 Toy Example : Redundant Repairable

More information

PRINCIPLES OF POPULATION GENETICS

PRINCIPLES OF POPULATION GENETICS PRINCIPLES OF POPULATION GENETICS FOURTH EDITION Daniel L. Hartl Harvard University Andrew G. Clark Cornell University UniversitSts- und Landesbibliothek Darmstadt Bibliothek Biologie Sinauer Associates,

More information

Understanding by Design. Title: BIOLOGY/LAB. Established Goal(s) / Content Standard(s): Essential Question(s) Understanding(s):

Understanding by Design. Title: BIOLOGY/LAB. Established Goal(s) / Content Standard(s): Essential Question(s) Understanding(s): Understanding by Design Title: BIOLOGY/LAB Standard: EVOLUTION and BIODIVERSITY Grade(s):9/10/11/12 Established Goal(s) / Content Standard(s): 5. Evolution and Biodiversity Central Concepts: Evolution

More information

Gaussian Processes to Speed up Hamiltonian Monte Carlo

Gaussian Processes to Speed up Hamiltonian Monte Carlo Gaussian Processes to Speed up Hamiltonian Monte Carlo Matthieu Lê Murray, Iain http://videolectures.net/mlss09uk_murray_mcmc/ Rasmussen, Carl Edward. "Gaussian processes to speed up hybrid Monte Carlo

More information

The Story of Human Evolution Part 1: From ape-like ancestors to modern humans

The Story of Human Evolution Part 1: From ape-like ancestors to modern humans The Story of Human Evolution Part 1: From ape-like ancestors to modern humans Slide 1 The Story of Human Evolution This powerpoint presentation tells the story of who we are and where we came from - how

More information

Y Chromosome Markers

Y Chromosome Markers Y Chromosome Markers Lineage Markers Autosomal chromosomes recombine with each meiosis Y and Mitochondrial DNA does not This means that the Y and mtdna remains constant from generation to generation Except

More information

SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12

SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12 This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Statistics & Probability PhD Research. 15th November 2014

Statistics & Probability PhD Research. 15th November 2014 Statistics & Probability PhD Research 15th November 2014 1 Statistics Statistical research is the development and application of methods to infer underlying structure from data. Broad areas of statistics

More information

Bayesian Statistics: Indian Buffet Process

Bayesian Statistics: Indian Buffet Process Bayesian Statistics: Indian Buffet Process Ilker Yildirim Department of Brain and Cognitive Sciences University of Rochester Rochester, NY 14627 August 2012 Reference: Most of the material in this note

More information

Phylogenetic systematics turns over a new leaf

Phylogenetic systematics turns over a new leaf 30 Review Phylogenetic systematics turns over a new leaf Paul O. Lewis Long restricted to the domain of molecular systematics and studies of molecular evolution, likelihood methods are now being used in

More information

Learning outcomes. Knowledge and understanding. Competence and skills

Learning outcomes. Knowledge and understanding. Competence and skills Syllabus Master s Programme in Statistics and Data Mining 120 ECTS Credits Aim The rapid growth of databases provides scientists and business people with vast new resources. This programme meets the challenges

More information

An Introduction to Phylogenetics

An Introduction to Phylogenetics An Introduction to Phylogenetics Bret Larget larget@stat.wisc.edu Departments of Botany and of Statistics University of Wisconsin Madison February 4, 2008 1 / 70 Phylogenetics and Darwin A phylogeny is

More information

Comparison of frequentist and Bayesian inference. Class 20, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom

Comparison of frequentist and Bayesian inference. Class 20, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom Comparison of frequentist and Bayesian inference. Class 20, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom 1 Learning Goals 1. Be able to explain the difference between the p-value and a posterior

More information

Lecture/Recitation Topic SMA 5303 L1 Sampling and statistical distributions

Lecture/Recitation Topic SMA 5303 L1 Sampling and statistical distributions SMA 50: Statistical Learning and Data Mining in Bioinformatics (also listed as 5.077: Statistical Learning and Data Mining ()) Spring Term (Feb May 200) Faculty: Professor Roy Welsch Wed 0 Feb 7:00-8:0

More information

Final Project Report

Final Project Report CPSC545 by Introduction to Data Mining Prof. Martin Schultz & Prof. Mark Gerstein Student Name: Yu Kor Hugo Lam Student ID : 904907866 Due Date : May 7, 2007 Introduction Final Project Report Pseudogenes

More information

A Primer of Genome Science THIRD

A Primer of Genome Science THIRD A Primer of Genome Science THIRD EDITION GREG GIBSON-SPENCER V. MUSE North Carolina State University Sinauer Associates, Inc. Publishers Sunderland, Massachusetts USA Contents Preface xi 1 Genome Projects:

More information

An Application of Inverse Reinforcement Learning to Medical Records of Diabetes Treatment

An Application of Inverse Reinforcement Learning to Medical Records of Diabetes Treatment An Application of Inverse Reinforcement Learning to Medical Records of Diabetes Treatment Hideki Asoh 1, Masanori Shiro 1 Shotaro Akaho 1, Toshihiro Kamishima 1, Koiti Hasida 1, Eiji Aramaki 2, and Takahide

More information

AP Biology Essential Knowledge Student Diagnostic

AP Biology Essential Knowledge Student Diagnostic AP Biology Essential Knowledge Student Diagnostic Background The Essential Knowledge statements provided in the AP Biology Curriculum Framework are scientific claims describing phenomenon occurring in

More information

Finding Clusters in Phylogenetic Trees: A Special Type of Cluster Analysis

Finding Clusters in Phylogenetic Trees: A Special Type of Cluster Analysis Finding lusters in Phylogenetic Trees: Special Type of luster nalysis Why try to identify clusters in phylogenetic trees? xample: origin of HIV. NUMR: Why are there so many distinct clusters? LUR04-7 SYNHRONY:

More information

Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com

Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University caizhua@gmail.com 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian

More information

Data Integration. Lectures 16 & 17. ECS289A, WQ03, Filkov

Data Integration. Lectures 16 & 17. ECS289A, WQ03, Filkov Data Integration Lectures 16 & 17 Lectures Outline Goals for Data Integration Homogeneous data integration time series data (Filkov et al. 2002) Heterogeneous data integration microarray + sequence microarray

More information

Bayesian Statistics in One Hour. Patrick Lam

Bayesian Statistics in One Hour. Patrick Lam Bayesian Statistics in One Hour Patrick Lam Outline Introduction Bayesian Models Applications Missing Data Hierarchical Models Outline Introduction Bayesian Models Applications Missing Data Hierarchical

More information

A branch-and-bound algorithm for the inference of ancestral. amino-acid sequences when the replacement rate varies among

A branch-and-bound algorithm for the inference of ancestral. amino-acid sequences when the replacement rate varies among A branch-and-bound algorithm for the inference of ancestral amino-acid sequences when the replacement rate varies among sites Tal Pupko 1,*, Itsik Pe er 2, Masami Hasegawa 1, Dan Graur 3, and Nir Friedman

More information

Parallelization Strategies for Multicore Data Analysis

Parallelization Strategies for Multicore Data Analysis Parallelization Strategies for Multicore Data Analysis Wei-Chen Chen 1 Russell Zaretzki 2 1 University of Tennessee, Dept of EEB 2 University of Tennessee, Dept. Statistics, Operations, and Management

More information

High Throughput Network Analysis

High Throughput Network Analysis High Throughput Network Analysis Sumeet Agarwal 1,2, Gabriel Villar 1,2,3, and Nick S Jones 2,4,5 1 Systems Biology Doctoral Training Centre, University of Oxford, Oxford OX1 3QD, United Kingdom 2 Department

More information

14.10.2014. Overview. Swarms in nature. Fish, birds, ants, termites, Introduction to swarm intelligence principles Particle Swarm Optimization (PSO)

14.10.2014. Overview. Swarms in nature. Fish, birds, ants, termites, Introduction to swarm intelligence principles Particle Swarm Optimization (PSO) Overview Kyrre Glette kyrrehg@ifi INF3490 Swarm Intelligence Particle Swarm Optimization Introduction to swarm intelligence principles Particle Swarm Optimization (PSO) 3 Swarms in nature Fish, birds,

More information

Using simulation to calculate the NPV of a project

Using simulation to calculate the NPV of a project Using simulation to calculate the NPV of a project Marius Holtan Onward Inc. 5/31/2002 Monte Carlo simulation is fast becoming the technology of choice for evaluating and analyzing assets, be it pure financial

More information

Handling attrition and non-response in longitudinal data

Handling attrition and non-response in longitudinal data Longitudinal and Life Course Studies 2009 Volume 1 Issue 1 Pp 63-72 Handling attrition and non-response in longitudinal data Harvey Goldstein University of Bristol Correspondence. Professor H. Goldstein

More information

Section 3 Comparative Genomics and Phylogenetics

Section 3 Comparative Genomics and Phylogenetics Section 3 Section 3 Comparative enomics and Phylogenetics At the end of this section you should be able to: Describe what is meant by DNA sequencing. Explain what is meant by Bioinformatics and Comparative

More information

Hidden Markov Models in Bioinformatics. By Máthé Zoltán Kőrösi Zoltán 2006

Hidden Markov Models in Bioinformatics. By Máthé Zoltán Kőrösi Zoltán 2006 Hidden Markov Models in Bioinformatics By Máthé Zoltán Kőrösi Zoltán 2006 Outline Markov Chain HMM (Hidden Markov Model) Hidden Markov Models in Bioinformatics Gene Finding Gene Finding Model Viterbi algorithm

More information

Similarity Search and Mining in Uncertain Spatial and Spatio Temporal Databases. Andreas Züfle

Similarity Search and Mining in Uncertain Spatial and Spatio Temporal Databases. Andreas Züfle Similarity Search and Mining in Uncertain Spatial and Spatio Temporal Databases Andreas Züfle Geo Spatial Data Huge flood of geo spatial data Modern technology New user mentality Great research potential

More information

Core Bioinformatics. Degree Type Year Semester. 4313473 Bioinformàtica/Bioinformatics OB 0 1

Core Bioinformatics. Degree Type Year Semester. 4313473 Bioinformàtica/Bioinformatics OB 0 1 Core Bioinformatics 2014/2015 Code: 42397 ECTS Credits: 12 Degree Type Year Semester 4313473 Bioinformàtica/Bioinformatics OB 0 1 Contact Name: Sònia Casillas Viladerrams Email: Sonia.Casillas@uab.cat

More information

Network Tomography Based on to-end Measurements

Network Tomography Based on to-end Measurements Network Tomography Based on end-to to-end Measurements Francesco Lo Presti Dipartimento di Informatica - Università dell Aquila The First COST-IST(EU)-NSF(USA) Workshop on EXCHANGES & TRENDS IN NETWORKING

More information

Name: Date: Problem How do amino acid sequences provide evidence for evolution? Procedure Part A: Comparing Amino Acid Sequences

Name: Date: Problem How do amino acid sequences provide evidence for evolution? Procedure Part A: Comparing Amino Acid Sequences Name: Date: Amino Acid Sequences and Evolutionary Relationships Introduction Homologous structures those structures thought to have a common origin but not necessarily a common function provide some of

More information

Borges, J. L. 1998. On exactitude in science. P. 325, In, Jorge Luis Borges, Collected Fictions (Trans. Hurley, H.) Penguin Books.

Borges, J. L. 1998. On exactitude in science. P. 325, In, Jorge Luis Borges, Collected Fictions (Trans. Hurley, H.) Penguin Books. ... In that Empire, the Art of Cartography attained such Perfection that the map of a single Province occupied the entirety of a City, and the map of the Empire, the entirety of a Province. In time, those

More information

The Basics of Graphical Models

The Basics of Graphical Models The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures

More information

1 Prior Probability and Posterior Probability

1 Prior Probability and Posterior Probability Math 541: Statistical Theory II Bayesian Approach to Parameter Estimation Lecturer: Songfeng Zheng 1 Prior Probability and Posterior Probability Consider now a problem of statistical inference in which

More information

Principles of Data Mining by Hand&Mannila&Smyth

Principles of Data Mining by Hand&Mannila&Smyth Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences

More information

MATCH Commun. Math. Comput. Chem. 61 (2009) 781-788

MATCH Commun. Math. Comput. Chem. 61 (2009) 781-788 MATCH Communications in Mathematical and in Computer Chemistry MATCH Commun. Math. Comput. Chem. 61 (2009) 781-788 ISSN 0340-6253 Three distances for rapid similarity analysis of DNA sequences Wei Chen,

More information

Objections to Bayesian statistics

Objections to Bayesian statistics Bayesian Analysis (2008) 3, Number 3, pp. 445 450 Objections to Bayesian statistics Andrew Gelman Abstract. Bayesian inference is one of the more controversial approaches to statistics. The fundamental

More information

A Bootstrap Metropolis-Hastings Algorithm for Bayesian Analysis of Big Data

A Bootstrap Metropolis-Hastings Algorithm for Bayesian Analysis of Big Data A Bootstrap Metropolis-Hastings Algorithm for Bayesian Analysis of Big Data Faming Liang University of Florida August 9, 2015 Abstract MCMC methods have proven to be a very powerful tool for analyzing

More information

Bayesian Tutorial (Sheet Updated 20 March)

Bayesian Tutorial (Sheet Updated 20 March) Bayesian Tutorial (Sheet Updated 20 March) Practice Questions (for discussing in Class) Week starting 21 March 2016 1. What is the probability that the total of two dice will be greater than 8, given that

More information

Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.

Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question. Name: Class: Date: Chapter 17 Practice Multiple Choice Identify the choice that best completes the statement or answers the question. 1. The correct order for the levels of Linnaeus's classification system,

More information

Online Model-Based Clustering for Crisis Identification in Distributed Computing

Online Model-Based Clustering for Crisis Identification in Distributed Computing Online Model-Based Clustering for Crisis Identification in Distributed Computing Dawn Woodard School of Operations Research and Information Engineering & Dept. of Statistical Science, Cornell University

More information

SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis

SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October 17, 2015 Outline

More information

INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS

INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS STEVEN P. LALLEY AND ANDREW NOBEL Abstract. It is shown that there are no consistent decision rules for the hypothesis testing problem

More information

Heuristics for the Sorting by Length-Weighted Inversions Problem on Signed Permutations

Heuristics for the Sorting by Length-Weighted Inversions Problem on Signed Permutations Heuristics for the Sorting by Length-Weighted Inversions Problem on Signed Permutations AlCoB 2014 First International Conference on Algorithms for Computational Biology Thiago da Silva Arruda Institute

More information

Full and Complete Binary Trees

Full and Complete Binary Trees Full and Complete Binary Trees Binary Tree Theorems 1 Here are two important types of binary trees. Note that the definitions, while similar, are logically independent. Definition: a binary tree T is full

More information

PERFORMANCE ENHANCEMENTS IN TreeAge Pro 2014 R1.0

PERFORMANCE ENHANCEMENTS IN TreeAge Pro 2014 R1.0 PERFORMANCE ENHANCEMENTS IN TreeAge Pro 2014 R1.0 15 th January 2014 Al Chrosny Director, Software Engineering TreeAge Software, Inc. achrosny@treeage.com Andrew Munzer Director, Training and Customer

More information

Data Partitions and Complex Models in Bayesian Analysis: The Phylogeny of Gymnophthalmid Lizards

Data Partitions and Complex Models in Bayesian Analysis: The Phylogeny of Gymnophthalmid Lizards Syst. Biol. 53(3):448 469, 2004 Copyright c Society of Systematic Biologists ISSN: 1063-5157 print / 1076-836X online DOI: 10.1080/10635150490445797 Data Partitions and Complex Models in Bayesian Analysis:

More information

6 Creating the Animation

6 Creating the Animation 6 Creating the Animation Now that the animation can be represented, stored, and played back, all that is left to do is understand how it is created. This is where we will use genetic algorithms, and this

More information

A Bayesian Antidote Against Strategy Sprawl

A Bayesian Antidote Against Strategy Sprawl A Bayesian Antidote Against Strategy Sprawl Benjamin Scheibehenne (benjamin.scheibehenne@unibas.ch) University of Basel, Missionsstrasse 62a 4055 Basel, Switzerland & Jörg Rieskamp (joerg.rieskamp@unibas.ch)

More information

PHYLOGENY AND COMPARATIVE METHODS SYMBIOMICS WORKSHOP

PHYLOGENY AND COMPARATIVE METHODS SYMBIOMICS WORKSHOP PHYLOGENY AND COMPARATIVE METHODS SYMBIOMICS WORKSHOP March 4-7, 2013 Valencia, Spain Parc Cientific of the University of Valencia Goals The aim of this workshop is to provide the attendees with a broad

More information

Lecture 10: Regression Trees

Lecture 10: Regression Trees Lecture 10: Regression Trees 36-350: Data Mining October 11, 2006 Reading: Textbook, sections 5.2 and 10.5. The next three lectures are going to be about a particular kind of nonlinear predictive model,

More information

MCMC-Based Assessment of the Error Characteristics of a Surface-Based Combined Radar - Passive Microwave Cloud Property Retrieval

MCMC-Based Assessment of the Error Characteristics of a Surface-Based Combined Radar - Passive Microwave Cloud Property Retrieval MCMC-Based Assessment of the Error Characteristics of a Surface-Based Combined Radar - Passive Microwave Cloud Property Retrieval Derek J. Posselt University of Michigan Jay G. Mace University of Utah

More information

Worksheet - COMPARATIVE MAPPING 1

Worksheet - COMPARATIVE MAPPING 1 Worksheet - COMPARATIVE MAPPING 1 The arrangement of genes and other DNA markers is compared between species in Comparative genome mapping. As early as 1915, the geneticist J.B.S Haldane reported that

More information

Book Review of Rosenhouse, The Monty Hall Problem. Leslie Burkholder 1

Book Review of Rosenhouse, The Monty Hall Problem. Leslie Burkholder 1 Book Review of Rosenhouse, The Monty Hall Problem Leslie Burkholder 1 The Monty Hall Problem, Jason Rosenhouse, New York, Oxford University Press, 2009, xii, 195 pp, US $24.95, ISBN 978-0-19-5#6789-8 (Source

More information

Bayesian Statistical Analysis in Medical Research

Bayesian Statistical Analysis in Medical Research Bayesian Statistical Analysis in Medical Research David Draper Department of Applied Mathematics and Statistics University of California, Santa Cruz draper@ams.ucsc.edu www.ams.ucsc.edu/ draper ROLE Steering

More information

Genetics Lecture Notes 7.03 2005. Lectures 1 2

Genetics Lecture Notes 7.03 2005. Lectures 1 2 Genetics Lecture Notes 7.03 2005 Lectures 1 2 Lecture 1 We will begin this course with the question: What is a gene? This question will take us four lectures to answer because there are actually several

More information

Principles of Evolution - Origin of Species

Principles of Evolution - Origin of Species Theories of Organic Evolution X Multiple Centers of Creation (de Buffon) developed the concept of "centers of creation throughout the world organisms had arisen, which other species had evolved from X

More information

Matthew Kaplan and Taylor Edwards. University of Arizona Tucson, Arizona

Matthew Kaplan and Taylor Edwards. University of Arizona Tucson, Arizona Matthew Kaplan and Taylor Edwards University of Arizona Tucson, Arizona Unresolved paternity Consent for testing Ownership of Samples Uncovering genetic disorders SRY Reversal / Klinefelter s Syndrome

More information

An experimental study comparing linguistic phylogenetic reconstruction methods *

An experimental study comparing linguistic phylogenetic reconstruction methods * An experimental study comparing linguistic phylogenetic reconstruction methods * François Barbançon, a Steven N. Evans, b Luay Nakhleh c, Don Ringe, d and Tandy Warnow, e, a Palantir Technologies, 100

More information

Statistics in Applications III. Distribution Theory and Inference

Statistics in Applications III. Distribution Theory and Inference 2.2 Master of Science Degrees The Department of Statistics at FSU offers three different options for an MS degree. 1. The applied statistics degree is for a student preparing for a career as an applied

More information

Protein Protein Interaction Networks

Protein Protein Interaction Networks Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics

More information