Contents. List of contributors. Preface. Basic concepts of molecular evolution
The Phylogenetic Handbook (2009). Digitized by: IDS Basel Bern

List of contributors
Foreword
Preface

Section I: Introduction

1 Basic concepts of molecular evolution (Anne-Mieke Vandamme)
    1.1 Genetic information
    Population dynamics
    Evolution and speciation
    Data used for molecular phylogenetics
    What is a phylogenetic tree?
    Methods for inferring phylogenetic trees
    Is evolution always tree-like?

Section II: Data preparation

2 Sequence databases and database searching
  Theory (Guy Bottu)
    2.1 Introduction
    Sequence databases
    General nucleic acid sequence databases
    General protein sequence databases
    Specialized sequence databases, reference databases, and genome databases
    Composite databases, database mirroring, and search tools
    Entrez
    Sequence Retrieval System (SRS)
    Some general considerations about database searching by keyword
    Database searching by sequence similarity
    Optimal alignment
    Basic Local Alignment Search Tool (BLAST)
    FASTA
    Other tools and some general considerations
  Practice (Marc Van Ranst and Philippe Lemey)
    2.5 Database searching using ENTREZ
    BLAST
    FASTA

3 Multiple sequence alignment
  Theory (Des Higgins and Philippe Lemey)
    Introduction
    The problem of repeats
    The problem of substitutions
    The problem of gaps
    Pairwise sequence alignment
    Dot-matrix sequence comparison
    Dynamic programming
    Multiple alignment algorithms
    Progressive alignment
    Consistency-based scoring
    Iterative refinement methods
    Genetic algorithms
    Hidden Markov models
    Other algorithms
    Testing multiple alignment methods
    Which program to choose?
    Nucleotide sequences vs. amino acid sequences
    3.10 Visualizing alignments and manual editing
  Practice (Des Higgins and Philippe Lemey)
    3.11 CLUSTAL alignment
    File formats and availability
    Aligning the primate Trim5α amino acid sequences
    3.12 T-COFFEE alignment
    MUSCLE alignment
    Comparing alignments using the ALTAVIST web tool
    From protein to nucleotide alignment
    Editing and viewing multiple alignments
    Databases of alignments

Section III: Phylogenetic inference

4 Genetic distances and nucleotide substitution models
  Theory (Korbinian Strimmer and Arndt von Haeseler)
    4.1 Introduction
    Observed and expected distances
    Number of mutations in a given time interval *(optional)
    Nucleotide substitutions as a homogeneous Markov process
    The Jukes and Cantor (JC69) model
    Derivation of Markov process *(optional)
    Inferring the expected distances
    Nucleotide substitution models
    Rate heterogeneity among sites
  Practice (Marco Salemi)
    4.7 Software packages
    Observed vs. estimated genetic distances: the JC69 model
    Kimura 2-parameters (K80) and F84 genetic distances
    More complex models
    Modeling rate heterogeneity among sites
    Estimating standard errors using MEGA
    The problem of substitution saturation
    Choosing among different evolutionary models

5 Phylogenetic inference based on distance methods
  Theory (Yves Van de Peer)
    5.1 Introduction
    Tree-inference methods based on genetic distances
    Cluster analysis (UPGMA and WPGMA)
    Minimum evolution and neighbor-joining
    Other distance methods
    5.3 Evaluating the reliability of inferred trees
    Bootstrap analysis
    Jackknifing
    Conclusions
  Practice (Marco Salemi)
    Programs to display and manipulate phylogenetic trees
    Distance-based phylogenetic inference in PHYLIP
    Inferring a neighbor-joining tree for the primates data set
    Outgroup rooting
    Inferring a Fitch-Margoliash tree for the mtDNA data set
    Bootstrap analysis using PHYLIP
    Impact of genetic distances on tree topology: an example using MEGA
    Other programs

6 Phylogenetic inference using maximum likelihood methods
  Theory (Heiko A. Schmidt and Arndt von Haeseler)
    Introduction
    The formal framework
    The simple case: maximum-likelihood tree for two sequences
    The complex case
    Computing the probability of an alignment for a fixed tree
    Felsenstein's pruning algorithm
    Finding a maximum-likelihood tree
    Early heuristics
    Full-tree rearrangement
    DNAML and FASTDNAML
    PHYML and PHYML-SPR
    IQPNNI
    RAxML
    Simulated annealing
    Genetic algorithms
    Branch support
    The quartet puzzling algorithm
    Parameter estimation
    ML step
    Puzzling step
    Consensus step
    Likelihood-mapping analysis
  Practice (Heiko A. Schmidt and Arndt von Haeseler)
    6.8 Software packages
    An illustrative example of an ML tree reconstruction
    Reconstructing an ML tree with IQPNNI
    Getting a tree with branch support values using quartet puzzling
    Likelihood-mapping analysis of the HIV data set
    Conclusions

7 Bayesian phylogenetic analysis using MRBAYES
  Theory (Fredrik Ronquist, Paul van der Mark, and John P. Huelsenbeck)
    7.1 Introduction
    Bayesian phylogenetic inference
    Markov chain Monte Carlo sampling
    Burn-in, mixing and convergence
    Metropolis coupling
    Summarizing the results
    An introduction to phylogenetic models
    Bayesian model choice and model averaging
    Prior probability distributions
  Practice (Fredrik Ronquist, Paul van der Mark, and John P. Huelsenbeck)
    7.10 Introduction to MRBAYES
    Acquiring and installing the program
    Getting started
    Changing the size of the MRBAYES window
    Getting help
    A simple analysis
    Quick start version
    Getting data into MRBAYES
    Specifying a model
    Setting the priors
    Checking the model
    Setting up the analysis
    Running the analysis
    When to stop the analysis
    Summarizing samples of substitution model parameters
    Summarizing samples of trees and branch lengths
    7.12 Analyzing a partitioned data set
    Getting mixed data into MRBAYES
    Dividing the data into partitions
    Specifying a partitioned model
    Running the analysis
    Some practical advice

8 Phylogeny inference based on parsimony and other methods using PAUP*
  Theory (David L. Swofford and Jack Sullivan)
    8.1 Introduction
    Parsimony analysis - background
    Parsimony analysis - methodology
    Calculating the length of a given tree under the parsimony criterion
    Searching for optimal trees
    Exact methods
    Approximate methods
  Practice (David L. Swofford and Jack Sullivan)
    8.5 Analyzing data with PAUP* through the command-line interface
    Basic parsimony analysis and tree-searching
    Analysis using distance methods
    Analysis using maximum likelihood methods

9 Phylogenetic analysis using protein sequences
  Theory (Fred R. Opperdoes)
    9.1 Introduction
    Protein evolution
    Why analyze protein sequences?
    The genetic code and codon bias
    Look-back time
    Nature of sequence divergence in proteins (the PAM unit)
    Introns and non-coding DNA
    Choosing DNA or protein?
    Construction of phylogenetic trees
    Preparation of the data set
    Tree-building
  Practice (Fred R. Opperdoes and Philippe Lemey)
    9.4 A phylogenetic analysis of the Leishmanial glyceraldehyde-3-phosphate dehydrogenase gene carried out via the Internet
    A phylogenetic analysis of trypanosomatid glyceraldehyde-3-phosphate dehydrogenase protein sequences using Bayesian inference

Section IV: Testing models and trees

10 Selecting models of evolution
  Theory (David Posada)
    Models of evolution and phylogeny reconstruction
    Model fit
    Hierarchical likelihood ratio tests (hLRTs)
    Potential problems with the hLRTs
    Information criteria
    Bayesian approaches
    Performance-based selection
    Model selection uncertainty
    Model averaging
  Practice (David Posada)
    The model selection procedure
    MODELTEST
    PROTTEST
    Selecting the best-fit model in the example data sets
    Vertebrate mtDNA
    HIV-1 envelope gene
    G3PDH protein

11 Molecular clock analysis
  Theory (Philippe Lemey and David Posada)
    11.1 Introduction
    The relative rate test
    11.3 Likelihood ratio test of the global molecular clock
    Dated tips
    Relaxing the molecular clock
    Discussion and future directions
  Practice (Philippe Lemey and David Posada)
    11.7 Molecular clock analysis using PAML
    Analysis of the primate sequences
    Analysis of the viral sequences

12 Testing tree topologies
  Theory (Heiko A. Schmidt)
    12.1 Introduction
    Some definitions for distributions and testing
    Likelihood ratio tests for nested models
    How to get the distribution of likelihood ratios
    Non-parametric bootstrap
    Parametric bootstrap
    Testing tree topologies
    Tree tests - a general structure
    The original Kishino-Hasegawa (KH) test
    One-sided Kishino-Hasegawa test
    Shimodaira-Hasegawa (SH) test
    Weighted test variants
    The approximately unbiased test
    Swofford-Olsen-Waddell-Hillis (SOWH) test
    Confidence sets based on likelihood weights
    Conclusions
  Practice (Heiko A. Schmidt)
    12.8 Software packages
    Testing a set of trees with TREE-PUZZLE and CONSEL
    Testing and obtaining site-likelihood with TREE-PUZZLE
    Testing with CONSEL
    Conclusions
Section V: Molecular adaptation

13 Natural selection and adaptation of molecular sequences (Oliver G. Pybus and Beth Shapiro)
    13.1 Basic concepts
    The molecular footprint of selection
    Summary statistic methods
    dN/dS methods
    Codon volatility
    Conclusion

14 Estimating selection pressures on alignments of coding sequences
  Theory (Sergei L. Kosakovsky Pond, Art F. Y. Poon, and Simon D. W. Frost)
    14.1 Introduction
    Prerequisites
    Codon substitution models
    Simulated data: how and why?
    Statistical estimation procedures
    Distance-based approaches
    Maximum likelihood approaches
    Estimating dS and dN
    Correcting for nucleotide substitution biases
    Bayesian approaches
    Estimating branch-by-branch variation in rates
    Local vs. global model
    Specifying branches a priori
    Data-driven branch selection
    Estimating site-by-site variation in rates
    Random effects likelihood (REL)
    Fixed effects likelihood (FEL)
    Counting methods
    Which method to use?
    The importance of synonymous rate variation
    Comparing rates at a site in different branches
    Discussion and further directions
  Practice (Sergei L. Kosakovsky Pond, Art F. Y. Poon, and Simon D. W. Frost)
    Software for estimating selection
    PAML
    ADAPTSITE
    MEGA
    HYPHY
    DATAMONKEY
    Influenza A as a case study
    Prerequisites
    Getting acquainted with HYPHY
    Importing alignments and trees
    Previewing sequences in HYPHY
    Previewing trees in HYPHY
    Making an alignment
    Estimating a tree
    Estimating nucleotide biases
    Detecting recombination
    Estimating global rates
    Fitting a global model in the HYPHY GUI
    Fitting a global model with a HYPHY batch file
    Estimating branch-by-branch variation in rates
    Fitting a local codon model in HYPHY
    Interclade variation in substitution rates
    Comparing internal and terminal branches
    Estimating site-by-site variation in rates
    Preliminary analysis set-up
    Estimating filot Ml
    Single-likelihood ancestor counting (SLAC)
    Fixed effects likelihood (FEL)
    REL methods in HYPHY
    Estimating gene-by-gene variation in rates
    Comparing selection in different populations
    Comparing selection between different genes
    Automating choices for HYPHY analyses
    Simulations
    Summary of standard analyses
    Discussion

Section VI: Recombination

15 Introduction to recombination detection (Philippe Lemey and David Posada)
    15.1 Introduction
    Mechanisms of recombination
    15.3 Linkage disequilibrium, substitution patterns, and evolutionary inference
    Evolutionary implications of recombination
    Impact on phylogenetic analyses
    Recombination analysis as a multifaceted discipline
    Detecting recombination
    Recombinant identification and breakpoint detection
    Recombination rate
    Overview of recombination detection tools
    Performance of recombination detection tools

16 Detecting and characterizing individual recombination events
  Theory (Mika Salminen and Darren Martin)
    16.1 Introduction
    Requirements for detecting recombination
    Theoretical basis for recombination detection methods
    Identifying and characterizing actual recombination events
  Practice (Mika Salminen and Darren Martin)
    16.5 Existing tools for recombination analysis
    Analyzing example sequences to detect and characterize individual recombination events
    Exercise 1: Working with SIMPLOT
    Exercise 2: Mapping recombination with SIMPLOT
    Exercise 3: Using the "groups" feature of SIMPLOT
    Exercise 4: Setting up RDP3 to do an exploratory analysis
    Exercise 5: Doing a simple exploratory analysis with RDP
    Exercise 6: Using RDP3 to refine a recombination hypothesis

Section VII: Population genetics

17 The coalescent: population genetic inference using genealogies (Allen Rodrigo)
    17.1 Introduction
    The Kingman coalescent
    Effective population size
    17.4 The mutation clock
    Demographic history and the coalescent
    Coalescent-based inference
    The serial coalescent
    Advanced topics

18 Bayesian evolutionary analysis by sampling trees
  Theory (Alexei J. Drummond and Andrew Rambaut)
    18.1 Background
    Bayesian MCMC for genealogy-based population genetics
    Implementation
    Input format
    Output and results
    Computational performance
    Results and discussion
    Substitution models and rate models among sites
    Rate models among branches, divergence time estimation, and time-stamped data
    Tree priors
    Multiple data partitions and linking and unlinking parameters
    Definitions and units of the standard parameters and variables
    Model comparison
    Conclusions
  Practice (Alexei J. Drummond and Andrew Rambaut)
    18.4 The BEAST software package
    Running BEAUTI
    Loading the NEXUS file
    Setting the dates of the taxa
    Translating the data in amino acid sequences
    Setting the evolutionary model
    Setting up the operators
    Setting the MCMC options
    Running BEAST
    Analyzing the BEAST output
    Summarizing the trees
    Viewing the annotated tree
    Conclusion and resources
19 LAMARC: Estimating population genetic parameters from molecular data
  Theory (Mary K. Kuhner)
    19.1 Introduction
    Basis of the Metropolis-Hastings MCMC sampler
    Bayesian vs. likelihood sampling
    Random sample
    Stability
    No other forces
    Evolutionary model
    Large population relative to sample
    Adequate run time
  Practice (Mary K. Kuhner)
    19.3 The LAMARC software package
    FLUCTUATE (COALESCE)
    MIGRATE-N
    RECOMBINE
    LAMARC
    Starting values
    Space and time
    Sample size considerations
    Virus-specific issues
    Multiple loci
    Rapid growth rates
    Sequential samples
    An exercise with LAMARC
    Converting data using the LAMARC file converter
    Estimating the population parameters
    Analyzing the output
    Conclusions

Section VIII: Additional topics

20 Assessing substitution saturation with DAMBE
  Theory (Xuhua Xia)
    20.1 The problem of substitution saturation
    Steel's method: potential problem, limitation, and implementation in DAMBE
    20.3 Xia's method: its problem, limitation, and implementation in DAMBE
  Practice (Xuhua Xia and Philippe Lemey)
    20.4 Working with the VertebrateMtCOI.FAS file
    Working with the InvertebrateEF1a.fas file
    Working with the SIV.FAS file

21 Split networks. A tool for exploring complex evolutionary relationships in molecular data
  Theory (Vincent Moulton and Katharina T. Huber)
    21.1 Understanding evolutionary relationships through networks
    An introduction to split decomposition theory
    The Buneman tree
    Split decomposition
    From weakly compatible splits to networks
    Alternative ways to compute split networks
    NeighborNet
    Median networks
    Consensus networks and supernetworks
  Practice (Vincent Moulton and Katharina T. Huber)
    21.5 The SPLITSTREE program
    Introduction
    Downloading SPLITSTREE
    Using SPLITSTREE on the mtDNA data set
    Getting started
    The fit index
    Laying out split networks
    Recomputing split networks
    Computing trees
    Computing different networks
    Bootstrapping
    Printing
    Using SPLITSTREE on other data sets

Glossary
References
Index
Phylogenetic Trees Made Easy
Phylogenetic Trees Made Easy A How-To Manual Fourth Edition Barry G. Hall University of Rochester, Emeritus and Bellingham Research Institute Sinauer Associates, Inc. Publishers Sunderland, Massachusetts
More informationBayesian Phylogeny and Measures of Branch Support
Bayesian Phylogeny and Measures of Branch Support Bayesian Statistics Imagine we have a bag containing 100 dice of which we know that 90 are fair and 10 are biased. The
More informationBayesian coalescent inference of population size history
Bayesian coalescent inference of population size history Alexei Drummond University of Auckland Workshop on Population and Speciation Genomics, 2016 1st February 2016 1 / 39 BEAST tutorials Population
More informationPHYML Online: A Web Server for Fast Maximum Likelihood-Based Phylogenetic Inference
PHYML Online: A Web Server for Fast Maximum Likelihood-Based Phylogenetic Inference Stephane Guindon, F. Le Thiec, Patrice Duroux, Olivier Gascuel To cite this version: Stephane Guindon, F. Le Thiec, Patrice
More informationDnaSP, DNA polymorphism analyses by the coalescent and other methods.
DnaSP, DNA polymorphism analyses by the coalescent and other methods. Author affiliation: Julio Rozas 1, *, Juan C. Sánchez-DelBarrio 2,3, Xavier Messeguer 2 and Ricardo Rozas 1 1 Departament de Genètica,
More informationIntroduction to Phylogenetic Analysis
Subjects of this lecture Introduction to Phylogenetic nalysis Irit Orr 1 Introducing some of the terminology of phylogenetics. 2 Introducing some of the most commonly used methods for phylogenetic analysis.
More informationA comparison of methods for estimating the transition:transversion ratio from DNA sequences
Molecular Phylogenetics and Evolution 32 (2004) 495 503 MOLECULAR PHYLOGENETICS AND EVOLUTION www.elsevier.com/locate/ympev A comparison of methods for estimating the transition:transversion ratio from
More informationBio-Informatics Lectures. A Short Introduction
Bio-Informatics Lectures A Short Introduction The History of Bioinformatics Sanger Sequencing PCR in presence of fluorescent, chain-terminating dideoxynucleotides Massively Parallel Sequencing Massively
More informationEvaluating the Performance of a Successive-Approximations Approach to Parameter Optimization in Maximum-Likelihood Phylogeny Estimation
Evaluating the Performance of a Successive-Approximations Approach to Parameter Optimization in Maximum-Likelihood Phylogeny Estimation Jack Sullivan,* Zaid Abdo, à Paul Joyce, à and David L. Swofford
More informationPROC. CAIRO INTERNATIONAL BIOMEDICAL ENGINEERING CONFERENCE 2006 1. E-mail: msm_eng@k-space.org
BIOINFTool: Bioinformatics and sequence data analysis in molecular biology using Matlab Mai S. Mabrouk 1, Marwa Hamdy 2, Marwa Mamdouh 2, Marwa Aboelfotoh 2,Yasser M. Kadah 2 1 Biomedical Engineering Department,
More informationArbres formels et Arbre(s) de la Vie
Arbres formels et Arbre(s) de la Vie A bit of history and biology Definitions Numbers Topological distances Consensus Random models Algorithms to build trees Basic principles DATA sequence alignment distance
More informationA Primer of Genome Science THIRD
A Primer of Genome Science THIRD EDITION GREG GIBSON-SPENCER V. MUSE North Carolina State University Sinauer Associates, Inc. Publishers Sunderland, Massachusetts USA Contents Preface xi 1 Genome Projects:
More informationDetecting signatures of selection from. DNA sequences using Datamonkey
Detecting signatures of selection from DNA sequences using Datamonkey Art F.Y. Poon, Simon D.W. Frost and Sergei L. Kosakovsky Pond* Antiviral Research Center, Department of Pathology, University of California
More informationHierarchical Bayesian Modeling of the HIV Response to Therapy
Hierarchical Bayesian Modeling of the HIV Response to Therapy Shane T. Jensen Department of Statistics, The Wharton School, University of Pennsylvania March 23, 2010 Joint Work with Alex Braunstein and
More informationIntroduction to Bioinformatics AS 250.265 Laboratory Assignment 6
Introduction to Bioinformatics AS 250.265 Laboratory Assignment 6 In the last lab, you learned how to perform basic multiple sequence alignments. While useful in themselves for determining conserved residues
More informationA Rough Guide to BEAST 1.4
A Rough Guide to BEAST 1.4 Alexei J. Drummond 1, Simon Y.W. Ho, Nic Rawlence and Andrew Rambaut 2 1 Department of Computer Science The University of Auckland, Private Bag 92019 Auckland, New Zealand alexei@cs.auckland.ac.nz
More informationBIO 3350: ELEMENTS OF BIOINFORMATICS PARTIALLY ONLINE SYLLABUS
BIO 3350: ELEMENTS OF BIOINFORMATICS PARTIALLY ONLINE SYLLABUS NEW YORK CITY COLLEGE OF TECHNOLOGY The City University Of New York School of Arts and Sciences Biological Sciences Department Course title:
More informationGenome Explorer For Comparative Genome Analysis
Genome Explorer For Comparative Genome Analysis Jenn Conn 1, Jo L. Dicks 1 and Ian N. Roberts 2 Abstract Genome Explorer brings together the tools required to build and compare phylogenies from both sequence
More informationDNA Sequence Alignment Analysis
Analysis of DNA sequence data p. 1 Analysis of DNA sequence data using MEGA and DNAsp. Analysis of two genes from the X and Y chromosomes of plant species from the genus Silene The first two computer classes
More informationMolecular Clocks and Tree Dating with r8s and BEAST
Integrative Biology 200B University of California, Berkeley Principals of Phylogenetics: Ecology and Evolution Spring 2011 Updated by Nick Matzke Molecular Clocks and Tree Dating with r8s and BEAST Today
More informationA short guide to phylogeny reconstruction
A short guide to phylogeny reconstruction E. Michu Institute of Biophysics, Academy of Sciences of the Czech Republic, Brno, Czech Republic ABSTRACT This review is a short introduction to phylogenetic
More informationPHYLOGENETIC ANALYSIS
Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, Second Edition Andreas D. Baxevanis, B.F. Francis Ouellette Copyright 2001 John Wiley & Sons, Inc. ISBNs: 0-471-38390-2 (Hardback);
More informationUser Manual for SplitsTree4 V4.14.2
User Manual for SplitsTree4 V4.14.2 Daniel H. Huson and David Bryant November 4, 2015 Contents Contents 1 1 Introduction 4 2 Getting Started 5 3 Obtaining and Installing the Program 5 4 Program Overview
More informationA Step-by-Step Tutorial: Divergence Time Estimation with Approximate Likelihood Calculation Using MCMCTREE in PAML
9 June 2011 A Step-by-Step Tutorial: Divergence Time Estimation with Approximate Likelihood Calculation Using MCMCTREE in PAML by Jun Inoue, Mario dos Reis, and Ziheng Yang In this tutorial we will analyze
More informationBioinformatics Resources at a Glance
Bioinformatics Resources at a Glance A Note about FASTA Format There are MANY free bioinformatics tools available online. Bioinformaticists have developed a standard format for nucleotide and protein sequences
More informationPhylogenetic systematics turns over a new leaf
30 Review Phylogenetic systematics turns over a new leaf Paul O. Lewis Long restricted to the domain of molecular systematics and studies of molecular evolution, likelihood methods are now being used in
More informationCore Bioinformatics. Degree Type Year Semester. 4313473 Bioinformàtica/Bioinformatics OB 0 1
Core Bioinformatics 2014/2015 Code: 42397 ECTS Credits: 12 Degree Type Year Semester 4313473 Bioinformàtica/Bioinformatics OB 0 1 Contact Name: Sònia Casillas Viladerrams Email: Sonia.Casillas@uab.cat
More informationPhylogenetic Models of Rate Heterogeneity: A High Performance Computing Perspective
Phylogenetic Models of Rate Heterogeneity: A High Performance Computing Perspective Alexandros Stamatakis Institute of Computer Science, Foundation for Research and Technology-Hellas P.O. Box 1385, Heraklion,
More informationBioinformatics Grid - Enabled Tools For Biologists.
Bioinformatics Grid - Enabled Tools For Biologists. What is Grid-Enabled Tools (GET)? As number of data from the genomics and proteomics experiment increases. Problems arise for the current sequence analysis
More informationID of alternative translational initiation events. Description of gene function Reference of NCBI database access and relative literatures
Data resource: In this database, 650 alternatively translated variants assigned to a total of 300 genes are contained. These database records of alternative translational initiation have been collected
More informationjmodeltest 0.1.1 (April 2008) David Posada 2008 onwards
jmodeltest 0.1.1 (April 2008) David Posada 2008 onwards dposada@uvigo.es http://darwin.uvigo.es/ See the jmodeltest FORUM and FAQs at http://darwin.uvigo.es/ INDEX 1 1. DISCLAIMER 3 2. PURPOSE 3 3. CITATION
More informationMultiple Losses of Flight and Recent Speciation in Steamer Ducks Tara L. Fulton, Brandon Letts, and Beth Shapiro
Supplementary Material for: Multiple Losses of Flight and Recent Speciation in Steamer Ducks Tara L. Fulton, Brandon Letts, and Beth Shapiro 1. Supplementary Tables Supplementary Table S1. Sample information.
More informationDetection. Perspective. Network Anomaly. Bhattacharyya. Jugal. A Machine Learning »C) Dhruba Kumar. Kumar KaKta. CRC Press J Taylor & Francis Croup
Network Anomaly Detection A Machine Learning Perspective Dhruba Kumar Bhattacharyya Jugal Kumar KaKta»C) CRC Press J Taylor & Francis Croup Boca Raton London New York CRC Press is an imprint of the Taylor
More informationLecture/Recitation Topic SMA 5303 L1 Sampling and statistical distributions
SMA 50: Statistical Learning and Data Mining in Bioinformatics (also listed as 5.077: Statistical Learning and Data Mining ()) Spring Term (Feb May 200) Faculty: Professor Roy Welsch Wed 0 Feb 7:00-8:0
More informationData Partitions and Complex Models in Bayesian Analysis: The Phylogeny of Gymnophthalmid Lizards
Syst. Biol. 53(3):448 469, 2004 Copyright c Society of Systematic Biologists ISSN: 1063-5157 print / 1076-836X online DOI: 10.1080/10635150490445797 Data Partitions and Complex Models in Bayesian Analysis:
More informationPRINCIPLES OF POPULATION GENETICS
PRINCIPLES OF POPULATION GENETICS FOURTH EDITION Daniel L. Hartl Harvard University Andrew G. Clark Cornell University UniversitSts- und Landesbibliothek Darmstadt Bibliothek Biologie Sinauer Associates,
More informationRegression Modeling Strategies
Frank E. Harrell, Jr. Regression Modeling Strategies With Applications to Linear Models, Logistic Regression, and Survival Analysis With 141 Figures Springer Contents Preface Typographical Conventions
More informationStatistics Graduate Courses
Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
More informationModule 1. Sequence Formats and Retrieval. Charles Steward
The Open Door Workshop Module 1 Sequence Formats and Retrieval Charles Steward 1 Aims Acquaint you with different file formats and associated annotations. Introduce different nucleotide and protein databases.
More informationHeuristics for the Sorting by Length-Weighted Inversions Problem on Signed Permutations
Heuristics for the Sorting by Length-Weighted Inversions Problem on Signed Permutations AlCoB 2014 First International Conference on Algorithms for Computational Biology Thiago da Silva Arruda Institute
More informationInference of Large Phylogenetic Trees on Parallel Architectures. Michael Ott
Inference of Large Phylogenetic Trees on Parallel Architectures Michael Ott TECHNISCHE UNIVERSITÄT MÜNCHEN Lehrstuhl für Rechnertechnik und Rechnerorganisation / Parallelrechnerarchitektur Inference of
More informationProtein Sequence Analysis - Overview -
Protein Sequence Analysis - Overview - UDEL Workshop Raja Mazumder Research Associate Professor, Department of Biochemistry and Molecular Biology Georgetown University Medical Center Topics Why do protein
More informationContents. Dedication List of Figures List of Tables. Acknowledgments
Contents Dedication List of Figures List of Tables Foreword Preface Acknowledgments v xiii xvii xix xxi xxv Part I Concepts and Techniques 1. INTRODUCTION 3 1 The Quest for Knowledge 3 2 Problem Description
More informationPAML FAQ... 1 Table of Contents... 1. Data Files...3. Windows, UNIX, and MAC OS X basics...4 Common mistakes and pitfalls...5. Windows Essentials...
PAML FAQ Ziheng Yang Last updated: 5 January 2005 (not all items are up to date) Table of Contents PAML FAQ... 1 Table of Contents... 1 Data Files...3 Why don t paml programs read my files correctly?...3
More informationDivergence Time Estimation using BEAST v1.7.5
Divergence Time Estimation using BEAST v1.7.5 Central among the questions explored in biology are those that seek to understand the timing and rates of evolutionary processes. Accurate estimates of species
More informationALTER: program-oriented conversion of DNA and protein alignments
Nucleic Acids Research Advance Access published May 3, 2010 Nucleic Acids Research, 2010, 1 5 doi:10.1093/nar/gkq321 ALTER: program-oriented conversion of DNA and protein alignments Daniel Glez-Peña 1,
More informationFinding Clusters in Phylogenetic Trees: A Special Type of Cluster Analysis
Finding lusters in Phylogenetic Trees: Special Type of luster nalysis Why try to identify clusters in phylogenetic trees? xample: origin of HIV. NUMR: Why are there so many distinct clusters? LUR04-7 SYNHRONY:
More informationWorkflow Administration of Windchill 10.2
Workflow Administration of Windchill 10.2 Overview Course Code Course Length TRN-4339-T 2 Days In this course, you will learn about Windchill workflow features and how to design, configure, and test workflow
More informationSequence Analysis 15: lecture 5. Substitution matrices Multiple sequence alignment
Sequence Analysis 15: lecture 5 Substitution matrices Multiple sequence alignment A teacher's dilemma To understand... Multiple sequence alignment Substitution matrices Phylogenetic trees You first need
More informationREVIEWS. Computer programs for population genetics data analysis: a survival guide FOCUS ON STATISTICAL ANALYSIS
FOCUS ON STATISTICAL ANALYSIS REVIEWS Computer programs for population genetics data analysis: a survival guide Laurent Excoffier and Gerald Heckel Abstract The analysis of genetic diversity within species