Bayesian coalescent analysis

Size: px
Start display at page:

Download "Bayesian coalescent analysis"

Transcription

1 Bayesian coalescent analysis Alexei J Drummond Department of Computer Science, University of Auckland, Auckland, New Zealand August 4, 2011

2 Overview 1 Coalescent theory 2 Hepatitis C in Egypt 4 Phylodynamics 3 Bayesian skyline plot 5 Multi-species coalescent

3 The coalescent Data: a small genetic sample from a large background population. The coalescent is a model of the ancestral relationships of a sample of individuals taken from a larger population. describes a probability distribution on ancestral genealogies (trees) given a population history, N(t). Therefore the coalescent can convert information from ancestral genealogies into information about population history and vice versa. a model of ancestral genealogies, not sequences, and its simplest form assumes neutral evolution. can be thought of as a prior on the tree, in a Bayesian setting.

4 Theoretical population genetics Most of theoretical population genetics is based on the idealized Wright-Fisher model of population which assumes Constant population size N Discrete generations Complete mixing For the purposes of this presentation the population will be assumed to be haploid, as is the case for many pathogens.

5 Kingman s n-coalescent Consider tracing the ancestry of a sample of k individuals from the present, back into the past. This process eventually coalesces to a single common ancestor (concestor) of the sample of individuals. Kingman s n-coalescent describes the statistical properties of such an ancestry when k is small compared to the total population size N.

6 The coalescence of two ancestral lineages First, consider two random members from a population of fixed size N. By perfect mixing, the probability they share a concestor in the previous generation is 1/N. The probability the concestor is t generations back is Pr{t} = 1 N (1 1 N )t 1. It follows that g = t 1, has a geometric distribution with a success rate of λ = 1/N, and so has mean N and variance of N 3 /(N 1).

7 The coalescence of k lineages With k lineages the time to the first coalescence is derived in the same way, only now there are ( k 2) possible pairs that may coalesce, resulting in a success rate of λ = ( k 2) /N and mean time to first coalescence (t k ) of E[t k ] = N ( k 2). This implicitly assumes that N is much larger than O(k 2 ), so that the probability of two coalescent events in the same generation is small.

8 The coalescent likelihood for a genealogy For a genealogy with known coalescent times t = {t 2, t 3,..., t n } we can write the likelihood: ( f (t N) = 1 n k ) ) 2 tk N n 1 exp (. N k=2

9 Constant size Exponential growth The coalescent with serial samples Many epidemiological agents, like RNA viruses, evolve very rapidly, so that the effect of sampling the population at different times becomes important.

10 Bayesian integration of uncertainty in genealogies How similar are these two trees? Both of them are plausible given the data. We can use Bayesian Markov-chain Monte Carlo to average the coalescent over all plausible trees.

11 Hepatitis C in Egypt A case study of the coalescent approach to molecular epidemiology Hepatitis C (HCV) Identified in kb single-stranded RNA genome Polyprotein cleaved by proteases Tissue culture system only recently developed

12 How important is Hepatitis C? 170m+ infected 80% infections are chronic Liver cirrhosis and cancer risk 10,000 deaths per year in USA No protective immunity?

13 HCV Transmission By percutaneous exposure to infected blood Blood transfusion / blood products Injecting and nasal drug use Sexual and vertical transmission Unsafe injections Unidentified routes

14 Estimating demographic history of HCV using the coalescent Pybus et al (2003) Molecular Biology and Evolution Egyptian HCV gene sequences n=61 E1 gene, 411bp All sequence contemporaneous Egypt has highest prevalence of HCV worldwide (10-20%) But low prevalence in neighbouring states Why is Egypt so seriously affected? Parenteral antischistosomal therapy (PAT)?

15 Demographic model for Hepatitis C in Egypt The coalescent can be extended to model any integrable function of varying population size. The model we used was a const-exp-const model. A Bayesian MCMC method was developed to sample the gene genealogy, the substitution model and demographic function simultaneously. N c N(x) = N c e r(t x) N A if t x if x < t < y if t y

16 Estimated demographic history of Hepatitis C in Egypt

17 Uncertainty in parameter estimates Demographic parameters Mutational parameters Growth rate of the growth phase Grey box is the prior Rates at different codon positions, All significantly different

18 Full Bayesian Estimation Marginalized over uncertainty in genealogy and mutational processes Yellow band represents time over which PAT was employed in Egypt

19 Virus Phylodynamics

20 Skyline coalescent model

21 The generalized skyline plot Introduced by Pybus and Strimmer (2001). Visual framework for exploring the demographic history of sampled DNA sequences Input: a single estimated ancestral genealogy (a tree) Output: nonparametric plot of the population size through time Groups adjacent coalescent intervals Converts information within these intervals to estimates of population size for a single coalescent interval: ( ) k ˆN k = t k 2 for j adjacent intervals: ˆN k,j = k(k j) 2j k i=k j+1 t i

22 The generalized skyline plot - simulated data Constant population size, N(t) = N 0 Exponential growth, N(t) = N 0 e rt

23 The generalized skyline plot - HIV-1 group M The tree used here was estimated in Yusim et al (2001) Phil. Trans. Roy. Soc. Lond. B 356: The black curve is a parametric coalescent estimate obtained from the same data under the expansion model, N(t) = (N 0 N A )e rt + N A

24 The Bayesian skyline plot (Drummond et al, 2005) The Bayesian skyline plot estimates a demographic function that has a certain fixed number of steps (in this example 15) and then integrates over all possible positions of the break points, and population sizes within each epoch.

25 Extending the BSP with Stochastic Variable Selection Heled and Drummond (2008)

26 Comparison of EBSP to BSP on Egypt Hepatitis C

27 Detecting evolutionary bottlenecks using EBSP 480 contemporaneous samples from a single locus

28 Detecting evolutionary bottlenecks using EBSP 16 contemporaneous samples from each of 32 loci

29 Detecting evolutionary bottlenecks using EBSP 480 samples sampled through time from a single locus

30 The population dynamics of genetic diversity in Influenza A Rambaut et al (2008) Nature 453:

31 Validating the Bayesian skyline plot

32 Comparison of BSP to parametric coalescent model Hepatitis C in Egypt

33 Modeling complex demographic history Dengue 4 in Puerto Rico N(t) = N 0 exp( rt) log marginal likelihood = N(t) = scaled-translated case data log marginal likelihood =

34 Comparing BSP to incidence data Dengue 4 in Puerto Rico

35 The murky boundary between population genetics and phylogenetics There has been increased interest in analyses of closely related species, where the effect of population genetic processes, such as the coalescent can t be ignored. Different gene trees can have different topologies due to incomplete lineage sorting Divergence times of species can be overestimated due to ancestral polymorphism Sometimes the exact species identities of individuals are not known Sometimes researchers identify species based on a split in a single gene tree. Enter the multi-species coalescent.

36 All these patterns are the consequence of Òlineage driftó within a single population

37 Gene trees and species trees

38 Why Ancestral Polymorphisms?

39 Probabilities of Monophyly, Paraphyly and Polyphyly

40 Problems with estimation of divergence times Typically we use gene phylogenies to estimate species phylogenies But divergence time for genes will be longer than that of species.

41 Problems with estimation of divergence times From Edwards and Beerli, 2000, Evolution 54:

42 The multispecies coalescent

43 Population sizes on a species tree Define N i to be the effective population size at the present for the species i {1, 2,..., n}, and A i the ancestral effective population of species i at the time of the species origin. Then for all ancestral species j {n + 1, n + 2,..., 2n} (represented by internal branches in the species tree) define N j = A left(j) + A right(j), where left(i) is the left descendent of species i and right(i) is the right descendent species. A k Exp(Θ), 1 k 2n N i Gamma(2, Θ), 1 i n N j = A left(j) + A right(j), n < j < 2n

44 Population sizes prior

45 Species divergence times prior For a species tree of n species, define T i to be the time at which the species tree goes from having i to i 1 species, back in time. Additionally define τ i = T i T i+1, i {2,..., n 1} and τ n = T n. The Yule speciation prior supposes a uniform rate species birth (λ) on all lineages, implying a prior of: τ i Exp(1/iλ) More complex species tree priors that admit species extinction (Birth-death prior; Gernhard, 2008) and incomplete sampling (Birth-death-sampling; Stadler, 2009) are also possible. All of these species tree priors imply a uniform prior on labelled histories.

46 Coalescent prior for gene trees Consider a single species in the species tree, spanned by k = u v coalescent intervals (and a final interval without a coalescent event). t k is the time during which there are k lineages. Define N(s) as the population size of this species at time s. Define s i = ik=u t k. The prior density for each interval ending in a coalescent is: f (t k ) = 1 N(s k ) exp ˆs k s k 1 ( n 2) N(x) dx

47 Coalescent prior for gene trees The coalescent prior density for the final interval that does not end in a coalescent event is: ˆs v ( n f (t v ) = exp 2) N(x) dx s v 1 Define f g (g S) to be the total coalescent density for gene tree g, obtained from the product of all the intervals over all species in the species tree (S).

48 Posterior probability of multi-species coalescent If we additionally define Pr(D g g) as the likelihood of the sequence data for gene tree g, and f S (S λ, Θ) as the prior on the species tree times and population sizes, then the posterior distribution over gene trees and species trees looks like: h(g, S D) = 1 Z [Pr(D g g)f g (g S)] f S (S λ, Θ)f Λ (λ)f Θ (Θ) g G where f Λ (λ) is the prior for the speciation rate and f Θ (Θ) is the prior on the ancestral population size parameter.

49 *BEAST Coestimate Species tree, Gene trees, Population sizes and all other parameters. Any combination of number of individual and genes from each individual. Gene trees estimated using any BEAST models, any type of linkage between parameters. For example, all mutations rates may be equal or separate.

50 *BEAST Given data D we define the posterior distribution of the species tree S as follows, ˆ ( n ) P(S D) P(d i g i )P(g i S) P(S)dG (1) G i=1 The data D = d 1, d 2,..., d n is composed of n alignments, one per locus. G = (G 1 G 2 G n ) is the space of all gene trees over the respective alignments where g i G i is one specific gene tree on the i th alignment. The term P(d i g i ) is the Felsenstein likelihood of the data given a gene tree, P(g i S) is the multispecies coalescent and P(S) is a prior distribution on the space of species trees. Like most coalescent models this assumes no recombination within loci and free recombination between loci.

51 Population sizes prior

52 Four gene trees inside a 3-species tree

53 Simulating a rapid radiation of 7 species Repeat 100 times: 1 Simulate a random species tree of 7 extant species from Birth-death process 2 Simulate gene trees in species tree (1, 2, 3,... genes) 3 Simulate sequences down each gene tree (400, 800, 1600 nucleotides) Species tree Species tree and one gene tree

54 Gene concatenation - a terrible idea For each simulated data set we concatenated the gene alignments and estimated a (multi-individual) tree. Species tree Gene tree from concatenated genes The concatenated tree always had > 99% posterior support on all nodes. Despite this, in 72 out of 100 simulations (4 loci), there was no pruning (down to 1 individual per species) of the concatenated tree that was consistent with the true species tree. Concatenation is positively misleading!

55

56

57

58

59

60 Comparing *BEAST and BEST and Supermatrix methods 100 simulated species trees (each with 7 species) with four individuals per species and four loci. topology inside 95% mean 95% size normalized branch score *BEAST BEST Supermatrix tree score speciation time inside 95% speciation time error/ci size *BEAST % 0.41/1.49 BEST % 1.32/2.06 Supermatrix N/A 0.6% 21.12/5.28 population size inside 95% population size error/ci size *BEAST 98% 0.33/1.11 BEST 56% 0.59/2.15 Supermatrix N/A N/A

61 Pocket Gophers Data from (Belfiore, 2008) 27 individuals, 7 loci (12 from T. bottae, 23 from others, 1 from outgroup)

62

63

64 Open questions Are there better priors for the population sizes? What are efficient proposal distributions for MCMC on the multispecies coalescent? What about uncertain species identification? How do we characterize the prior distribution on species associations? (geography, morphology...) What about uncertain numbers of species?! How do we characterize the hypothesis space over a species trees with a random number of species, and gene tree tips with an uncertain species identity? Reversible-jump, yes, but what proprosal distributions? ;-) What about Bayesian stochastic variable selection?

Bayesian coalescent inference of population size history

Bayesian coalescent inference of population size history Bayesian coalescent inference of population size history Alexei Drummond University of Auckland Workshop on Population and Speciation Genomics, 2016 1st February 2016 1 / 39 BEAST tutorials Population

More information

Hierarchical Bayesian Modeling of the HIV Response to Therapy

Hierarchical Bayesian Modeling of the HIV Response to Therapy Hierarchical Bayesian Modeling of the HIV Response to Therapy Shane T. Jensen Department of Statistics, The Wharton School, University of Pennsylvania March 23, 2010 Joint Work with Alex Braunstein and

More information

A Step-by-Step Tutorial: Divergence Time Estimation with Approximate Likelihood Calculation Using MCMCTREE in PAML

A Step-by-Step Tutorial: Divergence Time Estimation with Approximate Likelihood Calculation Using MCMCTREE in PAML 9 June 2011 A Step-by-Step Tutorial: Divergence Time Estimation with Approximate Likelihood Calculation Using MCMCTREE in PAML by Jun Inoue, Mario dos Reis, and Ziheng Yang In this tutorial we will analyze

More information

A Rough Guide to BEAST 1.4

A Rough Guide to BEAST 1.4 A Rough Guide to BEAST 1.4 Alexei J. Drummond 1, Simon Y.W. Ho, Nic Rawlence and Andrew Rambaut 2 1 Department of Computer Science The University of Auckland, Private Bag 92019 Auckland, New Zealand alexei@cs.auckland.ac.nz

More information

Bayesian Phylogeny and Measures of Branch Support

Bayesian Phylogeny and Measures of Branch Support Bayesian Phylogeny and Measures of Branch Support Bayesian Statistics Imagine we have a bag containing 100 dice of which we know that 90 are fair and 10 are biased. The

More information

Divergence Time Estimation using BEAST v1.7.5

Divergence Time Estimation using BEAST v1.7.5 Divergence Time Estimation using BEAST v1.7.5 Central among the questions explored in biology are those that seek to understand the timing and rates of evolutionary processes. Accurate estimates of species

More information

DnaSP, DNA polymorphism analyses by the coalescent and other methods.

DnaSP, DNA polymorphism analyses by the coalescent and other methods. DnaSP, DNA polymorphism analyses by the coalescent and other methods. Author affiliation: Julio Rozas 1, *, Juan C. Sánchez-DelBarrio 2,3, Xavier Messeguer 2 and Ricardo Rozas 1 1 Departament de Genètica,

More information

Approximating the Coalescent with Recombination. Niall Cardin Corpus Christi College, University of Oxford April 2, 2007

Approximating the Coalescent with Recombination. Niall Cardin Corpus Christi College, University of Oxford April 2, 2007 Approximating the Coalescent with Recombination A Thesis submitted for the Degree of Doctor of Philosophy Niall Cardin Corpus Christi College, University of Oxford April 2, 2007 Approximating the Coalescent

More information

BASIC STATISTICAL METHODS FOR GENOMIC DATA ANALYSIS

BASIC STATISTICAL METHODS FOR GENOMIC DATA ANALYSIS BASIC STATISTICAL METHODS FOR GENOMIC DATA ANALYSIS SEEMA JAGGI Indian Agricultural Statistics Research Institute Library Avenue, New Delhi-110 012 seema@iasri.res.in Genomics A genome is an organism s

More information

People have thought about, and defined, probability in different ways. important to note the consequences of the definition:

People have thought about, and defined, probability in different ways. important to note the consequences of the definition: PROBABILITY AND LIKELIHOOD, A BRIEF INTRODUCTION IN SUPPORT OF A COURSE ON MOLECULAR EVOLUTION (BIOL 3046) Probability The subject of PROBABILITY is a branch of mathematics dedicated to building models

More information

DNA Sequence Alignment Analysis

DNA Sequence Alignment Analysis Analysis of DNA sequence data p. 1 Analysis of DNA sequence data using MEGA and DNAsp. Analysis of two genes from the X and Y chromosomes of plant species from the genus Silene The first two computer classes

More information

Introduction to Phylogenetic Analysis

Introduction to Phylogenetic Analysis Subjects of this lecture Introduction to Phylogenetic nalysis Irit Orr 1 Introducing some of the terminology of phylogenetics. 2 Introducing some of the most commonly used methods for phylogenetic analysis.

More information

PRINCIPLES OF POPULATION GENETICS

PRINCIPLES OF POPULATION GENETICS PRINCIPLES OF POPULATION GENETICS FOURTH EDITION Daniel L. Hartl Harvard University Andrew G. Clark Cornell University UniversitSts- und Landesbibliothek Darmstadt Bibliothek Biologie Sinauer Associates,

More information

Imperfect Debugging in Software Reliability

Imperfect Debugging in Software Reliability Imperfect Debugging in Software Reliability Tevfik Aktekin and Toros Caglar University of New Hampshire Peter T. Paul College of Business and Economics Department of Decision Sciences and United Health

More information

Lecture 6: Single nucleotide polymorphisms (SNPs) and Restriction Fragment Length Polymorphisms (RFLPs)

Lecture 6: Single nucleotide polymorphisms (SNPs) and Restriction Fragment Length Polymorphisms (RFLPs) Lecture 6: Single nucleotide polymorphisms (SNPs) and Restriction Fragment Length Polymorphisms (RFLPs) Single nucleotide polymorphisms or SNPs (pronounced "snips") are DNA sequence variations that occur

More information

Evolution (18%) 11 Items Sample Test Prep Questions

Evolution (18%) 11 Items Sample Test Prep Questions Evolution (18%) 11 Items Sample Test Prep Questions Grade 7 (Evolution) 3.a Students know both genetic variation and environmental factors are causes of evolution and diversity of organisms. (pg. 109 Science

More information

Phylogenetic Trees Made Easy

Phylogenetic Trees Made Easy Phylogenetic Trees Made Easy A How-To Manual Fourth Edition Barry G. Hall University of Rochester, Emeritus and Bellingham Research Institute Sinauer Associates, Inc. Publishers Sunderland, Massachusetts

More information

Protein Protein Interaction Networks

Protein Protein Interaction Networks Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics

More information

AP Biology Essential Knowledge Student Diagnostic

AP Biology Essential Knowledge Student Diagnostic AP Biology Essential Knowledge Student Diagnostic Background The Essential Knowledge statements provided in the AP Biology Curriculum Framework are scientific claims describing phenomenon occurring in

More information

Dating Phylogenies with Sequentially Sampled Tips

Dating Phylogenies with Sequentially Sampled Tips Syst. Biol. 62(5):674 688, 2013 The Author(s) 2013. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com

More information

Principles of Evolution - Origin of Species

Principles of Evolution - Origin of Species Theories of Organic Evolution X Multiple Centers of Creation (de Buffon) developed the concept of "centers of creation throughout the world organisms had arisen, which other species had evolved from X

More information

Name Class Date. binomial nomenclature. MAIN IDEA: Linnaeus developed the scientific naming system still used today.

Name Class Date. binomial nomenclature. MAIN IDEA: Linnaeus developed the scientific naming system still used today. Section 1: The Linnaean System of Classification 17.1 Reading Guide KEY CONCEPT Organisms can be classified based on physical similarities. VOCABULARY taxonomy taxon binomial nomenclature genus MAIN IDEA:

More information

Course Descriptions. I. Professional Courses: MSEG 7216: Introduction to Infectious Diseases (Medical Students)

Course Descriptions. I. Professional Courses: MSEG 7216: Introduction to Infectious Diseases (Medical Students) Course Descriptions I. Professional Courses: MSEG 7216: Introduction to Infectious Diseases (Medical Students) This course is offered during the first semester of the second year of the MD Program. It

More information

Understanding by Design. Title: BIOLOGY/LAB. Established Goal(s) / Content Standard(s): Essential Question(s) Understanding(s):

Understanding by Design. Title: BIOLOGY/LAB. Established Goal(s) / Content Standard(s): Essential Question(s) Understanding(s): Understanding by Design Title: BIOLOGY/LAB Standard: EVOLUTION and BIODIVERSITY Grade(s):9/10/11/12 Established Goal(s) / Content Standard(s): 5. Evolution and Biodiversity Central Concepts: Evolution

More information

Innovations in Molecular Epidemiology

Innovations in Molecular Epidemiology Innovations in Molecular Epidemiology Molecular Epidemiology Measure current rates of active transmission Determine whether recurrent tuberculosis is attributable to exogenous reinfection Determine whether

More information

FAQs: Gene drives - - What is a gene drive?

FAQs: Gene drives - - What is a gene drive? FAQs: Gene drives - - What is a gene drive? During normal sexual reproduction, each of the two versions of a given gene has a 50 percent chance of being inherited by a particular offspring (Fig 1A). Gene

More information

Global Under Diagnosis of Viral Hepatitis

Global Under Diagnosis of Viral Hepatitis Global Under Diagnosis of Viral Hepatitis Mel Krajden MD, FRCPC Medical Head, Hepatitis Clinical Prevention Services Associate Medical Director, Public Health Microbiology & Reference Laboratory BC Centre

More information

Borges, J. L. 1998. On exactitude in science. P. 325, In, Jorge Luis Borges, Collected Fictions (Trans. Hurley, H.) Penguin Books.

Borges, J. L. 1998. On exactitude in science. P. 325, In, Jorge Luis Borges, Collected Fictions (Trans. Hurley, H.) Penguin Books. ... In that Empire, the Art of Cartography attained such Perfection that the map of a single Province occupied the entirety of a City, and the map of the Empire, the entirety of a Province. In time, those

More information

Protein Sequence Analysis - Overview -

Protein Sequence Analysis - Overview - Protein Sequence Analysis - Overview - UDEL Workshop Raja Mazumder Research Associate Professor, Department of Biochemistry and Molecular Biology Georgetown University Medical Center Topics Why do protein

More information

RETRIEVING SEQUENCE INFORMATION. Nucleotide sequence databases. Database search. Sequence alignment and comparison

RETRIEVING SEQUENCE INFORMATION. Nucleotide sequence databases. Database search. Sequence alignment and comparison RETRIEVING SEQUENCE INFORMATION Nucleotide sequence databases Database search Sequence alignment and comparison Biological sequence databases Originally just a storage place for sequences. Currently the

More information

Statistics Graduate Courses

Statistics Graduate Courses Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.

More information

http://www.univcan.ca/programs-services/international-programs/canadianqueen-elizabeth-ii-diamond-jubilee-scholarships/

http://www.univcan.ca/programs-services/international-programs/canadianqueen-elizabeth-ii-diamond-jubilee-scholarships/ QUANTITATIVE BIOLOGY AND MEDICAL GENETICS FOR THE WORLD PROGRAM ANNOUNCEMENT The "Quantitative Biology and Medical Genetics for the World" program at McGill is offering a number of prestigious Queen Elizabeth

More information

Lab 2/Phylogenetics/September 16, 2002 1 PHYLOGENETICS

Lab 2/Phylogenetics/September 16, 2002 1 PHYLOGENETICS Lab 2/Phylogenetics/September 16, 2002 1 Read: Tudge Chapter 2 PHYLOGENETICS Objective of the Lab: To understand how DNA and protein sequence information can be used to make comparisons and assess evolutionary

More information

SIXTY-SEVENTH WORLD HEALTH ASSEMBLY. Agenda item 12.3 24 May 2014. Hepatitis

SIXTY-SEVENTH WORLD HEALTH ASSEMBLY. Agenda item 12.3 24 May 2014. Hepatitis SIXTY-SEVENTH WORLD HEALTH ASSEMBLY WHA67.6 Agenda item 12.3 24 May 2014 Hepatitis The Sixty-seventh World Health Assembly, Having considered the report on hepatitis; 1 Reaffirming resolution WHA63.18,

More information

Summary. 16 1 Genes and Variation. 16 2 Evolution as Genetic Change. Name Class Date

Summary. 16 1 Genes and Variation. 16 2 Evolution as Genetic Change. Name Class Date Chapter 16 Summary Evolution of Populations 16 1 Genes and Variation Darwin s original ideas can now be understood in genetic terms. Beginning with variation, we now know that traits are controlled by

More information

Finding Clusters in Phylogenetic Trees: A Special Type of Cluster Analysis

Finding Clusters in Phylogenetic Trees: A Special Type of Cluster Analysis Finding lusters in Phylogenetic Trees: Special Type of luster nalysis Why try to identify clusters in phylogenetic trees? xample: origin of HIV. NUMR: Why are there so many distinct clusters? LUR04-7 SYNHRONY:

More information

Y Chromosome Markers

Y Chromosome Markers Y Chromosome Markers Lineage Markers Autosomal chromosomes recombine with each meiosis Y and Mitochondrial DNA does not This means that the Y and mtdna remains constant from generation to generation Except

More information

Hepatitis C. Eliot Godofsky, MD University Hepatitis Center Bradenton, FL

Hepatitis C. Eliot Godofsky, MD University Hepatitis Center Bradenton, FL Hepatitis C Eliot Godofsky, MD University Hepatitis Center Bradenton, FL Recent Advances in Hepatitis C Appreciation that many patients are undiagnosed Improved screening to identify infected persons Assessment

More information

Hepatitis C Infections in Oregon September 2014

Hepatitis C Infections in Oregon September 2014 Public Health Division Hepatitis C Infections in Oregon September 214 Chronic HCV in Oregon Since 25, when positive laboratory results for HCV infection became reportable in Oregon, 47,252 persons with

More information

A comparison of methods for estimating the transition:transversion ratio from DNA sequences

A comparison of methods for estimating the transition:transversion ratio from DNA sequences Molecular Phylogenetics and Evolution 32 (2004) 495 503 MOLECULAR PHYLOGENETICS AND EVOLUTION www.elsevier.com/locate/ympev A comparison of methods for estimating the transition:transversion ratio from

More information

Sequence Analysis 15: lecture 5. Substitution matrices Multiple sequence alignment

Sequence Analysis 15: lecture 5. Substitution matrices Multiple sequence alignment Sequence Analysis 15: lecture 5 Substitution matrices Multiple sequence alignment A teacher's dilemma To understand... Multiple sequence alignment Substitution matrices Phylogenetic trees You first need

More information

PHYLOGENY AND EVOLUTION OF NEWCASTLE DISEASE VIRUS GENOTYPES

PHYLOGENY AND EVOLUTION OF NEWCASTLE DISEASE VIRUS GENOTYPES Eötvös Lóránd University Biology Doctorate School Classical and molecular genetics program Project leader: Dr. László Orosz, corresponding member of HAS PHYLOGENY AND EVOLUTION OF NEWCASTLE DISEASE VIRUS

More information

Epidemiology of HCV and HIV/HCV Infection in Guangzhou, China. Title. Charles Wang, M.D.

Epidemiology of HCV and HIV/HCV Infection in Guangzhou, China. Title. Charles Wang, M.D. Epidemiology of HCV and HIV/HCV Infection in Guangzhou, China Title Charles Wang, M.D. Background Hepatitis C infects 170 million worldwide & 3 million Chinese HIV infects 33.3 million worldwide & 740,000

More information

Molecular Clocks and Tree Dating with r8s and BEAST

Molecular Clocks and Tree Dating with r8s and BEAST Integrative Biology 200B University of California, Berkeley Principals of Phylogenetics: Ecology and Evolution Spring 2011 Updated by Nick Matzke Molecular Clocks and Tree Dating with r8s and BEAST Today

More information

Algorithms in Computational Biology (236522) spring 2007 Lecture #1

Algorithms in Computational Biology (236522) spring 2007 Lecture #1 Algorithms in Computational Biology (236522) spring 2007 Lecture #1 Lecturer: Shlomo Moran, Taub 639, tel 4363 Office hours: Tuesday 11:00-12:00/by appointment TA: Ilan Gronau, Taub 700, tel 4894 Office

More information

Liver Disease and Therapy of Hepatitis B Virus Infections

Liver Disease and Therapy of Hepatitis B Virus Infections Liver Disease and Therapy of Hepatitis B Virus Infections University of Adelaide Catherine Scougall Arend Grosse Huey-Chi Low Allison Jilbert Fox Chase Cancer Center Chunxiao Xu Carol Aldrich Sam Litwin

More information

Hepatitis C Glossary of Terms

Hepatitis C Glossary of Terms Acute Hepatitis C A short-term illness that usually occurs within the first six months after someone is exposed to the hepatitis C virus (HCV). 1 Antibodies Proteins produced as part of the body s immune

More information

Chapter 8: Recombinant DNA 2002 by W. H. Freeman and Company Chapter 8: Recombinant DNA 2002 by W. H. Freeman and Company

Chapter 8: Recombinant DNA 2002 by W. H. Freeman and Company Chapter 8: Recombinant DNA 2002 by W. H. Freeman and Company Genetic engineering: humans Gene replacement therapy or gene therapy Many technical and ethical issues implications for gene pool for germ-line gene therapy what traits constitute disease rather than just

More information

Practice Questions 1: Evolution

Practice Questions 1: Evolution Practice Questions 1: Evolution 1. Which concept is best illustrated in the flowchart below? A. natural selection B. genetic manipulation C. dynamic equilibrium D. material cycles 2. The diagram below

More information

Evolution of Retroviruses: Fossils in our DNA 1

Evolution of Retroviruses: Fossils in our DNA 1 Evolution of Retroviruses: Fossils in our DNA 1 JOHN M. COFFIN Professor of Molecular Biology and Microbiology Tufts University UNIQUE AMONG INFECTIOUS AGENTS, retroviruses provide the opportunity for

More information

Forensic DNA Testing Terminology

Forensic DNA Testing Terminology Forensic DNA Testing Terminology ABI 310 Genetic Analyzer a capillary electrophoresis instrument used by forensic DNA laboratories to separate short tandem repeat (STR) loci on the basis of their size.

More information

Viral hepatitis. Report by the Secretariat

Viral hepatitis. Report by the Secretariat SIXTY-THIRD WORLD HEALTH ASSEMBLY A63/15 Provisional agenda item 11.12 25 March 2010 Viral hepatitis Report by the Secretariat THE DISEASES AND BURDEN 1. The group of viruses (hepatitis A, B, C, D and

More information

Contents. list of contributors. Preface. Basic concepts of molecular evolution 3

Contents. list of contributors. Preface. Basic concepts of molecular evolution 3 list of contributors Foreword Preface page xix xxiii XXV Section I: Introduction Basic concepts of molecular evolution 3 Anne-Mieke Vandamme 1.1 Genetic information 3 1.2 Population dynamics 9 1.3 Evolution

More information

Mutations: 2 general ways to alter DNA. Mutations. What is a mutation? Mutations are rare. Changes in a single DNA base. Change a single DNA base

Mutations: 2 general ways to alter DNA. Mutations. What is a mutation? Mutations are rare. Changes in a single DNA base. Change a single DNA base Mutations Mutations: 2 general ways to alter DNA Change a single DNA base Or entire sections of DNA can move from one place to another What is a mutation? Any change in the nucleotide sequence of DNA Here

More information

Beginner's guide to Hepatitis C testing and immunisation against hepatitis A+B in general practice

Beginner's guide to Hepatitis C testing and immunisation against hepatitis A+B in general practice Beginner's guide to Hepatitis C testing and immunisation against hepatitis A+B in general practice Dr Chris Ford GP & SMMGP Clinical Lead Kate Halliday Telford & Wrekin Shared Care Coordinator Aims Discuss:

More information

Chapter 13: Meiosis and Sexual Life Cycles

Chapter 13: Meiosis and Sexual Life Cycles Name Period Chapter 13: Meiosis and Sexual Life Cycles Concept 13.1 Offspring acquire genes from parents by inheriting chromosomes 1. Let s begin with a review of several terms that you may already know.

More information

Poor access to HCV treatment is undermining Universal Access A briefing note to the UNITAID Board

Poor access to HCV treatment is undermining Universal Access A briefing note to the UNITAID Board Poor access to HCV treatment is undermining Universal Access A briefing note to the UNITAID Board The growing crisis of HIV/HCV coinfection It is estimated that 4-5 million people living with HIV (PLHIV)

More information

Amino Acids and Their Properties

Amino Acids and Their Properties Amino Acids and Their Properties Recap: ss-rrna and mutations Ribosomal RNA (rrna) evolves very slowly Much slower than proteins ss-rrna is typically used So by aligning ss-rrna of one organism with that

More information

Section 3 Comparative Genomics and Phylogenetics

Section 3 Comparative Genomics and Phylogenetics Section 3 Section 3 Comparative enomics and Phylogenetics At the end of this section you should be able to: Describe what is meant by DNA sequencing. Explain what is meant by Bioinformatics and Comparative

More information

BAPS: Bayesian Analysis of Population Structure

BAPS: Bayesian Analysis of Population Structure BAPS: Bayesian Analysis of Population Structure Manual v. 6.0 NOTE: ANY INQUIRIES CONCERNING THE PROGRAM SHOULD BE SENT TO JUKKA CORANDER (first.last at helsinki.fi). http://www.helsinki.fi/bsg/software/baps/

More information

PROC. CAIRO INTERNATIONAL BIOMEDICAL ENGINEERING CONFERENCE 2006 1. E-mail: msm_eng@k-space.org

PROC. CAIRO INTERNATIONAL BIOMEDICAL ENGINEERING CONFERENCE 2006 1. E-mail: msm_eng@k-space.org BIOINFTool: Bioinformatics and sequence data analysis in molecular biology using Matlab Mai S. Mabrouk 1, Marwa Hamdy 2, Marwa Mamdouh 2, Marwa Aboelfotoh 2,Yasser M. Kadah 2 1 Biomedical Engineering Department,

More information

Methods for big data in medical genomics

Methods for big data in medical genomics Methods for big data in medical genomics Parallel Hidden Markov Models in Population Genetics Chris Holmes, (Peter Kecskemethy & Chris Gamble) Department of Statistics and, Nuffield Department of Medicine

More information

Molecular Diagnosis of Hepatitis B and Hepatitis D infections

Molecular Diagnosis of Hepatitis B and Hepatitis D infections Molecular Diagnosis of Hepatitis B and Hepatitis D infections Acute infection Detection of HBsAg in serum is a fundamental diagnostic marker of HBV infection HBsAg shows a strong correlation with HBV replication

More information

SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12

SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12 This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Lecture 13: DNA Technology. DNA Sequencing. DNA Sequencing Genetic Markers - RFLPs polymerase chain reaction (PCR) products of biotechnology

Lecture 13: DNA Technology. DNA Sequencing. DNA Sequencing Genetic Markers - RFLPs polymerase chain reaction (PCR) products of biotechnology Lecture 13: DNA Technology DNA Sequencing Genetic Markers - RFLPs polymerase chain reaction (PCR) products of biotechnology DNA Sequencing determine order of nucleotides in a strand of DNA > bases = A,

More information

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics. Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are

More information

Asexual Versus Sexual Reproduction in Genetic Algorithms 1

Asexual Versus Sexual Reproduction in Genetic Algorithms 1 Asexual Versus Sexual Reproduction in Genetic Algorithms Wendy Ann Deslauriers (wendyd@alumni.princeton.edu) Institute of Cognitive Science,Room 22, Dunton Tower Carleton University, 25 Colonel By Drive

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

More information

Model-based Synthesis. Tony O Hagan

Model-based Synthesis. Tony O Hagan Model-based Synthesis Tony O Hagan Stochastic models Synthesising evidence through a statistical model 2 Evidence Synthesis (Session 3), Helsinki, 28/10/11 Graphical modelling The kinds of models that

More information

PHYML Online: A Web Server for Fast Maximum Likelihood-Based Phylogenetic Inference

PHYML Online: A Web Server for Fast Maximum Likelihood-Based Phylogenetic Inference PHYML Online: A Web Server for Fast Maximum Likelihood-Based Phylogenetic Inference Stephane Guindon, F. Le Thiec, Patrice Duroux, Olivier Gascuel To cite this version: Stephane Guindon, F. Le Thiec, Patrice

More information

Building a phylogenetic tree

Building a phylogenetic tree bioscience explained 134567 Wojciech Grajkowski Szkoła Festiwalu Nauki, ul. Ks. Trojdena 4, 02-109 Warszawa Building a phylogenetic tree Aim This activity shows how phylogenetic trees are constructed using

More information

Bacteria vs. Virus: What s the Difference? Grade 11-12

Bacteria vs. Virus: What s the Difference? Grade 11-12 Bacteria vs. Virus: What s the Difference? Grade 11-12 Subject: Biology Topic: Bacteria, viruses, and the differences between them. The role that water plays in spreading bacteria and viruses, and the

More information

Introduction to Bioinformatics 3. DNA editing and contig assembly

Introduction to Bioinformatics 3. DNA editing and contig assembly Introduction to Bioinformatics 3. DNA editing and contig assembly Benjamin F. Matthews United States Department of Agriculture Soybean Genomics and Improvement Laboratory Beltsville, MD 20708 matthewb@ba.ars.usda.gov

More information

Biology 1406 - Notes for exam 5 - Population genetics Ch 13, 14, 15

Biology 1406 - Notes for exam 5 - Population genetics Ch 13, 14, 15 Biology 1406 - Notes for exam 5 - Population genetics Ch 13, 14, 15 Species - group of individuals that are capable of interbreeding and producing fertile offspring; genetically similar 13.7, 14.2 Population

More information

Comparative Drug Ranking for Clinical Decision Support in HIV Treatment

Comparative Drug Ranking for Clinical Decision Support in HIV Treatment Comparative Drug Ranking for Clinical Decision Support in HIV Treatment Emiliano Mancini University of Amsterdam 1 Prevalence of HIV 2 Global Overview HIV Infection People living with HIV: 34 million (2010

More information

Understanding West Nile Virus Infection

Understanding West Nile Virus Infection Understanding West Nile Virus Infection The QIAGEN Bioinformatics Solution: Biomedical Genomics Workbench (BXWB) + Ingenuity Pathway Analysis (IPA) Functional Genomics & Predictive Medicine, May 21-22,

More information

The Central Dogma of Molecular Biology

The Central Dogma of Molecular Biology Vierstraete Andy (version 1.01) 1/02/2000 -Page 1 - The Central Dogma of Molecular Biology Figure 1 : The Central Dogma of molecular biology. DNA contains the complete genetic information that defines

More information

The Epidemiology of Hepatitis A, B, and C

The Epidemiology of Hepatitis A, B, and C The Epidemiology of Hepatitis A, B, and C Jamie Berkes M.D. University of Illinois at Chicago jberkes@uic.edu Epidemiology: Definitions The study of the incidence and prevalence of diseases in large populations

More information

Master's projects at ITMO University. Daniil Chivilikhin PhD Student @ ITMO University

Master's projects at ITMO University. Daniil Chivilikhin PhD Student @ ITMO University Master's projects at ITMO University Daniil Chivilikhin PhD Student @ ITMO University General information Guidance from our lab's researchers Publishable results 2 Research areas Research at ITMO Evolutionary

More information

The Human Genome Project. From genome to health From human genome to other genomes and to gene function Structural Genomics initiative

The Human Genome Project. From genome to health From human genome to other genomes and to gene function Structural Genomics initiative The Human Genome Project From genome to health From human genome to other genomes and to gene function Structural Genomics initiative June 2000 What is the Human Genome Project? U.S. govt. project coordinated

More information

Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.

Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question. Name: Class: Date: Chapter 17 Practice Multiple Choice Identify the choice that best completes the statement or answers the question. 1. The correct order for the levels of Linnaeus's classification system,

More information

MAKING AN EVOLUTIONARY TREE

MAKING AN EVOLUTIONARY TREE Student manual MAKING AN EVOLUTIONARY TREE THEORY The relationship between different species can be derived from different information sources. The connection between species may turn out by similarities

More information

Genetic Technology. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.

Genetic Technology. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question. Name: Class: Date: Genetic Technology Multiple Choice Identify the choice that best completes the statement or answers the question. 1. An application of using DNA technology to help environmental scientists

More information

Epidemiology of Hepatitis C Infection. Pablo Barreiro Service of Infectious Diseases Hospital Carlos III, Madrid

Epidemiology of Hepatitis C Infection. Pablo Barreiro Service of Infectious Diseases Hospital Carlos III, Madrid Epidemiology of Hepatitis C Infection Pablo Barreiro Service of Infectious Diseases Hospital Carlos III, Madrid Worldwide Prevalence of Hepatitis C 10% No data available WHO.

More information

Maximum-Likelihood Estimation of Phylogeny from DNA Sequences When Substitution Rates Differ over Sites1

Maximum-Likelihood Estimation of Phylogeny from DNA Sequences When Substitution Rates Differ over Sites1 Maximum-Likelihood Estimation of Phylogeny from DNA Sequences When Substitution Rates Differ over Sites1 Ziheng Yang Department of Animal Science, Beijing Agricultural University Felsenstein s maximum-likelihood

More information

UPDATE ON NEW HEPATITIS C MEDICINES

UPDATE ON NEW HEPATITIS C MEDICINES UPDATE ON NEW HEPATITIS C MEDICINES Diana Sylvestre, MD 2014 AASLD/IDSA Guidelines: Gt 1 Treatment naïve Interferon eligible: sofosuvir + ribavirin + PEG-IFN x 12 weeks Interferon ineligible: sofosbuvir

More information

Introduction to Bioinformatics AS 250.265 Laboratory Assignment 6

Introduction to Bioinformatics AS 250.265 Laboratory Assignment 6 Introduction to Bioinformatics AS 250.265 Laboratory Assignment 6 In the last lab, you learned how to perform basic multiple sequence alignments. While useful in themselves for determining conserved residues

More information

HEPATITIS WEB STUDY Acute Hepatitis C Virus Infection: Epidemiology, Clinical Features, and Diagnosis

HEPATITIS WEB STUDY Acute Hepatitis C Virus Infection: Epidemiology, Clinical Features, and Diagnosis HEPATITIS WEB STUDY Acute C Virus Infection: Epidemiology, Clinical Features, and Diagnosis H. Nina Kim, MD Assistant Professor of Medicine Division of Infectious Diseases University of Washington School

More information

Bloodborne Pathogens

Bloodborne Pathogens Bloodborne Pathogens Learning Objectives By the end of this section, the participant should be able to: Name 3 bloodborne pathogens Identify potentially contaminated bodily fluids Describe 3 safe work

More information

Deterministic computer simulations were performed to evaluate the effect of maternallytransmitted

Deterministic computer simulations were performed to evaluate the effect of maternallytransmitted Supporting Information 3. Host-parasite simulations Deterministic computer simulations were performed to evaluate the effect of maternallytransmitted parasites on the evolution of sex. Briefly, the simulations

More information

Biology Final Exam Study Guide: Semester 2

Biology Final Exam Study Guide: Semester 2 Biology Final Exam Study Guide: Semester 2 Questions 1. Scientific method: What does each of these entail? Investigation and Experimentation Problem Hypothesis Methods Results/Data Discussion/Conclusion

More information

specific B cells Humoral immunity lymphocytes antibodies B cells bone marrow Cell-mediated immunity: T cells antibodies proteins

specific B cells Humoral immunity lymphocytes antibodies B cells bone marrow Cell-mediated immunity: T cells antibodies proteins Adaptive Immunity Chapter 17: Adaptive (specific) Immunity Bio 139 Dr. Amy Rogers Host defenses that are specific to a particular infectious agent Can be innate or genetic for humans as a group: most microbes

More information

REVIEWS. Computer programs for population genetics data analysis: a survival guide FOCUS ON STATISTICAL ANALYSIS

REVIEWS. Computer programs for population genetics data analysis: a survival guide FOCUS ON STATISTICAL ANALYSIS FOCUS ON STATISTICAL ANALYSIS REVIEWS Computer programs for population genetics data analysis: a survival guide Laurent Excoffier and Gerald Heckel Abstract The analysis of genetic diversity within species

More information

João Silva de Mendonça, MD, PhD Infectious Diseases Service Hospital do Servidor Público Estadual São Paulo - Brazil

João Silva de Mendonça, MD, PhD Infectious Diseases Service Hospital do Servidor Público Estadual São Paulo - Brazil Brazilian Ministry of Health, Brazilian Society of Hepatology PAHO, Viral Hepatitis Prevention Board PART III SESSION 9 Prevention and control of Viral Hepatitis in Brazil HEPATITIS IN SPECIAL RISK GROUPS/

More information

Graph theoretic approach to analyze amino acid network

Graph theoretic approach to analyze amino acid network Int. J. Adv. Appl. Math. and Mech. 2(3) (2015) 31-37 (ISSN: 2347-2529) Journal homepage: www.ijaamm.com International Journal of Advances in Applied Mathematics and Mechanics Graph theoretic approach to

More information

MUTATION, DNA REPAIR AND CANCER

MUTATION, DNA REPAIR AND CANCER MUTATION, DNA REPAIR AND CANCER 1 Mutation A heritable change in the genetic material Essential to the continuity of life Source of variation for natural selection New mutations are more likely to be harmful

More information

Bayesian Statistical Analysis in Medical Research

Bayesian Statistical Analysis in Medical Research Bayesian Statistical Analysis in Medical Research David Draper Department of Applied Mathematics and Statistics University of California, Santa Cruz draper@ams.ucsc.edu www.ams.ucsc.edu/ draper ROLE Steering

More information

Genetics Lecture Notes 7.03 2005. Lectures 1 2

Genetics Lecture Notes 7.03 2005. Lectures 1 2 Genetics Lecture Notes 7.03 2005 Lectures 1 2 Lecture 1 We will begin this course with the question: What is a gene? This question will take us four lectures to answer because there are actually several

More information

Modeling the Distribution of Environmental Radon Levels in Iowa: Combining Multiple Sources of Spatially Misaligned Data

Modeling the Distribution of Environmental Radon Levels in Iowa: Combining Multiple Sources of Spatially Misaligned Data Modeling the Distribution of Environmental Radon Levels in Iowa: Combining Multiple Sources of Spatially Misaligned Data Brian J. Smith, Ph.D. The University of Iowa Joint Statistical Meetings August 10,

More information