DNA Barcoding in Plants: Biodiversity Identification and Discovery



DNA Barcoding in Plants: Biodiversity Identification and Discovery University of Sao Paulo December 2009 W. John Kress Department of Botany National Museum of Natural History Smithsonian Institution

New Technologies for Taxonomy DNA Barcodes

UNITED STATES NATIONAL HERBARIUM 4.7 Million Specimens

NATIONAL MUSEUM OF NATURAL HISTORY 124 Million Specimens

DNA Barcodes A short universal gene sequence taken from a standardized portion of the genome used to identify species

Uses of DNA Barcodes
1. Research tool for taxonomists:
*To aid identification of species
*To expand species diagnoses to all life history stages, including fruits, seeds, dimorphic sexes, damaged specimens, gut contents, scats
*To test consistency of species definitions with a DNA measure of variability
2. Applied tool for users of taxonomy:
*To identify regulated species, including invasives
*To test purity and identity of biological products
*To assist ecologists in field studies of poorly known organisms
3. Discovery tool:
*To flag potentially new species, especially undescribed and cryptic species

The Barcoding Process - 2 parts
1. Populate the barcode library with known species
*Collect tissue from voucher specimen
*Extract DNA
*PCR/Amplify/cycle sequence gene(s)
*Sequence
*Database
2. BLAST an unidentified specimen against the barcode library
*Sequence comparison
*New searching technologies
*Ultimately - handheld device?
3. Put barcode sequences to work to answer compelling scientific questions
*Ecological forensics
*Community ecology and phylogenetics
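As a rough illustration of the library-then-query idea on this slide, the sketch below matches an unidentified sequence against a small reference library. It is a toy: it scores matches by shared k-mers (Jaccard similarity) rather than running BLAST, and the species labels, sequences, and function names are all made up for the example.

```python
def kmers(seq, k=4):
    """Return the set of all k-length substrings of a DNA sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def identify(query, library, k=4):
    """Return (name, score) for the library entry sharing the most k-mers.

    The score is the Jaccard similarity of the two k-mer sets. This is a
    simple stand-in for a real similarity search such as BLAST.
    """
    q = kmers(query, k)
    best_name, best_score = None, -1.0
    for name, ref in library.items():
        r = kmers(ref, k)
        union = q | r
        score = len(q & r) / len(union) if union else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical reference library (part 1) and unidentified specimen (part 2).
library = {
    "Species A": "ATGCGTACGTTAGCCTAGGCTAACGT",
    "Species B": "ATGCGTTCGTTAGGCTAGGATAACGA",
    "Species C": "TTGACCGGTAACCGTTAGGCCTTAAG",
}
unknown = "ATGCGTACGTTAGCCTAGGCTAACGA"

name, score = identify(unknown, library)
print(name, round(score, 2))
```

A production workflow would query a curated barcode database with an alignment-based search; the toy only shows the shape of the lookup.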

Smithsonian's National Museum of Natural History Caribbean Sponges

DNA Barcode Pipeline: Select plant material, DNA Extraction, PCR, Robotic Sequencing, Finished Barcode, Data Editing, Library

The Primary Choice for Barcoding in Animals: the Mitochondrial Genome
[Figure: map of the mitochondrial genome (H-strand and L-strand) with labeled regions: D-Loop, Cyt b, small and large ribosomal RNA, ND1, ND2, ND3, ND4, ND4L, ND5, ND6, COI, COII, COIII, ATPase subunits 6 and 8]

What about Plants?
Why were plants behind?
*Finding the right gene regions
*Mobilizing a consensus in the botanical community
Finally:
*Consensus on gene regions
*Moving ahead

Criteria for DNA Barcoding
*Contains sufficient variation to discriminate between species
*Conserved flanks for universal primers: all land plants
*Short, 300-800 bp: limited by current sequencing technology, cost consideration (= 1 read length), and ability to use degraded samples
*Sequence quality

Three Genomes of Plant Cells for Barcode Candidates
Chloroplast
*High copy number
*Conserved structure
*Diversity of substitution rates across genes, introns, and intergenic spacers
Nuclear
*Contain the most variable loci
*Problems with multigene families
*Single-copy genes often technically difficult
Mitochondrial
*Locus of choice for animal barcoding is mitochondrial COI
*Limitations with plants
-Low divergence
-Rapid genome rearrangements

Atropa vs. Nicotiana Chloroplast Genomes Complete (Schmitz-Linneweber et al. 2002)

Atropa vs. Nicotiana Chloroplast Genomes 1% divergence

Atropa vs. Nicotiana Chloroplast Genomes 2% divergence: trnL-F, trnV-atpE, atpB-rbcL, psbM-trnD, ycf6-psbM, trnC-ycf6, trnK-rps16, rpl36-rps8, trnH-psbA

Top Plant Barcode Candidate: Intergenic Spacer trnH-psbA
CRITERIA FOR BARCODING
*Short, 300-800 bp: trnH-psbA = 450 bp
*Conserved flanks for universal primers: trnH-psbA = 93-100% success
*Contains sufficient variation to discriminate between species: trnH-psbA = 1.17%
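Two of the criteria above are easy to check numerically: region length and percent sequence divergence. The sketch below assumes divergence means the uncorrected proportion of differing sites between two aligned sequences, with gap columns skipped. The slides do not say how the 1.17% figure was computed, so that choice, the function names, and the toy sequences are illustrative assumptions.

```python
def p_distance(seq_a, seq_b):
    """Percent of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * differing / compared


def meets_length_criterion(length_bp, low=300, high=800):
    """True if a region length falls in the 300-800 bp window from the slide."""
    return low <= length_bp <= high


# Toy aligned fragments (not real data): one mismatch in ten sites.
print(p_distance("ACGTACGTAC", "ACGTTCGTAC"))   # 10.0
print(meets_length_criterion(450))              # True
```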

A SINGLE-LOCUS PLANT BARCODE
Option #1: Best Candidate Plastid Non-Coding trnH-psbA
Many Other Regions Proposed: accD, matK, ndhJ, rbcL, rpoC1, rpoB2, trnL, YCF5, UPA, ITS, CO1

SAMPLING AND PCR SUCCESS: 39 Orders of Land Plants

A SINGLE-LOCUS PLANT BARCODE: Comparative Results

A TWO-LOCUS PLANT BARCODE Hierarchical and Complementary rbcL = the Anchor (Plastid Coding Gene) + trnH-psbA = the Identifier (Plastid Noncoding Spacer)

INTERGENIC SPACERS
Indels, Alignment, and Repeats: Problems or Assets?
*Spacers for Identification (and local-scale phylogenetics)
*Indels as added characters for ID
*Partial sequences are useful
*New Informatics Tools for Searching the Reference Database
*New technologies for solving problems
Indel variation in segment of trnH-psbA spacer among 57 species
Do we need a coding gene?

An Alternative Two-Locus Plant Barcode CBOL Plant Working Group - 2009
Conclusion: rbcL + matK with trnH-psbA & other spacers as alternative barcodes
[Figure: Universality vs. Discrimination; 156 Cryptogams, 81 Gymnosperms, 170 Angiosperms]

A THREE-LOCUS PLANT BARCODE Hierarchical and Complementary rbcL = the Anchor (Plastid Coding Gene) + trnH-psbA = the Identifier (Plastid Noncoding Spacer) + matK (Plastid Coding Gene)

Major Medicinal Plants of the World: An Applied Test of DNA Barcoding What is a medicinal plant? We used a consensus of four sources that list medicinal plants, primarily: World Economic Plants - A Standard Reference

Major Medicinal Plants of the World: An Applied Test of DNA Barcoding
How we assembled our set:
*Selected ~1150 species
*Requested USDA germplasm
*USBG living collection
*Local gardens
*NMNH herbarium
What we have:
*768 species
*>168 Genera
*113 Plant Families
*4 accessions per species

Major Medicinal Plants of the World: An Applied Test of DNA Barcoding
Two-locus approach: create backbone of tree with rbcL as the Anchor; then separate individual species in smaller groups with trnH-psbA as the Identifier
Lamiales: Mentha (rbcL Anchor, trnH-psbA Identifier)
Results: >94% success with rbcL/trnH-psbA
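The anchor-then-identifier logic can be sketched as a two-step lookup: the conserved anchor locus picks a group, then the variable identifier locus picks a species within that group. The sketch below is a minimal toy under stated assumptions. The talk builds a tree rather than scoring identity, and the group labels, species labels, and sequences here are invented for illustration.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def two_locus_identify(query, references):
    """Assign a query to (group, species) in two steps.

    Step 1: the anchor locus selects the closest group.
    Step 2: the identifier locus separates species inside that group.
    """
    best_anchor = max(references,
                      key=lambda r: identity(query["anchor"], r["anchor"]))
    group = best_anchor["group"]
    candidates = [r for r in references if r["group"] == group]
    best = max(candidates,
               key=lambda r: identity(query["identifier"], r["identifier"]))
    return group, best["species"]


# Hypothetical references: two species share one anchor sequence and
# differ only at the identifier locus.
references = [
    {"species": "Mentha sp. 1", "group": "Mentha",
     "anchor": "ACGTACGTAC", "identifier": "TTGACCAGTA"},
    {"species": "Mentha sp. 2", "group": "Mentha",
     "anchor": "ACGTACGTAC", "identifier": "TTGTCCAGAA"},
    {"species": "Outgroup sp. 1", "group": "Outgroup",
     "anchor": "ACGAACCTAC", "identifier": "GGCATTAGCA"},
]
query = {"anchor": "ACGTACGTAC", "identifier": "TTGTCCAGAT"}

print(two_locus_identify(query, references))
```

The anchor alone cannot separate the two Mentha entries here, which is the situation the identifier locus is meant to resolve.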

50-ha Forest Dynamics Plot on Barro Colorado Island, Panama
Vital statistics of BCI
*Island in Panama Canal
*Premier Ecological Plot
*296 tree species
*1035 specimens (~3 accessions/species)
*180 Genera
*49 Families
*~50% of genera have one species = easy test of barcoding
Why DNA Barcoding on BCI?
Species identification
*forensic/ecological
Phylogenetic applications
*species/community phylogenies
*functional trait mapping

50-ha Forest Dynamics Plots Field Information Management System Collection Data Tab Geographic Data Tab Tissue Data Tab

50-ha Forest Dynamics Plot on Barro Colorado Island, Panama
Barcode Success
Locus        PCR   Seq   ID Freq
trnH-psbA*   98%   95%   95%
matK         85%   69%   99%
rbcLa        94%   94%   75%
*Note: ~8% of sequences are partial

50-ha Forest Dynamics Plot on Barro Colorado Island, Panama
Species Identification = BLAST (Basic Local Alignment Search Tool)
*Designed to search for similarity among sequences
*Can quantify rates of resolution
*Use 281 barcode sequences as both library and query
RESULTS
*rbcLa + trnH-psbA + matK: 98% of all samples could be assigned to correct Species
*All ambiguity was in 4 genera: Psychotria, Ficus, Inga, Piper
*100% of sequences were assigned to correct Genus
*Partial sequences were assigned correctly
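One way to picture the "library as both library and query" test: each barcode is searched against the full set, and it counts as resolved only if its best-scoring hits all come from its own species. The sketch below assumes that scoring rule and uses plain positional identity on equal-length toy sequences in place of BLAST. The exact criterion used in the study is not given on the slide, and the data are invented.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def resolution_rate(barcodes):
    """Fraction of queries whose top-scoring hits belong only to their own species.

    barcodes is a list of (species, sequence) pairs used as both the
    reference library and the query set.
    """
    resolved = 0
    for species, seq in barcodes:
        scores = [(identity(seq, ref), ref_species)
                  for ref_species, ref in barcodes]
        top = max(score for score, _ in scores)
        top_species = {sp for score, sp in scores if score == top}
        if top_species == {species}:
            resolved += 1
    return resolved / len(barcodes)


# Toy data: Sp C and Sp D share an identical barcode, so neither resolves.
barcodes = [
    ("Sp A", "ACGTACGT"),
    ("Sp B", "ACGTTCGT"),
    ("Sp C", "GGGTACCA"),
    ("Sp D", "GGGTACCA"),
]
print(resolution_rate(barcodes))  # 0.5
```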

Barcodes and Forensic Ecology

Barcodes and Community Ecology The Components of Biodiversity Swenson 2009

Building a Community Phylogeny with Phylomatic Phylogenetically clustered = High Plateau, Low Plateau and Young Habitats Phylogenetically Overdispersed = Swamp and Slope Habitats Phylogenetically Random = Stream and Mixed Habitats

Building a Community Phylogeny with Barcodes: A Supermatrix of rbcL, matK, and trnH-psbA
rbcLa
*aligns unambiguously
matK
*aligned with backtranslation (AA)
trnH-psbA
*aligned within ORDERS (Muscle), then orders placed within rbcLa alignment with missing data coded for other Orders (MacClade)
Trees
*constructed with Parsimony (PAUP) and ML (Garli: GTR+I+Γ)
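The core of a supermatrix is concatenation: per-locus alignments are joined taxon by taxon, and a taxon missing from a locus gets a block of missing-data characters of that locus's width. The sketch below shows that step only, assuming each locus is already aligned and that "?" codes missing data. The function name and toy alignments are illustrative and are not taken from the MacClade workflow in the talk.

```python
def build_supermatrix(alignments, missing="?"):
    """Concatenate per-locus alignments into one row per taxon.

    alignments maps locus name -> {taxon: aligned sequence}. Taxa absent
    from a locus are padded with the missing-data symbol.
    """
    taxa = sorted({t for aln in alignments.values() for t in aln})
    matrix = {t: "" for t in taxa}
    for locus, aln in alignments.items():
        lengths = {len(s) for s in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{locus}: sequences are not aligned to one length")
        width = lengths.pop()
        for t in taxa:
            matrix[t] += aln.get(t, missing * width)
    return matrix


# Toy alignments (not real data); taxon B lacks the spacer, taxon C lacks matK.
alignments = {
    "rbcL": {"A": "ACGT", "B": "ACGA", "C": "ACGG"},
    "matK": {"A": "TTA", "B": "TTG"},
    "trnH-psbA": {"A": "GG-C", "C": "GGAC"},
}
matrix = build_supermatrix(alignments)
for taxon, row in matrix.items():
    print(taxon, row)
```

The resulting rows all have the same length, so they can be written out in whatever format the tree-building program expects.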

50-ha Forest Dynamics Plot on BCI, Panama (281 species): Community Phylogeny using a Supermatrix Approach with rbcL/trnH-psbA/matK

A Comparison of Ordinal and Family Relationships on BCI: Asterids
50-ha Forest Dynamics Plot on BCI, Panama (281 species): Community Phylogeny of 23 Orders using a Supermatrix Approach with rbcL/trnH-psbA/matK

Barcodes vs. Phylomatic
50-ha Forest Dynamics Plot on BCI, Panama (281 species): Community Phylogeny of 23 Orders using a Supermatrix Approach with rbcL/trnH-psbA/matK
Overall Rubiaceae Tree: < 50% resolution vs >97% resolution

Barcodes vs. Phylomatic
50-ha Forest Dynamics Plot on BCI, Panama (281 species): Community Phylogeny using a Supermatrix Approach with rbcL/trnH-psbA/matK
Phylomatic Phylogeny:
*Phylogenetically clustered = High Plateau, Low Plateau and Young Habitats
*Phylogenetically Overdispersed = Swamp and Slope Habitats
*Phylogenetically Random = Stream and Mixed Habitats
Barcode Phylogeny:
*Phylogenetically clustered = Low Plateau and Slope Habitats
*Phylogenetically Overdispersed = High Plateau, Mixed and Young Habitats
*Phylogenetically Random = Stream and Swamp Habitats
Net Relatedness Index (NRI)
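The Net Relatedness Index is usually computed as a standardized effect size of mean pairwise phylogenetic distance (MPD): NRI = -(MPD observed - mean of null MPD) / (standard deviation of null MPD). Positive values indicate clustering and negative values indicate overdispersion. The sketch below assumes a simple null model, random draws of the same number of taxa from the species pool. The slides do not state which null model was used, and the toy distances and names are invented.

```python
import random
from itertools import combinations
from statistics import mean, stdev


def mpd(taxa, dist):
    """Mean pairwise distance among the taxa in one community."""
    return mean(dist[frozenset(pair)] for pair in combinations(sorted(taxa), 2))


def nri(community, pool, dist, n_null=999, seed=0):
    """Net Relatedness Index under a random-draw null model (an assumption).

    Positive = phylogenetically clustered, negative = overdispersed.
    """
    rng = random.Random(seed)
    observed = mpd(community, dist)
    null = [mpd(rng.sample(pool, len(community)), dist) for _ in range(n_null)]
    spread = stdev(null)
    if spread == 0:
        raise ValueError("null distribution has no variance")
    return -(observed - mean(null)) / spread


# Toy pool: two clades; distance 2 within a clade, 10 between clades.
pool = ["a1", "a2", "a3", "b1", "b2", "b3"]
dist = {frozenset((x, y)): (2 if x[0] == y[0] else 10)
        for x, y in combinations(pool, 2)}

print(nri(["a1", "a2", "a3"], pool, dist) > 0)   # clustered community
print(nri(["a1", "b1", "b2"], pool, dist) < 0)   # overdispersed community
```

Because NRI depends on the distances in the tree, the same habitat can score differently on a coarsely resolved tree and a well-resolved one, which is the contrast the slide draws between the two phylogenies.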

50-ha Forest Dynamics Plot on BCI, Panama (281 species): Community Phylogeny using a Supermatrix Approach with rbcL/trnH-psbA/matK Functional Trait Analysis

Phylogenies and Community Ecology Community Assembly, Productivity, Stability, Functional Trait Evolution Swenson 2009

Center for Tropical Forest Science, Smithsonian Institution Global Earth Observatories (SIGEO)
Smithsonian Tropical Research Institute
A global program of long-term forest research: monitoring the impact of climate change
22 Established Sites (Black), 12 Candidate Sites (Blue), Barcoding Initiated (Red)
Purpose:
*Forest Dynamics
*Climate Change
*Conservation
Expanding the network!

Center for Tropical Forest Science, Smithsonian Institution Global Earth Observatories (SIGEO)

DNA Barcoding in Plants: Biodiversity Identification and Discovery
Dave Erickson, Ken Wurdack, Liz Zimmer, Dan Janzen, Lee Weigt, Ling Zhang, Nate Swenson, Andy Jones, Oris Sanjur, Jamie Whitaker, Ida Lopez, Stuart Davies, Joe Wright, Biff Bermingham, Scott Miller
W. John Kress Department of Botany National Museum of Natural History Smithsonian Institution University of Sao Paulo December 2009