GenomeTrakr: A Pathogen Database. Marc W. Allard, PhD, Senior Biomedical Research Services Officer, Division of Microbiology. Eric W. Brown, PhD, Director, Division of Microbiology. FAO Expert workshop to develop case studies on the use of Whole Genome Sequencing (WGS) on food safety management. Nov. 12, 2015
Outline: Technology shift. GenomeTrakr: reference database and pathogen detection pipeline. Benefits to industry, growers, and distributors.
Current method for pathogen identification: 1. Antigens are screened to identify the serovar. 2. PFGE: the genome is cut into pieces; the sizes of these pieces and the banding patterns they form determine discrimination within a serovar. PulseNet: http://www.cdc.gov/pulsenet/
PFGE vs. WGS: WGS is high resolution; 3-5 million data points are collected for each isolate. WGS analyses are statistically robust. Unlike PFGE patterns, WGS data can be analyzed in their evolutionary context. Accurate and stable genetic changes within pathogen genomes enable us to pinpoint specific common sources of outbreak strains (farms, processing plants, food types, and geographic regions). Source tracking is the key application.
NGS distinguishes geographical structure among closely related Salmonella Bareilly strains. (Figure: isolates with identical PFGE patterns shown in red.)
SNP phylogeny for S. Bareilly strains. (Figure labels: same PFGE but not part of the outbreak; outbreak isolates, 2-5 SNPs.)
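The resolution argument on these slides can be illustrated with a small sketch. The Python below is not the CFSAN SNP Pipeline; it is a minimal illustration, assuming the isolates have already been aligned to a common reference, of how pairwise SNP differences might be counted. The isolate names and the short sequences are invented for illustration only.

```python
from itertools import combinations
from typing import Dict, Tuple


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count aligned positions where two sequences differ, skipping gaps and Ns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    ignore = {"-", "N"}
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignore and b not in ignore
    )


def pairwise_distances(alignment: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """Return the SNP distance for every pair of isolates, keyed by sorted name pair."""
    return {
        (x, y): snp_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment (real genomes span millions of positions).
alignment = {
    "outbreak_1": "ACGTACGTAC",
    "outbreak_2": "ACGTACGTAT",
    "same_pfge_unrelated": "ATGTCCGAAT",
}

distances = pairwise_distances(alignment)
for pair, dist in distances.items():
    print(pair, dist)
```

In this toy data the two outbreak isolates differ at a single position, while the isolate that shares their PFGE pattern sits several SNPs away, which is the kind of separation the phylogeny slide describes.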
S. Braenderup
Technology shift. DNA sequencing: bases of DNA (ATGC) are sequentially identified from a DNA template strand. Next Generation Sequencing (NGS) extends this process across millions of reactions in a massively parallel fashion. NGS involves rapid sequencing of large DNA stretches spanning entire genomes: 3-5 million data points for each isolate. Increasing availability and affordability of NGS are rapidly changing the face of microbiology. (Figure: cost per bacterial genome, 2007-2013, on a $0-$3,500 axis, annotated "FDA 1st Desk-top"; $70/genome in 2014 and $40/genome in 2015 with higher-throughput technology.)
GenomeTrakr Fast Facts: First distributed network of labs to utilize WGS for pathogen identification. The GenomeTrakr network has sequenced more than 40,000 isolates and closed more than 100 genomes through November 12, 2015. Currently sequencing more than 1,000 isolates a month. The need for an increased number of well-characterized environmental (food, water, facility, etc.) sequences may outweigh the need for extensive clinical samples.
GenomeTrakr Labs: 14 federal labs; 14 state and university labs; 1 U.S. hospital lab; 5 labs outside of the U.S. Collaborations with independent academic researchers. More GenomeTrakr labs coming online.
Total Number of Sequences in the GenomeTrakr Database. First sequences uploaded in Feb 2013. Average number of sequences added per month in 2013 = 184; in 2014 = 1,049. Public Health England uploads more than 8,000 Salmonella sequences. (Figure: number of sequences as of the last day of each quarter, 2013-2015.)
Timeline for Foodborne Illness Investigation Using Whole Genome Sequencing. FDA, CDC, FSIS, and States use WGS in real time and in parallel on clinical, food, and environmental samples. Source of contamination identified early through WGS combined database queries. (Figure: number of cases, 0-40, versus days, 0-68; annotations: contaminated food enters commerce; averted illnesses.)
MINIMAL PATHOGEN METADATA (FOODBORNE OUTBREAKS)
What: sample_name; organism; strain/isolate; Category (attribute_package):
    1a) Clinical/Host-associated: 1a1) specific_host; 1a2) isolation_source; 1a3) host-disease
    OR
    1b) Environmental/Food/Other: 1b1) isolation_source
When: collection_date
Where: Geographic location: 6a) geo_loc_name OR 6b) lat_lon
Who: collected by
Food industry can hold confidential metadata linked to public records.
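A sketch of how a submitter might check a record against this minimal field list before upload. This is an illustration only, not an official FDA or NCBI validator: the dictionary key spellings (for example `host_disease`, `collected_by`), the two package labels, and the sample records are assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical labels for the two attribute packages on the slide.
CLINICAL = "clinical_host"
ENVIRONMENTAL = "environmental_food_other"


def missing_fields(record: Dict[str, str]) -> List[str]:
    """Return the minimal-metadata fields that are absent or blank in a record."""

    def has(key: str) -> bool:
        return bool(str(record.get(key, "")).strip())

    # Fields required for every record (what, when, who).
    missing = [
        key
        for key in ("sample_name", "organism", "collection_date", "collected_by")
        if not has(key)
    ]
    # Either a strain or an isolate name satisfies "strain/isolate".
    if not (has("strain") or has("isolate")):
        missing.append("strain/isolate")

    # Package-specific fields (1a versus 1b on the slide).
    package = record.get("attribute_package")
    if package == CLINICAL:
        missing += [
            key
            for key in ("specific_host", "isolation_source", "host_disease")
            if not has(key)
        ]
    elif package == ENVIRONMENTAL:
        if not has("isolation_source"):
            missing.append("isolation_source")
    else:
        missing.append("attribute_package")

    # Where: either a place name or coordinates is enough (6a OR 6b).
    if not (has("geo_loc_name") or has("lat_lon")):
        missing.append("geo_loc_name/lat_lon")
    return missing


# Invented example records.
food_record = {
    "sample_name": "example-001",
    "organism": "Salmonella enterica",
    "strain": "EX-001",
    "attribute_package": ENVIRONMENTAL,
    "isolation_source": "peanut butter",
    "collection_date": "2015-11-12",
    "geo_loc_name": "USA: FL",
    "collected_by": "Example Lab",
}
clinical_record = dict(
    food_record, attribute_package=CLINICAL, specific_host="Homo sapiens"
)

print(missing_fields(food_record))
print(missing_fields(clinical_record))
```

The food record passes, while the clinical record is flagged for the missing host-disease field, mirroring the "1a OR 1b" branching in the field list.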
Immediate impacts of WGS to industry, growers, distributors, countries, and states. Earlier intervention means: 1) a reduced amount of recalled product; 2) fewer sick patients, which means fewer lawsuits; 3) less impact overall and minimal damage to brand recognition.
Impacts to industry, growers, and distributors (continued). Regular testing throughout the network: 1) identifies specific suppliers that are introducing contaminants; 2) identifies whether a contaminant is resident in a facility or transient; 3) provides knowledge of where a contaminant is coming from, which allows industry to fix the problem based on scientific evidence. Shift costs to the supplier who has introduced the contaminant. How often is the root cause of the problem left unresolved, only to occur again at a later date?
Background: CFSAN SNP Pipeline. Intended for use by bioinformaticists (Linux). Documentation: http://snppipeline.rtfd.org Source Code: https://github.com/cfsan-Biostatistics/snp-pipeline PyPI Distribution: https://pypi.python.org/pypi/snppipeline Pettengill JB, Luo Y, Davis S, Chen Y, Gonzalez-Escalona N, Ottesen A, Rand H, Allard MW, Strain E. (2014) An evaluation of alternative methods for constructing phylogenies from whole genome sequence data: a case study with Salmonella. PeerJ 2:e620 http://dx.doi.org/10.7717/peerj.620
Molecular Epidemiology and Ecology of Multi-drug Resistance (MDR) Salmonella in Tanzania. Julius Medardus, Sokoine University of Agriculture. Wondwossen A. Gebreyes, Gebreyes.1@osu.edu
ICOPHAI GenomeTrakr partnership
Salmonella lose or gain resistance depending on the ecosystem. GIT: ACSSuT (Briggs and Fratamico, 1999)? Environment: SSu (Gebreyes and Altier, 2002; Gebreyes et al., 2004, 2009)
What triggers the recombination? Interaction between bacterial factors and chemical intervention in pig production. QAC (quaternary ammonium compounds): an important element in MDR and common in the environment. qacE is carried on integrons, and quats are commonly used as disinfectants.
Co-selection: heavy metal vs. MDR. Heavy metals in the ecosystem: Cu and Zn. Association between AMR type and HM MIC. Co-selection with MDR. Association with invasive NTS strains? Efflux pump genes: pcoA and czcD.
Association: heavy metal tolerance and MDR Salmonella. The odds ratio between copper tolerance (<20 mM) and MDR AmStTeKm was 4.6 (Chi-square=17.9; P<0.05). The odds of having a high Zn MIC (>8 mM) were 14.66 times higher in isolates with R-type AmClStSuTe than in those with R-type AmStTeKm (P<0.05). [Medardus et al., 2014]
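For readers less familiar with these statistics, the sketch below shows how an odds ratio and a Pearson chi-square statistic are computed from a 2x2 table. The counts are invented for illustration; they are not the study data, and the slide does not give the underlying table or say whether a continuity correction was applied (none is used here).

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a 2x2 table [[a, b], [c, d]]: (a/b) / (c/d) = ad / bc."""
    return (a * d) / (b * c)


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (1 degree of freedom, no continuity correction)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Hypothetical counts (rows: metal-tolerant vs. not; columns: MDR vs. not).
tolerant_mdr, tolerant_not_mdr = 30, 10
sensitive_mdr, sensitive_not_mdr = 15, 45

or_value = odds_ratio(tolerant_mdr, tolerant_not_mdr, sensitive_mdr, sensitive_not_mdr)
chi2 = chi_square_2x2(tolerant_mdr, tolerant_not_mdr, sensitive_mdr, sensitive_not_mdr)

# 3.841 is the chi-square critical value for 1 degree of freedom at P = 0.05.
significant = chi2 > 3.841

print(f"odds ratio = {or_value:.2f}, chi-square = {chi2:.2f}, P<0.05: {significant}")
```

With these made-up counts the odds of MDR are 9 times higher in the tolerant group, and the chi-square statistic exceeds the 0.05 critical value, which is the same form of statement the slide makes for copper and zinc.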
FDA GenomeTrakr partnership: 924 isolates submitted to FDA-CFSAN. Brazil (4), Ethiopia (401), Kenya (86), Mexico (63), Tanzania (64), Thailand (60), U.S. OSU (247).
Tanzania WGS: 45 food animal isolates completed. All unknown STs. Plasmid types: ColRNAI, IncI1, IncI2, IncFII, ColpV2 (total 10); others? Kentucky (16/45) and not conforming with any known type (n=8). Pending: HM and biocide tolerance genes/efflux system. Comparison with isolates of human origin?
COMPARE is a large EU project with the intention to speed up the detection of and response to disease outbreaks among humans and animals worldwide through the use of new genome technology. The accompanying figure represents genomic information as the pathogen-independent language across locations, sectors and time. http://www.compare-europe.eu/ Coordinator: Frank M. Aarestrup, Technical University of Denmark, National Food Institute, fmaa@food.dtu.dk, Tel: +45 35 88 62 81. Co-Coordinator: Marion Koopmans, Erasmus Medical Centre, Department of Viroscience, m.koopmans@erasmusmc.nl, Tel: +31 10 70 44 066
Whole Genome Sequencing Program (WGS): GenomeTrakr. State and Federal laboratory network collecting and sharing genomic data from foodborne pathogens. Distributed sequencing-based network. Partner with NIH. Open-access genomic reference database: http://www.ncbi.nlm.nih.gov/bioproject/183844 Can be used to find the contamination sources of current and future outbreaks. http://www.fda.gov/food/foodscienceresearch/wholegenomesequencingprogramwgs/default.htm#trakr
For more information: For information about joining the GenomeTrakr network as a sequencing lab, providing isolates to a current member lab for sequencing, or using the GenomeTrakr database as a research tool, please contact FDA at FoodWGS@fda.hhs.gov
With Additional Thanks. FDA: Steven Musser, Patrick McDermott, Ruth Timme, Marc Allard, Peter Evans, Eric Brown, Justin Payne, Charlie Wang, Rebecca Bell, Christine Keys, Errol Strain, Yan Luo, James Pettengill, Hugh Rand, Darcy Hanes, Gopal Gopinathrao, Chris Grim, Palmer Orlandi, David Melka, Cary Pirone Davies, Maria Hoffman, Eric Stevens, Andrea Ottesen, Tim McGrath, Don Burr, Jie Zheng, Cong Li, George Kastanis, Tim Muravunda, Shaohua Zhao. National Institutes of Health (NCBI): David Lipman, Jim Ostell, William Klimke, Martin Shumway, Richa Agarwala. State Health Labs: Bill Wolfgang (NY), Dave Boxrud (MN), Anita Wright (FL), Elizabeth Driebe (AZ), Angela Fritzinger (VA), Ailyn Perez-Osorio (WA). More to come. USDA: David Goldman, Kristin Holt. Illumina: Susan Knowles, Omayma Al-Awar, Kelly Hoon. And a Growing Cast of Colleagues.
INTERNAL FDA STAKEHOLDERS ORA OCC OFS OC OAO OFVM/SRSC CFSAN CDER CBER CDRH CVM NCTR FDA CHIEF SCIENTIST OIP OARSA SCIENCE BOARD IAS FFC FERN JIFSAN ADVISORY COMMITTEE IFSH MOFFETT CENTER CIO DAUPHIN ISLAND CFSAN-OCD CORE WESTERN CENTER FDLI GMA VaFSTF CDC FBI PULSENET-LATIN AM. AM. ACAD MICROBIOL ASM FSIS ARS UNIV VERMONT MINN DOH AZ DOH UNIV FL VA DOH WA DOH TX DOH NY AG LAB IRISH FSA NOVA SE UNIV IGS BALTIMORE INFORM MEETING HONGKONG POLYT U NIST ITALIAN FSA EFSA WHO-FOOD SAFETY DIR. WHO-GFN CDC-EU EMERGING INFECTIOUS DIS CONF DANISH TECH UNIV NM STATE UNIV/ NM DOH CARLOS MALBRAN INST/ARG ST COULD UNIV/FOOD MICRO SENASICA GMI NY DOH/WADSWORTH CENT UNIV HAMBURG CHINA CDC NESTLE FERA-UK MD DOH IAFP APHL AFDO BELGIUM VaTech US ARMY US NAVY MELBOURNE FSA (AUS) UNIV NEBRASKA PUBLIC HEALTH ENGLAND DHS DELMARVA TASKFORCE PENN STATE FOOD SCIENCE PROD MAN ASSOC ILLUMINA UNIV IRELAND/DUBLIN COLLEGE NCBI/NIH GSRS GLOBAL SUMMIT FAO/OIE PUBLIC HEALTH CANADA CFIA HEALTH CANADA INTL VTEC MEETING CPS-GA AOAC UNITED FRESH COLUMBIA HAWAII DOH CA DOH ALASKA DOH SOUTH DAK UNIV UNIV GA UNIV IOWA/DOH UNIV CHILE BRAZIL OSU VETNET TURKEY MEXICO IEH SILLAKER NEW ENG BIOLAB PACIFIC BIO CLC-BIO/QIAGEN CON-AGRA DUPONT AGILENT UC-DAVIS HARVARD MED INFORM MEETING THAILAND
FDA\CFSAN Validation Efforts: 1. Technical performance accuracy: Salmonella LT2 and Agona SL483. 2. Intralaboratory variation, sequencing platform: Salmonella Montevideo (180+ runs). 3. Interlaboratory variation: Salmonella Braenderup BAA-664 (PFGE control), ISO/CEN. 4. Bioinformatics pipeline software validation. Collaborations w/ Canada, CDC, NIH/NCBI.
Salmonella Braenderup Interlaboratory Study
Salmonella Braenderup
Salmonella Braenderup Environmental Samples from Florida Contract Lab FDA\CFSAN FDA\CFSAN (454)