Whole genome sequencing of foodborne pathogens: experiences from the Reference Laboratory. Kathie Grant Gastrointestinal Bacteria Reference Unit



Similar documents
Molecular typing of VTEC: from PFGE to NGS-based phylogeny

Provisioning robust automated analytical pipelines for whole genome-based public health microbiological typing

A Salmonella Geno-Serotyping Array (SGSA) for the Rapid Classification of Serovars

Use of Whole Genome Sequencing (WGS) of food-borne pathogens for public health protection

Typing in the NGS era: The way forward!

Future Perspectives in Public Health Microbiology

Identification and Characterization of Foodborne Pathogens by Whole Genome Sequencing: A Shift in Paradigm

Laboratory Protocols Level 2 Training Course Serotypning of Salmonella enterica O and H antigen.

Databases and platforms for data analysis from NGS of MTB

EU Reference Laboratory for E. coli Department of Veterinary Public Health and Food Safety Unit of Foodborne Zoonoses Istituto Superiore di Sanità

Workshop on Methods for Isolation and Identification of Campylobacter spp. June 13-17, 2005

Use of Whole Genome. of food-borne pathogens for public health protection. Efsa Scientific Colloquium Summary Report June 2014, Parma, Italy

PreciseTM Whitepaper

SeqScape Software Version 2.5 Comprehensive Analysis Solution for Resequencing Applications

Automated Lab Management for Illumina SeqLab

Annual Report on Zoonoses in Denmark 2013

The National Antimicrobial Resistance Monitoring System (NARMS)

QUALITY AND SAFETY TESTING

UKB_WCSGAX: UK Biobank 500K Samples Genotyping Data Generation by the Affymetrix Research Services Laboratory. April, 2015

Quality Assurance and Validation of Next Generation Sequencing

Bioruptor NGS: Unbiased DNA shearing for Next-Generation Sequencing

Establishing a Whole Genome Sequence-Based national network for the detection and traceback of Foodborne Pathogens

Genomic Services and Development Unit User Manual

National Zoonoses Conference (NZC), O Reilly Hall, University College Dublin, Wednesday-8 th June, 2011.

Introduction To Real Time Quantitative PCR (qpcr)

Standard Op PulseNet QA/QC Manual. Standards

A Fast, Accurate, and Automated Workflow for Multi Locus Sequence Typing of Bacterial Isolates

Introduction to next-generation sequencing data

TaqMan Genotyper Software v1.0.1 TaqMan Genotyping Data Analysis Software

Network Activities PTs on VTEC detection and typing and training initiatives

Appendix 2 Molecular Biology Core Curriculum. Websites and Other Resources

AGILENT S BIOINFORMATICS ANALYSIS SOFTWARE

Development of a Salmonella source-attribution model for evaluating targets in the turkey meat production 1

Does Big Data offer Better Solutions for Microbial Food Safety and Quality?

Next generation DNA sequencing technologies. theory & prac-ce

Delivering the power of the world s most successful genomics platform

Data Analysis for Ion Torrent Sequencing

Automated Library Preparation for Next-Generation Sequencing

FIVS 316 BIOTECHNOLOGY & FORENSICS Syllabus - Lecture followed by Laboratory

Molecular Diagnostics in the Clinical Microbiology Laboratory

Rapid Aneuploidy and CNV Detection in Single Cells using the MiSeq System

Molecular and Cell Biology Laboratory (BIOL-UA 223) Instructor: Ignatius Tan Phone: Office: 764 Brown

Basic Course on Bioinformatics tools for Next Generation Sequencing data mining June, 2015 Istituto Superiore di Sanità, SIDBAE Training Room

Cloud Computing Solutions for Genomics Across Geographic, Institutional and Economic Barriers

TruSeq Custom Amplicon v1.5

Core Functions and Capabilities. Laboratory Services

Single-Cell Whole Genome Sequencing on the C1 System: a Performance Evaluation

IKDT Laboratory. IKDT as Service Lab (CRO) for Molecular Diagnostics

Identification of the VTEC serogroups mainly associated with human infections by conventional PCR amplification of O-associated genes

Originally published as:

Single-Cell DNA Sequencing with the C 1. Single-Cell Auto Prep System. Reveal hidden populations and genetic diversity within complex samples

Innovations in Molecular Epidemiology

ASSOCIATION OF PUBLIC HEALTH LABORATORIES

Multiplex your most important

Annual Report on Zoonoses in Denmark 2014

Multi-locus sequence typing (MLST) of C. jejuni infections in the United States Patrick Kwan, PhD

05/11/2014. Food Safety in PAHO. Enrique Perez Gutierrez, Department of Communicable Diseases and Health Analysis, PAHO/WHO

An example of bioinformatics application on plant breeding projects in Rijk Zwaan

Metagenomics revisits the one pathogen/one disease postulates and translate the One Health concept into action

Bacterial Next Generation Sequencing - nur mehr Daten oder auch mehr Wissen? Dag Harmsen Univ. Münster, Germany dharmsen@uni-muenster.

MiSeq: Imaging and Base Calling

Nazneen Aziz, PhD. Director, Molecular Medicine Transformation Program Office

Introduction to Proteomics 1.0

Version 5.0 Release Notes

Data deluge (and it s applications) Gianluigi Zanetti. Data deluge. (and its applications) Gianluigi Zanetti

Lecture 11 Data storage and LIMS solutions. Stéphane LE CROM

Molecular Approaches to Understanding Transmission and Source Attribution in Nontyphoidal Salmonella and Their Application in Africa

How many of you have checked out the web site on protein-dna interactions?

Data Processing of Nextera Mate Pair Reads on Illumina Sequencing Platforms

Replacing TaqMan SNP Genotyping Assays that Fail Applied Biosystems Manufacturing Quality Control. Begin

Biotracker TM A Laboratory Information Management System By Ocimum Biosolutions

Importance of Statistics in creating high dimensional data

Next Generation Sequencing in Public Health Laboratories Survey Results

Mehrwert von Next Generation Sequencing zur molekularen Charakterisierung von Meningokokken

Core Facility Genomics

Automated and Scalable Data Management System for Genome Sequencing Data

Molecular Assessment of Dried Blood Spot Quality during Development of a Novel Automated. Screening

Next Generation Sequencing: Technology, Mapping, and Analysis

Biorepository and Biobanking

CHAPTER 13. Quality Control/Quality Assurance

Mascot Integra: Data management for Proteomics ASMS 2004

G E N OM I C S S E RV I C ES

Focusing on results not data comprehensive data analysis for targeted next generation sequencing

UK Standards for Microbiology Investigations

A Primer of Genome Science THIRD

Sampling and the Health Protection Agency

Mass Storage Use Cases April 21, 2011

Cell Discovery 360: Explore more possibilities.

Sample storage survey

Bioinformatics Unit Department of Biological Services. Get to know us

Automation in genomics High-throughput management of biological samples and nucleic acids. Valentina Gualdi Operational Scientist PGP

SeqPig: simple and scalable scripting for large sequencing data sets in Hadoop

Specialty Lab Informatics and its role in a large academic medical center

Cloud BioLinux: Pre-configured and On-demand Bioinformatics Computing for the Genomics Community

General Services Administration Federal Supply Service Authorized Federal Supply Schedule Price List

Chapter 8: Recombinant DNA 2002 by W. H. Freeman and Company Chapter 8: Recombinant DNA 2002 by W. H. Freeman and Company

The CVN Development Programme a 4-month update

Forensic DNA Testing Terminology

Practical Solutions for Big Data Analytics

Transcription:

Whole genome sequencing of foodborne pathogens: experiences from the Reference Laboratory Kathie Grant Gastrointestinal Bacteria Reference Unit 16 th June 2014

Planning for Implementation of WGS 2011-2014 PHE investment: financial, laboratory, bioinformatics, staff, training Prioritise organisms for routine WGS Practical implementation 2

PHE WGS Sequencing Service Hardware Two HiSeq 2500 high-throughput sequencers Two MiSeq machines Capacity ~ 3,000 genomes per week

Infrastructure Data storage warehouse Generators & Coolers 4

Skills Developing a Bioinformatics Capability Files system: Machine Date Sample FastQ Analysis Results Automated workflows Ad hoc /development pipelines Research/non routine pipelines: Galaxy Authorisers Bioinformaticians Lab staff, researchers Users

Sample Workflow 6 Information Flow 1) Nucleic Acid sent to NGSS along with sample Info. 3) Batch of samples pre-processed (ROBOTICS) 5) Sample batch run on Machine Metrics imported in NGS LIMS QC metrics imported to NGS LIMS Bulk Data temp stored 2) Sample Info to NGS LIMS 4) NGS LIMS exports sample sheet for HiSeq 6) Bulk data processed 7) QC data stored 8) Sample Fastq s stored 9) Detailed sample Info linked with NGS data and metrics 10) Requestor accesses files, data metrics and results in web interface Data and Bioinformatics Workflow

Sequencing Service Validation reliable sample handling processes through the robotics reproducible high-quality data consistent linking of meta-data through the whole sample workflow. reliable capture of quality metrics into NGS LIMS ISO15189 Accreditation 7

8 Salmonella NGS at PHE Salmonella classification is complicated 20 th Century

Current GBRU Typing Methods for Salmonella Subspeciation Real time TaqMan PCR assays - target three different genes OmniLog ID System (Biolog) - phenotypic microarray Serotyping Agglutination with specific antisera against LPS & flagella (O & H antigens) - Slide agglutination - Microtitre plates - Dreyer s tubes 9 Serotyping

Current Sub-typing Methods for Salmonella - Phage typing - e.g. Typhimurium DT1, DT193 - Multi-locus Variable Number Tandem Repeat Analysis (MLVA) - e.g. 4-13-13-10-0211 - Pulsed-field gel electrophoresis (PFGE) - e.g. SNWPXB.0010 10 Sub-typing

Issues with existing Salmonella typing methods Turn around times too long Serotyping: Phage typing: Originally 25 days Reduced this year to 17 days Originally 20 days Reduced this year to 10 days PFGE: 4 days, VNTR: 2 days Biological Not a true classification compared with sequence based typing Safety problems Isolates identified local clinical lab as CL2 serovar Referred to SRS and handled at CL2 Identified by reference lab as CL3 Quality Typing methods can be difficult to standardise including existing molecular methods 11 Presentation title - edit in Header and Footer

Salmonella identified as a priority organism Use of whole genome sequencing to replace lengthy laboratory methods and improve safety, quality Serotying Phage typing PFGE MLVA Salmonella NGS Project 95% WGS provides opportunity for identification and typing using a single method WGS = MLST data + SNP detection + lots of other interesting data 12 WGS - 2013

Salmonella population structure is complicated 21 st Century Minimal spanning tree of MLST data for S. enterica subspecies enterica Each circle corresponds to a sequence type (ST) The size is proportional to the number of isolates ebgs are natural clusters of genetically related isolates Increasing distance equates to fewer shared alleles MLST STs correlate with serotypes Achtman et al., 2012 13 Salmonella NGS at PHE

90% Distribution of serotypes 14

Salmonella NGS Project - 2013 Validation set Common Serovar No of Isolates 1500 strains selected for sequencing 1000 common strains representative of 2012 (10%) - >50% Salmonella Enteritidis & Salmonella Typhimurium - different phage types Salmonella Enteritidis 364 Salmonella Typhimurium 337 Salmonella Infantis 36 Salmonella Typhi 36 Salmonella Java 33 Salmonella Paratyphi A 33 Salmonella Newport 32 Salmonella Virchow 31 Salmonella Kentucky 22 500 strains of less common serovars proportional representation of 2012 Salmonella Stanley 20 Salmonella Braenderup 19 Salmonella Montevideo 19 Salmonella Agona 18 1000 15 NGS - 2013

Back to the 20 th Century PHE in the 21 st Century 1 st phase validation Salmonella sample from 2012 Sequencing of 1500 representative Salmonella results compared with serotyping PHE Bioinformatics pipeline Quality trim KmerID to check purity Short read sequence typing to determine MLST MLST/EBG-serotype 16 Salmonella Presentation NGS title at - edit PHEin Header and Footer

Salmonella NGS project - workflow Innoculate broth culture (overnight growth) or use growth on slopes Automated Genomic Extraction QiaSymphony 96 well Measure DNA quantity & quality Glomax/Labchip Trouble shooting! 30ng/ ul 260/2 80260 /230 =1.8 Automated Library preparation (Nextera) Sequence on HiSeq2500 (Rapid run) Automated Bioinformatics Analysis pipeline development, analysis tools NGS project workflow

Results - MLST derivation WGS MLST derived grouping correlated with traditional serogroup > 94% for the common serovars (Common serovars make up to 90% of the workload) But lower correlation with rarer serovars Current MLST database Only 900 out of 2600 serotypes have been assigned MLST profiles - Mis-matches between serotypes and MLST serogroups MLST

Back to the 20 th Century PHE in the 21 st Century 2 nd phase validation Routine use of sequencing Sample received by reference lab Reported to customer MLST/serotype If the strain belongs to serotype in our Top 14 it goes for SNP analysis Reporting to customer 19 Salmonella NGS at PHE

Detection of Salmonella outbreaks At PHE, laboratory and epidemiological staff work closely together to detect and investigate outbreaks Currently, this is done on the basis of serotype, phage type, MLVA and PFGE - these techniques have varying resolution and molecular typing not performed on every isolate Use an exceedance above what we would expect to see as background, before an outbreak investigation is triggered The more common a serotype is, the harder it is to spot an outbreak 20 Salmonella WGS at PHE

Top 14 serotypes SNP typing IPython Notebook - bit.ly/1t2g5kl David Powell 21 Salmonella NGS at PHE

Top 14 serotypes SNP typing Challenges: Many EBGs Hundreds of strains a week Rapid, hands-off analysis Solution SNPdatabase (SNPdb): 30 mins - parallel db EBG 1 - Typhimurium 3-30 minutes Sample FASTQs (with ST) db db db db EBG 3 - Newport EBG 4 - Enteritidis EBG 11 Paratyphi A EBG 13 - Typhi 22 Presentation Salmonella WGS title -at edit PHE in Header and Footer

Uploading data into Short Read Archive NCBI BioProject accession: PRJNA248064 23 Salmonella WGS at PHE

Salmonella Mikawasima Outbreak WGS Analysis Dec 2013 Dec 2013 increase in Salmonella Mikawasima in England, Wales, Scotland Several different PFGE profiles but 2 predominant ones Sequenced 109 isolates England & Wales, 11 Scotland and included in analysis 38 sequenced in Denmark (SSI, DTU) 80 from 2013, 28 2012 44 isolates with OB PFGE profile clustered <10 SNPs (31 E, 10 D, 3 S) also 3 isolates with different PFGE profile 4 with this PFGE profile formed distinct cluster (<10SNPs) with isolate from 2009 6 isolates with 2 nd OB profile clustered with Scottish isolate with different profile Colours represent different PFGE profiles 24 International working and collaboration

Acknowledgements This presentation is the result of work across PHE particularly Genomic Services Unit, Bioinformatics Unit and the Gastrointestinal Bacteria Reference Unit GBRU Elizabeth de Pinna, Tansy Peters, Satheesh Nair, Tim Dallman, Phil Ashton and other staff in the lab Genomic Services Unit Cath Arnold and team Bioinformatics Unit Jonathon Green, Anthony Underwood, Rediat Tewolde 25 Salmonella WGS at PHE