Innovations in Molecular Epidemiology
Molecular Epidemiology
- Measure current rates of active transmission
- Determine whether recurrent tuberculosis is attributable to exogenous reinfection
- Determine whether all strains in a population exhibit similar epidemiological characteristics
- Understand transmission dynamics
- Identify outbreaks or extensive transmission among apparently sporadic, epidemiologically unrelated cases
Which approach to choose?
- Stability of the biomarker: the observed rate of polymorphism
- The genetic diversity of strains in the population
The rate of change of the chosen biomarker must be fast enough to distinguish epidemiologically unrelated strains, yet sufficiently slow and stable to reliably link related cases.
Genetic Variation Generation Strategies in Bacteria
Can be divided into just three categories:
1. Small local changes in the nucleotide sequence of the genome (e.g. single nucleotide polymorphisms, or SNPs)
2. Intragenomic reshuffling and deletion of segments of genomic sequence
3. Acquisition of DNA sequences from another organism
Different levels of resolution are obtained from different markers
From low resolution to high resolution: SNPs, LSPs, Spoligo, VNTR/MIRU, IS6110
1. SNPs
- Mutational generation of a completely novel biological function is unlikely: none described in M. tuberculosis
- Stepwise improvement of already available biological functions: a study of 26 structural genes from 842 strains of M. tuberculosis showed that 95% of mutations were linked to antibiotic resistance (rpoB, gyrA)
- Silent (synonymous) SNPs that do not affect biological function: less subject to selective pressure; used for strain typing and speciation
Strain typing
Broad evolutionary scenario for M. tuberculosis complex organisms, which are divided into three Principal Genetic Groups (PGG1-3) by the polymorphisms at KatG codon 463 (Leu) and GyrA codon 95 (Thr). Taken from Sreevatsan et al., 1997.
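The two-codon grouping can be sketched as a simple lookup. This is only an illustration: the slide names Leu and Thr, while the Arg and Ser alleles used here for PGG2 and PGG3 are taken from the published Sreevatsan et al. scheme, and the function name and input format are hypothetical.

```python
# Sketch of principal genetic group (PGG) assignment from two codon calls.
# Assumption: PGG1 = katG463 Leu + gyrA95 Thr, PGG2 = Arg + Thr,
# PGG3 = Arg + Ser (per the published scheme, not stated in full on the slide).
PGG_TABLE = {
    ("Leu", "Thr"): "PGG1",
    ("Arg", "Thr"): "PGG2",
    ("Arg", "Ser"): "PGG3",
}


def assign_pgg(katg_463, gyra_95):
    """Return the PGG for the amino acids at katG codon 463 and gyrA codon 95.

    Any combination outside the three defined groups is reported as
    'unassigned' rather than guessed.
    """
    return PGG_TABLE.get((katg_463, gyra_95), "unassigned")


if __name__ == "__main__":
    for calls in [("Leu", "Thr"), ("Arg", "Thr"), ("Arg", "Ser"), ("Leu", "Ser")]:
        print(calls, "->", assign_pgg(*calls))
```

A lookup table keeps the rule explicit and makes unexpected allele combinations visible instead of silently forcing them into a group.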
Speciation
SNPs used:
- rpoB ID short sequence assay
- gyrB SNP position 675
- gyrB SNP position 756
- gyrB SNP position 1410
- gyrB SNP position 1450
2. Intragenomic Reshuffling of Segments of DNA
Occurs via recombination of related sequences and can lead to:
- Novel combination of capacities by the fusion of different functional domains
- Reassortment of expression control signals with different reading frames serving in protein production
- Duplication of DNA segments serving as substrates for evolution
- Deletion of DNA segments to remove inessential sequences
Deletion analysis
Taken from Brosch et al., 2002.
Large Sequence Polymorphisms
Gagneux: a global phylogeny of the M. tuberculosis complex, based on 89 concatenated gene sequences in 108 strains. Coloured branches indicate the main strain lineages (from Hershberg et al., PLoS Biology 2008).
Frequency of subtypes based on 40,000 spoligotypes in SpolDB4: Beijing 11.3%, EAI 8.8%, H 11.2%, LAM 13.4%, T 25%, X 9%
Large Sequence Polymorphisms
Global phylogeography of M. tuberculosis. Taken from Gagneux S and Small PM, Lancet Infect Dis 7:328-337, 2007.
Spoligotyping
M. tuberculosis Spoligotyping
(Figure: direct repeat region with spacer regions 1-43 between direct repeats; PCR primer and extension primer positions shown.)
Steps: PCR amplification; primer hybridization and extension; detection.
- Spacer 38 (+): extension product detected
- Spacer 03 (-): extension product not detected, unextended primer only
Developed by SEQUENOM, slide courtesy of Dr Christiane Honisch.
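Once the presence or absence of each of the 43 spacers is known, the pattern is commonly summarised as a 15-digit octal code: the first 42 spacers are read in groups of three, and spacer 43 gives the final digit. The sketch below assumes that convention; the function name and the example isolate are hypothetical.

```python
# Sketch: convert a 43-spacer presence/absence pattern to a 15-digit octal code.
# Assumption: spacers are numbered 1-43 and supplied as the set of spacers
# for which an extension product (+) was detected.

def spoligotype_to_octal(spacers_present):
    """Return the 15-digit octal code for a set of detected spacer numbers."""
    bits = [1 if i in spacers_present else 0 for i in range(1, 44)]
    digits = []
    # Spacers 1-42: fourteen groups of three bits, each read as one octal digit.
    for start in range(0, 42, 3):
        a, b, c = bits[start:start + 3]
        digits.append(str(4 * a + 2 * b + c))
    # Spacer 43 on its own gives the last digit (0 or 1).
    digits.append(str(bits[42]))
    return "".join(digits)


if __name__ == "__main__":
    # Hypothetical isolate with only spacers 35-43 detected.
    example = set(range(35, 44))
    print(spoligotype_to_octal(example))  # 000000000003771
```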
VNTR/MIRU
Variable number tandem repeat (VNTR) variation is thought to be caused by slipped-strand mispairing. The peculiar tertiary structure of repetitive DNA allows mismatching of neighbouring repeats and, depending on the strand orientation, repeats can be inserted or deleted during replication. Variation in repeat numbers and sequence degeneracy can be explained by DNA recombination between multiple loci consisting of homologous repeat motifs.
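In practice the number of repeats at a locus is usually inferred from the size of the PCR product spanning it. The sketch below shows that arithmetic only; the flank and repeat lengths are hypothetical values chosen for illustration, not measurements for any real locus.

```python
# Sketch: estimate tandem repeat copy number from a PCR amplicon size.
# Assumption: amplicon = flanking sequence + (copies x repeat unit length).

def repeat_copies(amplicon_bp, flank_bp, repeat_bp):
    """Return the nearest whole number of repeat copies in the amplicon."""
    if repeat_bp <= 0:
        raise ValueError("repeat_bp must be positive")
    if amplicon_bp < flank_bp:
        raise ValueError("amplicon is shorter than the flanking sequence")
    # Rounding absorbs small sizing error and partial terminal repeats.
    return round((amplicon_bp - flank_bp) / repeat_bp)


if __name__ == "__main__":
    # Hypothetical locus: 150 bp of flanking sequence, 53 bp repeat unit.
    for size in (150, 203, 309):
        print(size, "bp ->", repeat_copies(size, 150, 53), "copies")
```

Rounding to the nearest integer is a deliberate simplification; loci with degenerate or truncated terminal repeats would need a locus-specific table instead.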
Genesis of repeats in M. tuberculosis
- Requires pre-existing small repeats (seeds)
- Short stem-loop structure aids slipped-strand mispairing
(Figure: stem-loop formed by paired G and C bases.)
VNTR sequence alignment
Position ruler: 10 20 30 40 50 60
CDC1551 MIRU 10       -------ATGGCGCCGCTCCTCCTCATCGCTGCGCTCTGCATCGTCGCCGGCGGTAGTTA------
MIRU 10 repeat 2      -------...C.------
MIRU 10 repeat 3      -------...C.------
MIRU 10 repeat 4      -------...C.------
MIRU 10 repeat 5      -------...CGG.GGTCAT---
CDC1551 MIRU 23       ---------T...T...A...CG.C.C.------
MIRU 23 repeat 2      -------TCT...T...A...CG.C.C.------
MIRU 23 repeat 3      -------TCT...G...T...A...CG.C.C.------
MIRU 23 repeat 4      -------TCT...G...T...A...CG.C.C.------
MIRU 23 repeat 5?     -------TCT...-...T...A...CGCA.GGTCAGCG
CDC1551 MIRU 26       --------AA...--GAGGTCA----
MIRU 26 repeat 2      ---------A...--GAGGTCA----
MIRU 26 repeat 3      ---------A...--GAGGTCA----
MIRU 26 repeat 4      ---------A...--GAGGTCA----
MIRU 26 repeat 5?     ----------...--GAGGTCACAGA
CDC1551 MIRU 16       ------------...T...T...CGGT.C.------
MIRU 16 repeat 2      -------CGA...T...T...CGGT.C.------
MIRU 16 repeat 3?     -------CGA...T...T...CGGC.C.CGTGG-
CDC1551 ETRE          -----------------...CC.ACC.------
ETRE repeat 2         -------TCT...CC.ACC.------
ETRE repeat 3         -------TCT...CG.AGC.GCG---
CDC1551 MIRU 2        --------A...T...G...T...CG...C.------
MIRU 2 repeat 2       -------TA...T...G...CG...C.------
MIRU 2 repeat 3       -------TA...TC..C.GCAAG..G.AGG...CC.CA.CT.ATGT..G..C.ACT---
CDC1551 MIRU 39       ----------...T...CGG.CCG------
MIRU 39 repeat 2      -------T...T...CGG..C.------
CDC1551 QUB 5         -----------------------A.G..AGATT-.A...CGG..C.------
QUB 5 repeat 2 aga    -------C...C...TT...CGG..C.CTGGC-
QUB 5 repeat 3        -------C...C...TT...CGG..C.------
QUB 5 repeat 4        -------C...C...TT...CGG..C.ATCG--
CDC1551 MIRU 24 repe  TGCTTCG...AG...C------------
Transposable Elements: IS6110
Transposase enzyme generated; transposon cut out and inserted in a new location (hotspot), resulting in a disrupted gene.
Transposon-mediated DNA rearrangements are often regarded as a major evolutionary driving force; however, repetitive DNA sequences also play a role. In M. tuberculosis the role of repetitive sequences may be greater.
IS6110 typing
(Figure: genome map, 0 bp to 4.4 Mbp.)
- IS6110 is 1355 base pairs long and contains an ORF encoding a transposase
- PvuII restriction is followed by Southern blotting and hybridisation with an IS6110 probe
- 0-20 copies in M. tuberculosis
- 8% of UK isolates, 40% of India isolates and most M. bovis isolates possess only a single copy of IS6110
The drawbacks of IS6110 RFLP are widely reported and include:
- The need for extended culturing
- Time-consuming: can take weeks from start to completion
- Lower throughput
- Costly methodology
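The restriction step can be illustrated with a minimal in silico digest. PvuII recognises CAGCTG and cuts between CAG and CTG, leaving blunt ends. The sketch below treats the input as a linear molecule (an assumption, since the genome itself is circular) and uses a made-up toy sequence; it does not model the Southern blot or probe hybridisation.

```python
# Sketch: fragment lengths from a PvuII digest of a linear DNA sequence.
# PvuII site: CAG^CTG (blunt cut after the third base of the site).
PVUII_SITE = "CAGCTG"
CUT_OFFSET = 3


def pvuii_fragments(seq):
    """Return fragment lengths, in order, after cutting at every PvuII site."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(PVUII_SITE)
    while pos != -1:
        cuts.append(pos + CUT_OFFSET)
        pos = seq.find(PVUII_SITE, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


if __name__ == "__main__":
    toy = "AAAACAGCTGTTTTTTCAGCTGCC"  # hypothetical sequence with two sites
    print(pvuii_fragments(toy))  # [7, 12, 5]
```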
RFLP Exact sizing of fragments is also an issue, especially when considering interlaboratory comparison of data.
IS6110 FAFLP of M. tuberculosis
(Figure labels: genomic DNA; IS6110; frequent cutter (TaqI); IS6110 primer; TaqI adaptor-specific primer; IS6110 blue-labelled fragments.)
In silico (H37Rv)
Peak sizes shown (bp): 75.24, 84.88, 86.49, 97.95, 126.12, 141.27, 163.36, 173.74, 190.60, 198.69, 210.58
Expected fragments in this window (bp): 75, 85, 87, 100, 127, 141, 163, 172, 190, 211
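Comparing sized peaks against an in silico prediction can be sketched as nearest-neighbour matching within a tolerance. The example assumes the decimal values on this slide are observed peak sizes and uses an arbitrary 2.5 bp tolerance; both are assumptions made for illustration, and the function name is hypothetical.

```python
# Sketch: match observed fragment sizes to expected in silico fragment sizes.
# Assumption: a peak matches the nearest expected fragment if it lies within
# `tol` bp of it. One-to-one matching is not enforced in this simple version.

def match_peaks(observed, expected, tol=2.5):
    """Return (matched pairs, unmatched observed peaks)."""
    matched, unmatched = [], []
    for peak in observed:
        nearest = min(expected, key=lambda e: abs(e - peak))
        if abs(nearest - peak) <= tol:
            matched.append((peak, nearest))
        else:
            unmatched.append(peak)
    return matched, unmatched


if __name__ == "__main__":
    observed = [75.24, 84.88, 86.49, 97.95, 126.12, 141.27,
                163.36, 173.74, 190.60, 198.69, 210.58]
    expected = [75, 85, 87, 100, 127, 141, 163, 172, 190, 211]
    matched, unmatched = match_peaks(observed, expected)
    for peak, frag in matched:
        print(f"{peak:7.2f} bp -> expected {frag} bp")
    print("unmatched:", unmatched)
```

With these inputs every expected fragment finds a peak and only the 198.69 bp peak is left without a predicted counterpart; a tighter tolerance would also reject the 97.95 bp peak against the expected 100 bp fragment.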
Example of strain comparison
(Figure: FAFLP profiles labelled Sunderland, Sunderland, Preston, Preston.)
The highlighted area corresponds to the IS6110 element inserted in the DR region: if spacer 24 is present, the FAFLP profile will contain the 318 bp fragment.
Four-colour IS6110-based FAFLP
(Figure label: DR-based fragment.)
T lineage defined
(Figure labels: H, PGG2; X, PGG2; T, PGG3; T2Uganda, PGG2; LAM, PGG2; S, PGG2; Beijing, PGG1; PGG2 outliers sharing no fragments.)
Evolutionary timeline
Mapped using multiple markers
Acknowledgements
Sonia Borrell, Nikki Thorne, John Magee, Jason Evans, Saheer Gharbia, Chloe Mortimer, Christiane Honisch
HPA TB Diagnostics and Molecular Epidemiology Group