Next-generation sequencing can replace Sanger sequencing in clinical diagnostics. Birgit Sikkema-Raddatz Department of Genetics, University Medical Center Groningen The Netherlands
Groningen GRONINGEN ~190,000 inhabitants ~50,000 students
University Medical Center Groningen. Department of Genetics: ~300 employees, many different nationalities; counseling, diagnostics, research, teaching. No. beds: 1,339. No. employees: >11,000. Section Genome Diagnostics: 10 laboratory specialists, 70 technicians; >20,000 genetic tests: karyotyping, SNP & oligo arrays, FISH, sequencing analysis.
NGS into diagnostics Genome Diagnostics Richard Sinke Eddy de Boer Krista van Dijk-Bos Jan Jongbloed Yvonne Vos Eva van den Berg-de Ruiter Trijnie Dijkhuizen Annemieke van der Hout Henny Lemmink Renee Niessen Jos Dijkhuis Annelies ten Berge Margaret Burton Martine Meems-Veldhuis Inge Mulder Arjen Scheper Martijn Viel Cisca Wijmenga Clinical Genetics Rolf Sijmons Tom de Koning Conny van Ravenswaaij Peter van Tintelen Joke Verhey Jan Oosterwijk Corien Verschuuren-Bemelmans Irene van Langen Beike Leegte Genomics coordination center Morris Swertz Freerk van Dijk Pieter Neerincx Martijn Dijkstra Rowida Almomani Lennart Johansson Patrick Deelen
Long list of candidate genes. Many diseases are heterogeneous. Large collection of patient samples. Department of Genetics. Approach: Sanger sequencing?
Cardiomyopathies: normal heart; DCM: dilated cardiomyopathy; HCM: hypertrophic cardiomyopathy; ACM: arrhythmogenic cardiomyopathy. Wilde & Behr (2013) Nat. Rev. Cardiol. 10(10):571-83
Cardiomyopathies Number of genes reported Paul van der Zwaag, thesis, 2012
Cardiomyopathies Genetic overlap Up to 50 genes involved Up to 6 genes in current Diagnostics
Sanger sequencing: data analyses. Sanger sequencing: analysis of 1 gene. Example: Arg294Stop (CGA>TGA). A growing number of genes needs to be tested for a particular disease.
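The Arg294Stop example can be checked against the standard genetic code: CGA encodes arginine, and a single C>T change in the first base turns it into the stop codon TGA. A minimal Python sketch, assuming a deliberately partial codon table and a hypothetical helper name `describe_substitution` (neither is from the slides):

```python
# Partial standard genetic code: only the two codons needed for this example.
CODON_TABLE = {
    "CGA": "Arg",
    "TGA": "Stop",
}


def describe_substitution(ref_codon: str, alt_codon: str, position: int) -> str:
    """Return a protein-level description such as 'Arg294Stop'."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    return f"{ref_aa}{position}{alt_aa}"


print(describe_substitution("CGA", "TGA", 294))  # Arg294Stop
```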
Novel strategies: Candidate Gene Screening; Next Generation Sequencing. Aim: apply one comprehensive test. Design and implement various targeted next-generation sequencing (NGS) gene panels.
Novel strategies What kind of sequencing machine? Candidate Gene Screening Next Generation Sequencing Which technique?
Exome Sequencing (Agilent). Exome library: fragments of 120 bp labeled with biotin.
Exome vs. Targeted. Exome sequencing: high number of variants → clinical interpretation; insufficient coverage → missing mutations. - Coverage performance in exome data varied significantly between exons and was (for some exons) insufficient to rely on exome sequencing only. - Insufficient coverage may result in missing clinically relevant mutations. Targeted resequencing = capturing of exons of certain genes.
Targeted NGS: what kind of gene panels should we construct to replace Sanger sequencing? Based on the indication: cardiomyopathies; hereditary cancer (breast cancer, Lynch syndrome); epilepsy; neurological/movement disorders; skin disorders; (mental retardation)
Cardio: Genes enriched ABCC9, ACTC1, ACTN2, ANKRD1, BAG3, CALR3, CAV3, CRYAB, CSRP3/MLP, DES, DMD, DSC2, DSG2, DSP, DTNA, EMD, EYA4, GATAD1, GLA, JPH2, JUP, LAMA4, LAMP2, LMNA, MYBPC3, MYH6, MYH7, MYL2, MYL3, MYPN, MYOZ1, MYOZ2, NEXN, PKP2, PLN, PRKAG2, PSEN1, PSEN2, RBM20, RYR2, SCN5A, SGCD, TAZ, TBX20, TCAP, TMEM43, TNNC1, TNNI3, TNNT2, TPM1, TTN, TXNRD2, VCL, ZASP/LDB3
Cardio: validation. Validation criteria: coverage minimum 30x for each nucleotide; compared to Sanger: specificity 100%, sensitivity 98%. MiSeq capacity: 1 channel, 5 million reads, read length 150 bp: 5,000,000 x 150 bp = 750,000,000 bp. Paired-end: 750,000,000 x 2 = 1,500,000,000 bp. Accuracy 75%: 75% x 1,500,000,000 = 1,125,000,000 bp. Size of Cardio Custom panel: 320,000 bp; 1,125,000,000 bp / 320,000 bp = 3,515x. 12 patients multiplexed: 3,515 / 12 = 292. Average coverage: 292x.
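The capacity estimate above is plain arithmetic and can be reproduced in a few lines. The sketch below uses the slide's figures; the function and parameter names are illustrative assumptions, and the 75% factor is treated simply as the usable fraction of sequenced bases.

```python
def expected_coverage(
    reads: int,
    read_length_bp: int,
    paired_end: bool,
    usable_fraction: float,
    target_size_bp: int,
    samples: int,
) -> float:
    """Expected average coverage per sample for one sequencing run."""
    total_bases = reads * read_length_bp * (2 if paired_end else 1)
    usable_bases = total_bases * usable_fraction
    coverage_per_run = usable_bases / target_size_bp
    return coverage_per_run / samples


# Figures from the slide: 5 million reads, 150 bp, paired-end,
# 75% usable, 320,000 bp panel, 12 patients multiplexed.
cov = expected_coverage(5_000_000, 150, True, 0.75, 320_000, 12)
print(f"{cov:.1f}x")  # 293.0x; the slide truncates to 292x
```

Changing `samples` to 16 shows how multiplexing more patients trades off against average coverage.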
Results 1. Technical validation: Coverage Specificity/ Sensitivity compared to Sanger Sequencing 2. Clinical validation: Diagnostic yield
Coverage, cardio panel: 48 genes, 1,134 targets. Coverage >30x: 99% of bases (<30x: 4,398 bp out of 323,651 bp)
Results - Coverage gaps. Coverage: on average 250x (151 bp paired-end reads). 29 of 1,134 targets with insufficient coverage in at least one sample; 15 exons with insufficient coverage.
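The coverage figures above can be derived from per-base depth values. The following sketch is an illustration only: it assumes depths are already available as lists of integers per target, uses "at least 30x" as the pass criterion, and the target names are made up.

```python
from typing import Dict, List

MIN_DEPTH = 30  # validation criterion from the slides


def fraction_covered(depths: List[int], min_depth: int = MIN_DEPTH) -> float:
    """Fraction of bases with depth at or above min_depth."""
    if not depths:
        return 0.0
    return sum(d >= min_depth for d in depths) / len(depths)


def targets_with_gaps(
    per_target: Dict[str, List[int]], min_depth: int = MIN_DEPTH
) -> List[str]:
    """Names of targets containing at least one base below min_depth."""
    return sorted(
        name
        for name, depths in per_target.items()
        if any(d < min_depth for d in depths)
    )


# Hypothetical toy data: two targets, one with a low-coverage base.
toy = {"target_A": [120, 95, 210], "target_B": [45, 12, 60]}
print(targets_with_gaps(toy))  # ['target_B']

# The slide's own numbers: 4,398 of 323,651 bases fall below 30x.
print(f"{1 - 4398 / 323651:.1%}")  # 98.6%
```

Targets returned by `targets_with_gaps` are the ones that would need Sanger sequencing in parallel.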
Cardio: Sensitivity/ specificity
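Sensitivity and specificity relative to Sanger sequencing follow from comparing the two call sets over the positions that both methods tested. A minimal sketch, assuming variants and positions are represented as plain string identifiers (the identifiers and counts below are invented for illustration, not results from the validation):

```python
from typing import Set, Tuple


def confusion_counts(
    ngs_calls: Set[str], sanger_calls: Set[str], positions_tested: Set[str]
) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN), treating Sanger calls as the truth set."""
    tp = len(ngs_calls & sanger_calls)
    fp = len(ngs_calls - sanger_calls)
    fn = len(sanger_calls - ngs_calls)
    tn = len(positions_tested - ngs_calls - sanger_calls)
    return tp, fp, fn, tn


def sensitivity(tp: int, fn: int) -> float:
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    return tn / (tn + fp)


# Hypothetical example: 10 positions, 3 true variants, all found by NGS.
positions = {f"pos{i}" for i in range(1, 11)}
sanger = {"pos1", "pos2", "pos3"}
ngs = {"pos1", "pos2", "pos3"}
tp, fp, fn, tn = confusion_counts(ngs, sanger, positions)
print(sensitivity(tp, fn), specificity(tn, fp))  # 1.0 1.0
```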
Conclusions: Resequencing of gene panels on the MiSeq can be used in routine diagnostics. 99% of all bases covered at >30x. No false positives/negatives. 12-16 patients can be multiplexed. Average coverage: 250x. ~15 exons require Sanger sequencing in parallel.
Analysis - Workflow FASTQ-file FASTA-file VCF-file
Interpretation tree Cartagenia Filtering: - quality - 1000 genomes - GoNL - ESP - SNP database - managed variant lists
Interpretation tree Cartagenia. Filtering: quality, 1000 Genomes, GoNL, ESP, SNP database, managed variant lists. Result: 5-10 variants per patient.
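The filtering cascade can be pictured as a sequence of exclusion steps. The sketch below is not Cartagenia's implementation or API; it is a plain-Python illustration in which the variant fields, the thresholds (`min_quality`, `max_pop_freq`) and the variant identifiers are all assumptions.

```python
from typing import Dict, FrozenSet, Iterable, List

# Population databases named on the slide.
POPULATION_SOURCES = ("1000G", "GoNL", "ESP")


def filter_variants(
    variants: Iterable[Dict],
    min_quality: float = 30.0,      # hypothetical quality cut-off
    max_pop_freq: float = 0.01,     # hypothetical frequency cut-off
    known_benign: FrozenSet[str] = frozenset(),  # stand-in for a managed variant list
) -> List[Dict]:
    """Keep variants that pass quality, population-frequency and list filters."""
    kept = []
    for v in variants:
        if v["quality"] < min_quality:
            continue  # low-quality call
        freqs = v.get("pop_freq", {})
        if any(freqs.get(src, 0.0) > max_pop_freq for src in POPULATION_SOURCES):
            continue  # too common in at least one population database
        if v["id"] in known_benign:
            continue  # already classified as benign on a managed list
        kept.append(v)
    return kept


# Hypothetical toy variants.
toy = [
    {"id": "var1", "quality": 99.0, "pop_freq": {"GoNL": 0.20}},
    {"id": "var2", "quality": 12.0, "pop_freq": {}},
    {"id": "var3", "quality": 80.0, "pop_freq": {"1000G": 0.0005}},
    {"id": "var4", "quality": 75.0, "pop_freq": {}},
]
remaining = filter_variants(toy, known_benign=frozenset({"var4"}))
print([v["id"] for v in remaining])  # ['var3']
```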
Transfer to results form: previous classification; present in HGMD; relevant isoforms; allele frequency; population frequency (1000 Genomes, GoNL); conservation; splicing effects; prediction programs: Alamut (PhyloP score, MutationTaster, PolyPhen, SIFT); conclusion. Category: Benign, Likely Benign, VOUS, Likely Pathogenic, Pathogenic
Cardio-panel v2; 55 genes. Clinical yield. Since September 2012 in routine diagnostics: >800 patients received; >450 finished reports; >1 MiSeq run per week. Diagnostic yield in 390 patients: 43 pathogenic (11%); 134 likely pathogenic (34%); 213 no mutations/potentially benign (55%). Note: ~15% carry more than one pathogenic/likely pathogenic (P/LP) variant.
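The yield percentages follow directly from the patient counts on the slide. A short sketch (the function name is an illustrative choice):

```python
from typing import Dict


def yield_percentages(counts: Dict[str, int]) -> Dict[str, int]:
    """Convert category counts to whole-number percentages of the total."""
    total = sum(counts.values())
    return {category: round(100 * n / total) for category, n in counts.items()}


# Counts from the slide (390 patients in total).
counts = {
    "pathogenic": 43,
    "likely pathogenic": 134,
    "no mutation / potentially benign": 213,
}
print(sum(counts.values()))       # 390
print(yield_percentages(counts))
# {'pathogenic': 11, 'likely pathogenic': 34, 'no mutation / potentially benign': 55}
```

Pathogenic and likely pathogenic findings together account for 177 of 390 patients, roughly 45%.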
Challenges Data sharing! Cartagenia Users Dutch UMCs Existing LOVDs Other LSDBs MMDB ClinVar
Targeted NGS panels
Panel      No. of genes   Coverage >30x for each base (%)   No. of patients analyzed
Cardio     55             99.6                              >800
Onco       73             99.5                              120
Movement   88             99.37                             28
Skin       63             in progress                       not yet
Epilepsy   147            99.63                             not yet
Neuro      under construction
Liver      under construction
Onco; genes enriched: BRCA1, BRCA2, PTEN, NF1, CDK4, MUTYH, APC, MSH2, MSH6, MLH1, PMS2, CDH1, STK11, SDHB, RET, SDHD, WT1, SDHC, MEN1, SDHA, FLCN, VHL, NF2, PTCH, FH, BMPR1A, SMAD4, CHEK2, RAD51C, RAD51D, BRIP1, XRCC2, BARD1, HOXB13, KLLN, MITF, ENG, AXIN2, BMP4, TMEM127, CDC73, AIP, CDKN2B, CDKN2C, CDKN1A, CDKN1B, SDHAF2, MAX, PHOX2B, TERT, RUNX1, CEBPA, GATA2, PTCH2, MET, SUFU, TP53, CDKN2A, BAP1, PALB2, DICER1, SMARCB1, SMARCA4, BUB1B, PALLD, EGFR, PDGFRA, KIT, PRKAR1A, ATM, CEP57
Targeted NGS; onco: 3 virtual sub-panels based on preventive options. Class 1 (n = 25/70), e.g. BRCA1, MLH1, RET, ...: preventive options available for the frequently associated tumor types; following national/international guidelines. Class 2 (n = 32/70), e.g. RAD51C, MAX, ALK, ...: preventive options available for the frequently associated tumor types; no official guidelines yet. Class 3 (n = 13/70), e.g. TP53, KIT, BAP1, ...: no preventive options available for frequently associated tumor types (e.g. pancreatic cancer, sarcoma).
Targeted NGS; onco: 3 virtual sub-panels based on preventive options (Class 1, Class 2, Class 3). Most patients choose: [class selection garbled in source]
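A virtual sub-panel means that all genes are sequenced together, but only variants in the classes the patient opted for are analysed and reported. A minimal sketch of that idea, using only the example genes named on the slide; the data layout and function name are assumptions, and the full class assignment of all 70 genes is not reproduced here.

```python
from typing import Dict, Iterable, List, Set

# Example genes per class as given on the slide (not the complete assignment).
GENE_CLASS: Dict[str, int] = {
    "BRCA1": 1, "MLH1": 1, "RET": 1,
    "RAD51C": 2, "MAX": 2, "ALK": 2,
    "TP53": 3, "KIT": 3, "BAP1": 3,
}


def variants_to_report(
    variants: Iterable[Dict], chosen_classes: Set[int]
) -> List[Dict]:
    """Keep only variants in genes whose class the patient has chosen."""
    return [v for v in variants if GENE_CLASS.get(v["gene"]) in chosen_classes]


# Hypothetical variants found in one patient.
found = [
    {"gene": "BRCA1", "id": "var_a"},
    {"gene": "MAX", "id": "var_b"},
    {"gene": "TP53", "id": "var_c"},
]
print([v["id"] for v in variants_to_report(found, {1})])     # ['var_a']
print([v["id"] for v in variants_to_report(found, {1, 2})])  # ['var_a', 'var_b']
```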
Detection of exon deletions/duplications. Average coverage per target: one series of 12 samples; another series of 12 samples; targets from the X chromosome. Normalisation: first per sample, then per target. Calculation of the variation coefficient.
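The two-step normalisation and the variation coefficient can be sketched as follows. This is an illustration under stated assumptions, not the laboratory's pipeline: normalisation uses simple means, the thresholds (`max_cv`, `del_below`, `dup_above`) are hypothetical, every target is assumed to have non-zero coverage, and X-chromosome targets (whose expected ratio depends on sex) are not handled.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def normalise(coverage: Dict[str, List[float]]):
    """coverage maps sample name -> average coverage per target (same order)."""
    # Step 1: per sample, divide by the sample's mean to remove depth differences.
    per_sample = {s: [c / mean(cov) for c in cov] for s, cov in coverage.items()}
    n_targets = len(next(iter(per_sample.values())))
    columns = [[per_sample[s][t] for s in per_sample] for t in range(n_targets)]
    # Step 2: per target, divide by the mean over all samples in the series.
    target_means = [mean(col) for col in columns]
    ratios = {
        s: [vals[t] / target_means[t] for t in range(n_targets)]
        for s, vals in per_sample.items()
    }
    # Variation coefficient per target: noisy targets are unreliable for calling.
    cv = [stdev(col) / m for col, m in zip(columns, target_means)]
    return ratios, cv


def call_cnvs(ratios, cv, max_cv=0.25, del_below=0.7, dup_above=1.3):
    """Return (sample, target index, call) for targets that pass the CV threshold."""
    calls: List[Tuple[str, int, str]] = []
    for sample, vals in ratios.items():
        for t, r in enumerate(vals):
            if cv[t] > max_cv:
                continue
            if r < del_below:
                calls.append((sample, t, "deletion"))
            elif r > dup_above:
                calls.append((sample, t, "duplication"))
    return calls


# Hypothetical series: S6 has half the expected coverage on target 1.
toy = {
    "S1": [100, 200, 300], "S2": [120, 240, 360], "S3": [80, 160, 240],
    "S4": [110, 220, 330], "S5": [90, 180, 270], "S6": [100, 100, 300],
}
print(call_cnvs(*normalise(toy)))  # [('S6', 1, 'deletion')]
```

In a small series a real deletion inflates the variation coefficient of its own target, so the CV threshold has to be chosen with the series size in mind.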
Detection of exon deletions/duplications. Validation on 120 samples, including 10 known deletions/duplications. On average 907 of the 930 targets of the onco panel pass the thresholds. No false negative results.
Conclusion: Targeted NGS can replace Sanger sequencing. Sanger sequencing is obsolete at a coverage of >30x per nucleotide; it is run in parallel with NGS where coverage is structurally <20x. Exon deletions/duplications can be detected.
Improvements: reducing turn-around times. 1. Further robotization of sample processing. 2. Optimizing the pipeline: filtering parameters; generation of reporting files. 3. Automation of the interpretation process. How to deal with homologous/repetitive sequences and pseudogenes?
Targeted NGS, what else... (1) Non-invasive prenatal diagnostics. (2) Hematology: amplicon panels for mutation detection; TLA panels for structural abnormalities. (3) Neonatology: 72 h whole genome sequencing.