Next Generation Sequencing Technology and applications 10/1/2015 Jeroen Van Houdt - Genomics Core - KU Leuven - UZ Leuven 1
Landmarks in DNA sequencing
1953 Discovery of DNA double helix structure
1977 A. Maxam and W. Gilbert, "DNA seq by chemical degradation"; F. Sanger, "DNA sequencing with chain-terminating inhibitors"
1984 DNA sequence of the Epstein-Barr virus, 170 kb
1987 Applied Biosystems: first automated sequencer
1991 Sequencing of human genome in Venter's lab
1996 P. Nyrén and M. Ronaghi: pyrosequencing
2001 A draft sequence of the human genome
2003 Human genome completed
2004 454 Life Sciences markets first NGS machine
Massive parallel sequencing
Landmarks in NGS
[Timeline figure, 2005-2010. Platforms shown: Roche 454, Solexa/Illumina, SOLiD, Ion Torrent, PacBio RS. Throughput labels: 200 K reads of 120 bp; 30 M reads of 35 bp; 100 M reads of 35 bp. Genome size references: E. coli (5 Mb), Arabidopsis thaliana (157 Mb).]
DNA Sequencing: the next generation
NGS refers to non-Sanger-based high-throughput DNA sequencing technologies. Millions or billions of DNA strands can be sequenced in parallel.
NGS technologies constitute various strategies that rely on a combination of:
- Library/template preparation
- Parallel sequencing
DNA Sequencing: the next generation
Sample prep, clonal amplification, parallel sequencing
Roche GS FLX 454 & Roche Junior 454
454 SEQUENCING
Life Technologies SOLiD 5500 Genetic Analyzer
SOLID SEQUENCING
Life Technologies: Ion Proton & Ion PGM
ION TORRENT SEQUENCING
Illumina HiSeq & NextSeq & MiSeq
ILLUMINA (SOLEXA) SEQUENCING
Illumina sequencing library
All sample preparation protocols, regardless of the application, end with the same product: double-stranded DNA with the insert to be sequenced flanked by adapters.
Illumina library prep
Illumina Sequencing
Helicos BioSciences: November 15, 2012, bankrupt
HELISCOPE SEQUENCING
DNA Sequencing: the next generation
Sample prep, clonal amplification, parallel sequencing
Heliscope sequencing
Oxford Nanopore Technologies: GridION & MinION
NANOPORE SEQUENCING
Pacific Biosciences PacBio RS II
SMRT SEQUENCING
PacBio history
2010 - PacBio seduced investors with a promise of technology revolution: a whole human genome for $100 in about 15 minutes
2011 - GC applies for funding for third generation sequencer
2012 - None of those predictions came true. Few scientists bought the one-ton instrument. PacBio market valuation of less than $70 million; technology value of $0. $600 million of cash down the toilet.
2012 - GC gets funding for PacBio! Oxford Nanopore announced at AGBT
2012 - New CEO Mike Hunkapiller @ PacBio
2013 - GC installs PacBio
PacBio improved and has a niche: detecting structural genetic variations and creating high-quality genomes of small organisms like bacteria, viruses, and worms.
PacBio's deal with Roche to develop technology for the diagnostic market
Single Molecule, Real-Time (SMRT) DNA Sequencing
[Figure labels: SMRTbell, SMRT Cells, PacBio RS II]
Template Preparation
Workflow: Template Preparation, Run Design, Polymerase Binding, Instrument Run, Primary Analysis, Secondary Analysis, Tertiary Analysis
Template preparation steps: DNA sample, fragment DNA, damage repair / end repair, ligate adapters, purify DNA
SMRTbell template preparation can be used to create libraries of various insert sizes, from 250 bp to 20,000 bp, depending on the needs of the application.
Advantages of SMRTbell Templates
Key advantages:
- Structurally linear
- Topologically circular
- Provides sequences of both forward and reverse strands in the same trace
Base Modification: Discover the Epigenome
Directly observe base modifications using the kinetics of the polymerization reaction during normal sequencing.
Signal Processing and Base Calling
Converting pulses of light into DNA bases and kinetic measures
Understanding Accuracy in SMRT Sequencing
- Single-pass error rate is ~11% (predominantly deletions or insertions)
- Single Molecule, Real-Time (SMRT) DNA sequencing achieves highly accurate sequencing results, exceeding 99.999% (Q50)
- How is this possible, given that single-pass sequence has roughly 1 mistake every 10 nucleotides?
- Single-pass errors are distributed randomly, which means that they wash out very rapidly upon building consensus.
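The consensus argument can be illustrated with a rough calculation. The sketch below is a deliberate simplification and not how Quiver works: it assumes errors are independent between passes, uses a fixed per-pass error rate of 0.11 (the ~11% quoted on the slide), and counts a position as miscalled whenever more than half of the passes are wrong. The function names `phred` and `majority_error` are illustrative only.

```python
from math import comb, log10


def phred(error_rate: float) -> float:
    """Convert an error probability to a Phred quality score, Q = -10 * log10(p)."""
    return -10 * log10(error_rate)


def majority_error(p: float, passes: int) -> float:
    """Probability that more than half of `passes` independent reads
    are wrong at a single position (simple majority-vote model)."""
    need = passes // 2 + 1  # smallest number of wrong reads that forms a majority
    return sum(
        comb(passes, k) * p**k * (1 - p) ** (passes - k)
        for k in range(need, passes + 1)
    )


if __name__ == "__main__":
    single_pass_error = 0.11  # ~11% single-pass error rate from the slide
    for n in (1, 3, 5, 9, 15, 21, 31):  # odd pass counts avoid ties
        err = majority_error(single_pass_error, n)
        print(f"{n:>2} passes: consensus error {err:.2e}, about Q{phred(err):.0f}")
```

Under these assumptions the consensus error falls steeply as passes are added. Real consensus callers must also align reads and model insertions and deletions, so the numbers printed here are only indicative.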
SMRT Sequencing Accuracy
Perspective: Understanding SMRT Sequencing Accuracy
Data generated with P4-C2 chemistry on PacBio RS II; analyzed using Quiver with SMRT Analysis 2.0.1
The PacBio RS Helps Resolve Genetically Complex Problems
- Targeted Sequencing: comprehensively characterize genomic variation
- De Novo Assembly: generate finished assemblies
- Base Modification Detection: automatically detect DNA base modifications
NGS time line
[Timeline figure, 2005-2016. Platforms shown: Roche 454, Illumina, SOLiD, Ion Torrent, PacBio RS, HiSeq 2500, HiSeq X Ten, HiSeq 4000, PB Sequel. Throughput labels: 200 K reads of 120 bp; 30 M reads of 35 bp; 100 M reads of 35 bp. Genome size references: E. coli (5 Mb), Arabidopsis thaliana (157 Mb).]
NGS Technology: conclusions
Summary
NGS terminology
NGS as a tool for studying genome variation and regulation
NGS APPLICATIONS
DNA SEQUENCING WHOLE GENOME SEQUENCING
Copy Number Variations
Structural Variations
Whole genome sequencing
- Copy number variation analysis
  - Sequencing a genome at 0.1-0.3x
  - Sequencing a genome at 1-3x
- Structural variation analysis
  - Sequencing a genome at 5-10x
- Whole genome re-sequencing
  - Sequencing a genome at >30x
  - Yeast, fruit fly, bacterial genomes, human
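The depths quoted above translate into read counts through the usual relation: mean coverage = (number of reads x read length) / genome size. The sketch below applies it to the depth tiers on the slide. The human genome size of 3.1 Gb and the 100 bp read length are assumptions chosen for illustration, and the helper names are hypothetical.

```python
import math


def mean_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Mean sequencing depth: total sequenced bases divided by genome size."""
    return n_reads * read_length / genome_size


def reads_for_coverage(target: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads that reaches the target mean depth."""
    return math.ceil(target * genome_size / read_length)


if __name__ == "__main__":
    human_genome = 3_100_000_000  # assumed approximate human genome size in bp
    read_length = 100             # hypothetical read length in bp
    tiers = [
        ("Copy number variation (low pass)", 0.3),
        ("Copy number variation", 3),
        ("Structural variation", 10),
        ("Whole genome re-sequencing", 30),
    ]
    for label, depth in tiers:
        n = reads_for_coverage(depth, read_length, human_genome)
        print(f"{label}: {depth}x needs about {n:,} reads of {read_length} bp")
```

This gives the mean depth only; actual coverage varies along the genome, so projects usually sequence somewhat beyond the minimum.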
DNA SEQUENCING TARGETED RE- SEQUENCING
Sequencing - the beginning
Random genome sequencing?
Sanger sequencing: targeted, 700-1000 bp
Target enrichment strategies
[Figure labels: Random genome sequencing, Hybrid capture, PCR based, Sanger sequencing]
Rapid expression profiling, transcriptome sequencing and small RNAs
RNA SEQUENCING
RNA-seq
RNAseq: Gene expression through sequencing
- Supports discovery, screening, and profiling
- Does not require prior gene knowledge or annotation
- Unique combination of qualitative and quantitative measurement
- Digital counts vs analog intensities
- Increased dynamic range and sensitivity
- No probes or primers
- Any species, even when a reference genome is not available
- Analyze gene expression
RNAseq: summary
- Counting or profiling
  - 10 million total reads of 35 bp length from poly-A selected RNA will give performance better than any microarray
- Studying alternative splicing or quantifying cSNPs for most transcripts
  - Deeper profiling of 50 to 100 million reads, with read lengths of 50 to 100 bp, from poly-A selected RNA using the mRNA-Seq assay
- Complete annotation of an entirely new transcriptome
  - ~500 million reads of 100 bp read length from multiple tissues
  - Normalized stranded mRNA-Seq & ncRNAs
  - Small RNA-Seq for microRNAs
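To compare the three experiment sizes above, it can help to convert each into total sequenced bases (reads x read length). The sketch below does only that arithmetic, using the read counts and lengths from the slide. The scenario labels and the `yield_bases` helper are illustrative, and the middle tier is shown as a low-to-high range.

```python
def yield_bases(n_reads: int, read_length: int) -> int:
    """Total sequenced bases for a run: number of reads times read length."""
    return n_reads * read_length


# (label, (min reads, max reads), (min read length, max read length)), from the slide
SCENARIOS = [
    ("Counting or profiling", (10_000_000, 10_000_000), (35, 35)),
    ("Alternative splicing / cSNPs", (50_000_000, 100_000_000), (50, 100)),
    ("New transcriptome annotation", (500_000_000, 500_000_000), (100, 100)),
]

if __name__ == "__main__":
    for label, (reads_lo, reads_hi), (len_lo, len_hi) in SCENARIOS:
        low = yield_bases(reads_lo, len_lo) / 1e9   # gigabases
        high = yield_bases(reads_hi, len_hi) / 1e9  # gigabases
        if low == high:
            print(f"{label}: {low:g} Gb")
        else:
            print(f"{label}: {low:g} to {high:g} Gb")
```

The spread between the smallest and largest design is more than two orders of magnitude, which is why the intended analysis should be fixed before choosing sequencing depth.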