Overview of the Genome Sequencer FLX System. Genome Sequencer FLX System Performance. Genome Sequencer FLX System Workflow

Similar documents
How is genome sequencing done?

The Power of Next-Generation Sequencing in Your Hands On the Path towards Diagnostics

History of DNA Sequencing & Current Applications

14/12/2012. HLA typing - problem #1. Applications for NGS. HLA typing - problem #1 HLA typing - problem #2

Genome Sequencer System. Amplicon Sequencing. Application Note No. 5 / February

Genetic Analysis. Phenotype analysis: biological-biochemical analysis. Genotype analysis: molecular and physical analysis

Data Analysis for Ion Torrent Sequencing

Overview of Next Generation Sequencing platform technologies

July 7th 2009 DNA sequencing

SEQUENCING. From Sample to Sequence-Ready

4. GS Reporter Application 65

Genome Sequencer 20 System. First to the Finish.

Introduction to transcriptome analysis using High Throughput Sequencing technologies (HTS)

Introduction to next-generation sequencing data

Illumina Sequencing Technology

Advances in RainDance Sequence Enrichment Technology and Applications in Cancer Research. March 17, 2011 Rendez-Vous Séquençage

Universidade Estadual de Maringá

Next Generation Sequencing

SeqScape Software Version 2.5 Comprehensive Analysis Solution for Resequencing Applications

PreciseTM Whitepaper

Go where the biology takes you. Genome Analyzer IIx Genome Analyzer IIe

New generation sequencing: current limits and future perspectives. Giorgio Valle CRIBI - Università di Padova

Welcome to Pacific Biosciences' Introduction to SMRTbell Template Preparation.

DNA Sequence Analysis

Next Generation Sequencing for DUMMIES

G E N OM I C S S E RV I C ES

TruSeq Custom Amplicon v1.5

An Overview of DNA Sequencing

Introduction to NGS data analysis

Single-Cell DNA Sequencing with the C 1. Single-Cell Auto Prep System. Reveal hidden populations and genetic diversity within complex samples

Sanger Sequencing and Quality Assurance. Zbigniew Rudzki Department of Pathology University of Melbourne

Analysis of DNA methylation: bisulfite libraries and SOLiD sequencing

Core Facility Genomics

Automated DNA sequencing 20/12/2009. Next Generation Sequencing

Bioruptor NGS: Unbiased DNA shearing for Next-Generation Sequencing

Next Generation Sequencing

BRCA1 / 2 testing by massive sequencing highlights, shadows or pitfalls?

empcr Amplification Method Manual - Lib-A

Next generation DNA sequencing technologies. theory & prac-ce

NGS data analysis. Bernardo J. Clavijo

MiSeq: Imaging and Base Calling

Genotyping by sequencing and data analysis. Ross Whetten North Carolina State University

LifeScope Genomic Analysis Software 2.5

Next Generation Sequencing: Technology, Mapping, and Analysis

SMRT Analysis v2.2.0 Overview. 1. SMRT Analysis v SMRT Analysis v2.2.0 Overview. Notes:

BacReady TM Multiplex PCR System

Introduction To Real Time Quantitative PCR (qpcr)

Single-Cell Whole Genome Sequencing on the C1 System: a Performance Evaluation

Complete Genomics Sequencing

Data Processing of Nextera Mate Pair Reads on Illumina Sequencing Platforms

Introduction Bioo Scientific

AS Replaces Page 1 of 50 ATF. Software for. DNA Sequencing. Operators Manual. Assign-ATF is intended for Research Use Only (RUO):

Chapter 8: Recombinant DNA 2002 by W. H. Freeman and Company Chapter 8: Recombinant DNA 2002 by W. H. Freeman and Company

Description: Molecular Biology Services and DNA Sequencing

Concepts and methods in sequencing and genome assembly

Nazneen Aziz, PhD. Director, Molecular Medicine Transformation Program Office

Illumina TruSeq DNA Adapters De-Mystified James Schiemer

Shouguo Gao Ph. D Department of Physics and Comprehensive Diabetes Center

Illumina GAIIx Sequencing Service

454 Sequencing System Software Manual, v 2.5p1

Focusing on results not data comprehensive data analysis for targeted next generation sequencing

GenScript BloodReady TM Multiplex PCR System

DNA Sequencing & The Human Genome Project

FOR REFERENCE PURPOSES

NGS Technologies for Genomics and Transcriptomics

Consistent Assay Performance Across Universal Arrays and Scanners

New Technologies for Sensitive, Low-Input RNA-Seq. Clontech Laboratories, Inc.

Delivering the power of the world s most successful genomics platform

The author(s) shown below used Federal funds provided by the U.S. Department of Justice and prepared the following final report:

Using Illumina BaseSpace Apps to Analyze RNA Sequencing Data

Sequencing Guidelines Adapted from ABI BigDye Terminator v3.1 Cycle Sequencing Kit and Roswell Park Cancer Institute Core Laboratory website

How Sequencing Experiments Fail

Improved methods for site-directed mutagenesis using Gibson Assembly TM Master Mix

Software Getting Started Guide

Automated Lab Management for Illumina SeqLab

Accelerate genomic breakthroughs in microbiology. Gain deeper insights with powerful bioinformatic tools.

DNA Sequencing Troubleshooting Guide

Reading DNA Sequences:

Assuring the Quality of Next-Generation Sequencing in Clinical Laboratory Practice. Supplementary Guidelines

454 Sequencing System Software Manual Version 2.6

Handling next generation sequence data

Cluster Generation. Module 2: Overview

RETRIEVING SEQUENCE INFORMATION. Nucleotide sequence databases. Database search. Sequence alignment and comparison

Agencourt AMPure XP. Xtra Performance Post-PCR clean UP

DNA SEQUENCING SANGER: TECHNICALS SOLUTIONS GUIDE

Personal Genome Sequencing with Complete Genomics Technology. Maido Remm

Essentials of Real Time PCR. About Sequence Detection Chemistries

An Introduction to Next-Generation Sequencing for in vitro Fertilization

Introduction To Epigenetic Regulation: How Can The Epigenomics Core Services Help Your Research? Maria (Ken) Figueroa, M.D. Core Scientific Director

SYBR Green Realtime PCR Master Mix -Plus-

Vector NTI Advance 11 Quick Start Guide

Genomics GENterprise

Just the Facts: A Basic Introduction to the Science Underlying NCBI Resources

Lawrence Berkeley National Laboratory Lawrence Berkeley National Laboratory

Computational Genomics. Next generation sequencing (NGS)

Targeted. sequencing solutions. Accurate, scalable, fast TARGETED

Gene Expression Assays

360 Master Mix. , and a supplementary 360 GC Enhancer.

PrimeSTAR HS DNA Polymerase

Reduced Representation Bisulfite-Seq A Brief Guide to RRBS

Transcription:

Overview of the Genome Sequencer FLX System Genome Sequencer FLX System Performance Genome Sequencer FLX System Workflow Genome Sequencer FLX System Data Flexibility Applications The Future 2

GS Evolution Genome Sequencer 20 Genome Sequencer FLX Genome Sequencer FLX Titanium Read length 100 bases > 200 bases > 400 bp # of clonal reads >200,000 > 400,000 > 2,000,000 System throughput /8hr shift 20-30mb 100mb 0,5-1gb Cost $6,000-9,000.00 $3,000-9,000 $10,000? Accuracy 99.99% 99.99% 99.99%

1,200,000 1,000,000 Relative Throughput Tech. Improvement 500MB Reads per Run 800,000 600,000 400,000 200,000 0 20MB GS20 fluidics 100MB GS FLX PTP GS FLX-Titanium 50 100 150 200 250 300 350 400 450 500 Read Length 4

Benefits of the Genome Sequencer FLX System Efficiency, Speed, Comprehensiveness, and Quality Obtain more comprehensive data from your genomic samples Generate 400,000 reads per run with 250-base reads at 99.5% accuracy Reduce your cost per result Ultra-high throughput significantly reduces cost compared to traditional sequencing Long, highly accurate reads and lack of bias deliver confident, quality results Pool 12 samples per region, up to 192 samples per run, for low cost per sample Expand your project capabilities Utilize long shot-gun reads or 3K Long-Tag Paired End reads. Publish Faster Sequence 100 million high-quality filtered bases per 7.5 hour run and accelerate time to result with easy to use analysis tools 5

Customer Success with the GS FLX Turning Sequencing into Publications Cumulative Publications by Quarter 0 10 20 30 40 50 Whole Genomes ncrna Transcriptomes Metagenomes Genome Structure Amplicons Technology Ancient DNA 160 120 80 40 0 Peer-Reviewed Publications by Application Q1'05 Q3'05 Q1'06 Q3'06 Q1'07 Q3'07 Q1'08 6

Genome Sequencer FLX System Throughput Run Time Read Length Reads per Run Single-Base Accuracy Sample Input Requirement 100 million high-quality, filter-passed bases per run, 200 million bases per day (1 billion bases per day Titanium) 7.5 hours 250 bases (>400-base reads Titanium) 400,000 high-quality reads >99.5% single-read accuracy over 250 bases As little as 100 ng of DNA Data Trace data accepted by NCBI since 2005 Computing Requirements Robustness Single Linux server No complex optics or lasers; long-life reagents 7

Sample Input and Fragmentation Generation of small DNA fragments via nebulization Library Preparation Ligation of A/B-Adaptors flanking single-stranded DNA fragments One Fragment = One Bead Emulsification of beads and fragments in water-in-oil microreactors empcr (emulsion PCR) Amplification Clonal amplification of fragments bound to beads in microreactors One Bead = One Read Sequencing by synthesis Data Analysis Flowgrams, basecalling, analysis tools 8

Genome fragmented by nebulization sstdna library created with adaptors A/B fragments selected using streptavidin-biotin purification DNA library preparation empcr Sequencing Hands on time 4.0 h Hands on time 4.0 h Hands on time Total time 4.5 h Total time 8.0 h Total time 0.5 h 7.5 h 9

Anneal sstdna to an excess of DNA Capture Beads Emulsify beads and PCR reagents in water-in-oil microreactors Clonal amplification occurs inside microreactors Break microreactors, enrich for DNApositive beads DNA library preparation empcr Sequencing Hands on time 4.0 h Hands on time 4.0 h Hands on time Total time 4.5 h Total time 8.0 h Total time 0.5 h 7.5 h 10

1.6 milion wells per PTP Well diameter: average of 44 µm Each well is only able to accept a single DNA bead. DNA capture beads and enzyme beads deposited onto PTP 11

Load beads onto PicoTiterPlate device Load PicoTiterPlate device on instrument Open instrument Load sequencing reagents Press START for Sequencing! 12

Bases (TACG) are flowed sequentially and always in the same order (100 times for a large GS FLX run) across the PicoTiterPlate device during a sequencing run. A nucleotide complementary to the template strand generates a light signal. The light signal is recorded by the CCD camera. The signal strength is proportional to the number of nucleotides incorporated. DNA library preparation empcr Sequencing Hands on time 4.0 h Hands on time 4.0 h Hands on time Total time 4.5 h Total time 8.0 h Total time 0.5 h 7.5 h 13

1. Raw data is series of images T A G C T 2. Each well s data extracted, quantized and normalized 3. Read data converted into flowgrams 14

4-mer Flowgram T A C G Flow Order 3-mer T T C T G C G A A 2-mer 1-mer Key sequence = TCAG for signal calibration 15

Cumulative Read Error 4.0% 3.5% 3.0% 2.5% 2.0% 1.5% 1.0% 0.5% 0.0% Reported in Nature 2005 GS20 Q2 2006 0 50 100 150 200 250 Base Position E. 09_29A coli run #1 E. coli run #2 E. 09_29B coli run #3 E. coli run #4 T. 09_14 thermophilus + 09_18A C. jejuni 09_18B+09_25 Thermophilus C jejuni Single-Read Accuracy > 99.5% (including all indel errors) Ignoring indels (homopolymers) the individual read accuracy is ~99.996% 16

Accuracy 100% 90% 80% 70% 60% 50% 40% 30% 20% 10% 0% 1 2 3 4 5 6 7 8 9 Homopolymer Length 100% Consensus Reads 17

Achieve optimization of cost, coverage, and time to result for your specific research needs Sequencing Kit Reads per Run Average Read Length Run Time GS LR70 400,000 250 bases 7.5 hours GS LR25 70,000 250 bases 7.5 hours GS SR70 400,000 100 bases 4.5 hours Optimal For High-throughput / multi-sample applications Small projects, application development, feasibility testing Short-read applications 18

Tailor sample number and throughput to your specific needs Gasket Format (regions) GS Sequencing Kit Compatibility Number of Reads per Region 1 GS LR70 and GS SR70 400,000 GS LR25 70,000 2 GS LR70 and GS SR70 200,000 4 GS LR70 and GS SR70 70,000 GS LR25 12,000 8 GS LR70 and GS SR70 30,000 16 GS LR70 and GS SR70 12,000 19

1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 Sequencing primer Primer A Read Key MID Library fragment Primer B MID1 MID2 MID3 MID4 MID5 MID6 MID7 MID8 MID9 MID10 MID11 MID12 ACGAGTGCGT ACGCTCGACA AGACGCACTC AGCACTGTAG ATCAGACACG ATATCGCGAG CGTGTCTCTA CTCGCGTGTC TAGTATCAGC TCTCTATGCG TGATACGTCT TACTGAGCTA 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 T A C G T A C G T A C G T A C G T A C G The MIDs in flow space 20

21

New Protocol for sequencing of 250 bases from each end of a 3,000 base span on a single read Use for De novo assemblies of complex genomes Identify structural variations at high resolution 22

For an out-of-the-box solution we have qualified a provider who will supply an integrated system. This purchase is made through Roche and is delivered as a one box solution. Server specifications are available for a do-it-yourself option. 23

Obtain results from your sequence data quickly and affordably with the powerful suite of analysis tools provided with the Genome Sequencer FLX System Data Management, Processing, and Analysis Features User-friendly GUI Integration with LIMS Bar-code scanning PHRED quality scores* Trace data accepted by GenBank Hardware Single-node computing Manageable data storage Software Tools De novo assembly tool Mapping tool Amplicon analysis tool Third-party LIMS Third-party options Supplied with the GS FLX System Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes * PHRED equivalent score has been developed in conjunction with The Broad Institute. 24

Flowgram 25

Perform whole genome shotgun assembly of genomes with or without paired-end data. Order contigs into scaffolds using available paired-end reads. Assemble genomes up to 50 megabases in size. Select the read files you want to assemble by browsing in the GUI. View the assembly and browse the individual reads that make up the multiple Example alignment Data: de novo sequencing for each and contig. assembly of complete genomes E. coli K-12 C. jejuni T. thermophilus S. cerevisiae Genome Size (bases) 4,639,675 1,641,481 2,127,575 12,495,682 % GC content 50 30 69 38 Oversampling 20x 20x 20x 21x Number of Contigs 113 33 65 488 Number of Scaffolds 11 4 7 85 Total Coverage 97.62% 97.54% 98.04% 93.09% Consensus Accuracy 99.999% 99.998% 99.998% 99.979% 26

27

Map reads to any reference genome and generate a consensus sequence. Easily view all differences compared to the reference sequence with automatic output to separate files. Insertions- inserted blocks of up to 50 bases Deletions- deleted blocks of up to 50 bases SNPs Quickly identify high-confidence differences compared to the reference genome, which are singled out into a separate file. Compare genomes up to three gigabases in size. Choose files by browsing in the GUI. Example Data: Whole genome resequencing and mutation detection E. coli K-12 C. jejuni T. thermophilus S. cerevisiae Genome Size (bases) 4,639,675 1,641,481 2,127,575 12,495,682 % GC content 50 30 69 38 Number of Runs 1 ½ ½ ½ Number of Reads Mapped 397,418 169,560 196,557 176,576 Total Coverage 98.52% 98.43% 99.09% 97.29% Consensus Accuracy 100.000% 99.999% 99.998% 99.998% 28

Automatically compute the alignment of reads from amplicon-based samples against a reference sequence. Quickly identify variants and their respective frequencies in large pools of data Screen for known variants Discover unknown variants Identify haplotypes Detect low-frequency (<1%) variants The GS Amplicon Variant Analyzer Software can also be used for the following studies: Resequencing Analyze disease-associated regions Discover SNPs, insertions, and deletions on a population level Detect rare somatic mutations Perform viral subtyping Epigenetics Analyze DNA methylation patterns Metagenomics and genetic diversity Sequence 16S RNA 29

30

The apparent price per base advantage of micro-read teachnologies in resequencing applications shrinks dramatically after adjusting for Loss of raw reads when mapped against the reference Additional over-sequencing required due to lower single read accuracy Fully loaded costs (including required bioinformatics labor and IT maintenance) No amount of additional over-sequencing can provide equivalent quality to 454 Sequencing due to micro-read technologies Shorter read length Greater bias Limitation to SNP identification, versus detection of indels (large and small) and larger scale genome variation Price advantage per equivalent base will decisively shift to 454 Sequencing with the XLR HD kit format, which will run on the current GS FLX instrument 31

When calculating the real cost per result, all cost must be considered - Reagents - Labor and bioinformatics - Instrument cost Using few long reads compared to many micro-reads the bioinformatics requirement per run is much lower with 454 Sequencing (for the average user, 3X more informatics time is required per Illumina run) The Instrument cost must be considered when considering the total cost of the result produced by that instrument 32

1. Bioanalyzer or Nanodrop 2. Nebulizator 3. RT-PCR (LightCycler) 4. PCR 5. Laminar flow (P2) 6. Mixer for emulsion prep. 7. N2 8. Centrifuge for plates (specific model) Total cost ~500.000