Application of high-throughput genomic and transcriptomic sequencing at Genoscope. Jean-Marc Aury
Transcript
1 Application of high-throughput genomic and transcriptomic sequencing at Genoscope. Jean-Marc Aury
2 Introduction: presentation of Genoscope and NGS activities; overview of sequencing technologies; sequencing and assembly of prokaryotic and eukaryotic genomes; annotating genomes using massive-scale RNA-Seq; future projects, applications and next-next generation sequencing.
3 Genoscope (National Sequencing Center). Among the largest sequencing centers in Europe. Part of the CEA Institut de Génomique since 05/2007. Provides high-throughput sequencing data to the French academic community and carries out in-house genomic projects. Involved in large genome projects: human genome project, Arabidopsis, rice. Coordination of large genome projects: Tetraodon, Paramecium, Vitis, Oikopleura, as well as fungal genomes (Botrytis, Tuber) and prokaryotic genomes.
4 Genoscope (National Sequencing Center). NGS activities: sequencing of prokaryotic genomes (2007); RNA-Seq / annotation of eukaryotic genomes (2008); SNP calling: identification of mutations (2008); metagenomic projects (2008); sequencing of large eukaryotic genomes (2009/2010); ChIP-Seq, detection of structural variations (2009/2010).
5 Genoscope (National Sequencing center) Sequencing capacity : 19 ABI /Roche Titanium 2 Illumina GA2x 3 Illumina HiSeq Solid v3
6 Genoscope (National Sequencing Center). Access to the Genoscope sequencing capacity is by call for tender. Project: sequencing; assembly / finishing; prokaryotic genome annotation (MAGE); eukaryotic genome annotation (GAZE).
7 Sequencing technologies: Applied Biosystems ABI 3730XL; Roche / 454 Genome Sequencer FLX; Illumina / Solexa Genome Analyzer; Applied Biosystems SOLiD. What's different: quantity and types of data; quality of data.
8 Sequencing technologies: 454 / Roche Genome Sequencer FLX. 2006, GS20: 20Mb / run, 100bp / read. 2007, GS FLX Standard: 100Mb / run, 250bp / read. 2009, GS FLX Titanium: 500Mb / run, 500bp / read. Current version (GS FLX Titanium): majority of 500bp reads; around reads / run and 500Mbp / run; run duration: 8h; high error rate in homopolymer sequences; good assemblies at 20X coverage; no cloning biases.
9 Sequencing technologies: 454 / Roche Genome Sequencer FLX. 1 fragment -> 1 bead; 1 bead -> 1 read.
10 Sequencing technologies: Illumina / Solexa Genome Analyzer. GA 1: 1Gb / run, 32bp / read. GA II: 8Gb / run, 50bp / read, paired reads. GA IIx: 90 Gb / run, 150bp / read, 14 days / run, paired reads and mate pairs. Hi-Seq: Gb / run, 100bp / read, 8 days / run, paired reads and mate pairs. Current version (Genome Analyzer IIx): reads of 108bp; around 640M reads / run and 70Gbp / run; 10 days / run; very low error rate (70% of perfect reads); no indels => good complementarity with the 454 technology; no coverage gaps; price.
11 Sequencing technologies: Illumina / Solexa Genome Analyzer
12 Sequencing technologies: 454 / Roche Genome Sequencer FLX. ¼ run on Acinetobacter baylyi (3.5Mb); ~ reads; cumulative size of 130Mb; 99.9% of aligned reads; average error rate: 0.55% (37% deletions, 53% insertions, 10% substitutions). Errors accumulate around homopolymers => the error rate is not constant.
13 Sequencing technologies: Illumina / Solexa Genome Analyzer. 1 lane on Acinetobacter baylyi (3.5Mb); 11.4M reads; cumulative size of 900Mb; 98.5% aligned reads; average error rate: 0.38% (3% deletions, 2% insertions, 95% substitutions).
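The two benchmark runs above can be compared on sequencing depth, computed as cumulative read size divided by genome size. A minimal Python sketch using only the figures quoted on slides 12 and 13 (the function name `coverage` is illustrative, not part of any Genoscope tool):

```python
def coverage(total_bases: float, genome_size: float) -> float:
    """Average sequencing depth: total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


GENOME = 3.5e6  # Acinetobacter baylyi, 3.5 Mb (from the slides)

depth_454 = coverage(130e6, GENOME)       # quarter of a 454 run, 130 Mb
depth_illumina = coverage(900e6, GENOME)  # one Illumina lane, 900 Mb

print(f"454: {depth_454:.0f}X, Illumina: {depth_illumina:.0f}X")
```

With these inputs a quarter of a 454 run gives roughly 37X and a single Illumina lane roughly 257X on this genome.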
14 Sequencing technologies
15 Sequencing of prokaryotic genomes
16 Whole genome shotgun sequencing
17 Prokaryotic genomes sequencing: 454 / Roche Genome Sequencer FLX. Required genome coverage:
18 Prokaryotic genomes sequencing: 454 / Roche Genome Sequencer FLX. Required genome coverage:
19 Prokaryotic genomes sequencing: 454 / Roche Genome Sequencer FLX. Required genome coverage:
20 Prokaryotic genomes sequencing. Columns: Sanger | unpaired 454 | unpaired + PE 454. Coverage: 7.4X | 20X | 25X. Assembler: Arachne (Broad Institute) | Newbler (454/Roche) | Newbler (454/Roche). # of contigs; Contigs N50 (Kb); # of scaffolds; Scaffolds N50 (Kb): 2, ,000. Assembly size (% of reference): 3.417Mb (95%) | Mb (98%) | Mb (98%). Mis-assemblies; # of errors: 3, ; Substitutions: 2, ; Insertions / Deletions.
21 Prokaryotic genomes sequencing. Columns: Sanger | unpaired + PE 454 | unpaired + paired 454 with Illumina / Solexa GA1. Coverage: 7.4X | 25X | 25X and 50X. Assembler: Arachne (Broad Institute) | Newbler (454/Roche) | Newbler (454/Roche). # of contigs; Contigs N50 (Kb); # of scaffolds; Scaffolds N50 (Kb): 2,200 | 1,000 | 1,000. Assembly size (% of reference): 3.417Mb (95%) | Mb (98%) | Mb (98%). Mis-assemblies; # of errors: 3, | (1 error / 8Kb) | (1 error / 22Kb); Substitutions: 2, ; Insertions / Deletions.
22 Microbial genome sequencing evolution. Goal: propose a strategy to sequence prokaryotic genomes that accounts for assembly quality and cost. Mixing 454 and Illumina technologies gives high-quality drafts (454 provides read length, Illumina a low error rate). Strategies: Sanger 12X (~40 projects); 454 (15X) + Sanger (4X) (~20 projects); 454 (20X) + Sanger (1X) + Solexa (~50X) (~15 projects); (20X) + Solexa (~50X) (1 project). Assemblers: Phrap, Arachne, Newbler.
23 Prokaryotic genomes sequencing. Columns: Sanger | unpaired + paired illumina GAI | Illumina 76bp + Illumina MP 10Kb 52bp | Illumina 76bp + Illumina MP 10Kb 52bp. Coverage: 7.4X | 25X and 50X | 100X and 50X | 100X and 50X. Assembler: Arachne (Broad Institute) | Newbler (454/Roche) | Soap (BGI) | Velvet (EBI). # of contigs; Contigs N50 (Kb); # of scaffolds; Scaffolds N50 (Kb): 2,200 | 3,600 | 3,183 | 3,601. Assembly size (% of reference): 3.417Mb (95%) | Mb (98%), 58K N | Mb (98%), 457K N | Mb (100%), 11K N. Misassemblies; # of errors: 3,442 | < error / 28Kb | 1 error / 175Kb | 1 error / 17Kb; Substitutions: 2, ; Insertions / Deletions.
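Contig and scaffold N50 values are the main contiguity metric in the comparison tables above. As a reminder of how that metric is defined, here is a small Python sketch; the function name `n50` and the example lengths are made up for illustration and are not Genoscope data:

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly size."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in Kb (total 290, half 145).
example = [80, 70, 50, 40, 30, 20]
print(n50(example))
```

A higher N50 means that half of the assembly sits in longer pieces. It says nothing about correctness, which is why the tables also report mis-assemblies and error counts.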
24 Eukaryotic genome sequencing evolution. Extend the prokaryotic strategy to eukaryotic genomes. Sanger sequencing is still used to sequence long DNA fragments (>20Kb, BAC ends). Sanger 10X, >12 projects: Tetraodon, Paramecium, Vitis, Meloidogyne, Tuber, Blastocystis, Chondrus, Podospora, Oikopleura, Ectocarpus. 454 (~15X) + Sanger BAC ends + Solexa (~50X), 9 projects: cocoa, banana, trypanosome, citrus, Clytia, Adineta, coffee, trout, colza. Assemblers: Arachne, Newbler.
25 Annotating genomes using RNA-Seq
26 Annotating genomes using RNA-Seq. Goal: annotate eukaryotic genomes using transcriptomic data from ultra-high-throughput sequencers (Illumina and SOLiD). Difficulties: predicting complete gene structures with 40 bp reads; aligning short reads to exon/exon junctions (mapping algorithms allow only a limited number of gaps during alignment). Reference: Molecular biology: Power sequencing. Brenton R. Graveley. Nature 453 (26 June 2008).
27 Typical RNA-Seq experiments
28 Annotating genomes using RNA-Seq. Step 1: covtig construction. Mapped reads give a coverage depth along the genome; regions where the depth passes a threshold are kept as covtigs.
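Step 1 can be sketched in a few lines of Python. This is an illustration of the idea only, not the G-Mo.R-Se code: the function names, the 0-based half-open coordinates and the choice of an inclusive threshold are all assumptions.

```python
def depth_from_reads(reads, genome_length):
    """Per-base coverage depth from (start, end) half-open read alignments."""
    diff = [0] * (genome_length + 1)
    for start, end in reads:
        diff[start] += 1
        diff[end] -= 1
    depth, running = [], 0
    for i in range(genome_length):
        running += diff[i]
        depth.append(running)
    return depth


def build_covtigs(depth, threshold):
    """Maximal intervals [start, end) whose depth is >= threshold."""
    covtigs, start = [], None
    for pos, d in enumerate(depth):
        if d >= threshold:
            if start is None:
                start = pos          # a covtig opens here
        elif start is not None:
            covtigs.append((start, pos))  # depth dropped: close the covtig
            start = None
    if start is not None:
        covtigs.append((start, len(depth)))
    return covtigs


# Hypothetical alignments on a 20-base toy genome.
reads = [(0, 5), (2, 7), (3, 8), (12, 16), (13, 17)]
covtigs = build_covtigs(depth_from_reads(reads, 20), threshold=2)
print(covtigs)
```

Raising the threshold trims weakly covered edges and drops lowly expressed regions; lowering it merges more background signal into the covtigs.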
29 Annotating genomes using RNA-Seq. Problem: inaccurate detection of splicing junctions. Improvement: extension of covtigs using unmapped reads. [Figure: covtig GGTGTTCACTACTTAGCCATGAAGATCTAGATTTCACACTTTTAGAAGCCTTAGAAAGCTG... shown with the reads mapped onto it and the unmapped reads used to extend it]
30 Step 2: extraction of candidate exons. Candidate splice sites (ag ... gt) are searched within 100 nt of the covtig boundaries, on both strands, giving forward and reverse candidate exons.
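A rough Python sketch of Step 2 for the forward strand only. The function name, the coordinate convention and the exact way the 100 nt window is applied are assumptions made for illustration; reverse candidates could be obtained by running the same search on the reverse complement.

```python
def candidate_exons(genome, covtig, window=100):
    """Forward-strand candidate exons around one covtig.

    An exon may start right after an 'AG' (acceptor) found within `window`
    nt of the covtig start, and end right before a 'GT' (donor) found
    within `window` nt of the covtig end. Coordinates are 0-based,
    half-open.
    """
    start, end = covtig
    genome = genome.upper()
    last = len(genome) - 1  # last index where a dinucleotide can begin
    exon_starts = [
        i + 2
        for i in range(max(0, start - window), min(last, start + window))
        if genome[i:i + 2] == "AG"
    ]
    exon_ends = [
        j
        for j in range(max(0, end - window), min(last, end + window))
        if genome[j:j + 2] == "GT"
    ]
    # Every acceptor/donor combination is a candidate; junction validation
    # (step 3) decides later which ones are real.
    return [(s, e) for s in exon_starts for e in exon_ends if s < e]


toy_genome = "TTAGCCCCCCGTTT"  # hypothetical sequence
exons = candidate_exons(toy_genome, covtig=(4, 10), window=3)
print(exons, toy_genome[exons[0][0]:exons[0][1]])
```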
31 Annotating genomes using RNA-Seq. Step 3: validation of exon/exon junctions. Junctions between candidate exons are validated using a word dictionary (k-mer 1 ... k-mer n) built from the unmapped reads: the words spanning a candidate junction (gt ... ag) are looked up in the dictionary, and the junction is validated if they exist. Example: covtig1 ...ggtgttcactacttacccatgt... and covtig2 ...agatctacacacttttagaagcctgaaag... give junction words from TTACCCAT|ATCTACACACTTTTAGA to GTTCACTACTTACCCAT|ATCTACAC, which are found in unmapped reads such as TGTTCACTACTTACCCATATCTACACACTTTTAGAA. [Figure: k-mers spanning the junction and the unmapped reads used to create the dictionary]
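The dictionary lookup of Step 3 can be sketched as follows, reusing the sequences shown on the slide. The word length of 25 and the minimum overhang of 8 bases are read off the slide's example k-mers; the function names and the `min_hits` rule are hypothetical and may differ from what G-Mo.R-Se actually does.

```python
def build_dictionary(unmapped_reads, k):
    """Set of all k-mers present in the unmapped reads."""
    words = set()
    for read in unmapped_reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            words.add(read[i:i + k])
    return words


def junction_kmers(exon5, exon3, k, min_overhang=8):
    """k-mers spanning the exon5|exon3 junction with at least
    `min_overhang` bases on each side of the junction."""
    joined = exon5.upper() + exon3.upper()
    cut = len(exon5)
    kmers = []
    for left in range(min_overhang, k - min_overhang + 1):
        start, end = cut - left, cut - left + k
        if start >= 0 and end <= len(joined):
            kmers.append(joined[start:end])
    return kmers


def validate_junction(exon5, exon3, dictionary, k=25, min_hits=1):
    """A junction is accepted if enough spanning k-mers are in the dictionary."""
    hits = sum(1 for w in junction_kmers(exon5, exon3, k) if w in dictionary)
    return hits >= min_hits


# Sequences taken from the slide (end of covtig1, start of covtig2).
exon5 = "GGTGTTCACTACTTACCCAT"
exon3 = "ATCTACACACTTTTAGAAGCCTGAAAG"
reads = [
    "TGTTCACTACTTACCCATATCTACACACTTTTAGAA",
    "TCACTACTTACCCATATCTACACACTTTTAGAAGCC",
]
dictionary = build_dictionary(reads, k=25)
print(validate_junction(exon5, exon3, dictionary))
```

Because the reads that cross a junction cannot be aligned contiguously to the genome, they end up in the unmapped set, which is why the dictionary is built from that set.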
32 Annotating genomes using RNA-Seq. Step 4: graph of candidate exons linked by validated junctions. Step 5: model construction and coding sequence (open reading frame) detection. [Figure: G-Mo.R-Se models M1 to M7 compared with real transcripts T1 to T5]
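One way to picture Steps 4 and 5 is to treat candidate exons as nodes and validated junctions as directed edges, then read each start-to-end path as a transcript model. The sketch below assumes that models are all such paths; this is an illustrative simplification, and the exon identifiers and function name are hypothetical.

```python
def enumerate_models(exons, junctions):
    """List every path from an exon with no incoming junction to an exon
    with no outgoing junction. `junctions` holds (upstream, downstream)
    pairs and is assumed to be acyclic."""
    successors = {e: [] for e in exons}
    has_predecessor = set()
    for upstream, downstream in sorted(junctions):
        successors[upstream].append(downstream)
        has_predecessor.add(downstream)

    models = []

    def extend(path):
        following = successors[path[-1]]
        if not following:          # no outgoing junction: the model is complete
            models.append(path)
            return
        for nxt in following:
            extend(path + [nxt])

    for exon in exons:
        if exon not in has_predecessor:
            extend([exon])
    return models


# Hypothetical locus: E2 and E3 are alternative middle exons, E5 is isolated.
exons = ["E1", "E2", "E3", "E4", "E5"]
junctions = [("E1", "E2"), ("E1", "E3"), ("E2", "E4"), ("E3", "E4")]
models = enumerate_models(exons, junctions)
print(models)
```

The number of paths can grow quickly at loci with many alternative junctions, which is one reason the slide shows more models (M1 to M7) than real transcripts (T1 to T5).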
33 Annotating genomes using RNA-Seq. Method set up to annotate the Vitis genome (500Mb): around 175 million Illumina reads from 4 tissues (leaf, root, stem and callus); 140 million uniquely aligned reads (73.5Mb); around covtigs (38.5Mb); transcript models ( loci), and with a plausible CDS ( loci); around one week of computation on a desktop computer.
34 Annotating genomes using RNA-Seq
35 Annotating genomes using RNA-Seq. G-Mo.R-Se (Gene MOdeling using Rna-Seq) is downloadable from the Genoscope website. It was used with Illumina data, but it can easily be adapted to handle SOLiD data (in colorspace). Method used to annotate the whole Vitis genome. [Figure: example of a novel gene identified by G-Mo.R-Se]
36 Capture and sequencing for mutation discovery. Laboratoire de Ressources Génomique: Gábor Gyapay. Laboratoire de Séquençage: Patrick Wincker. Laboratoire d'Analyse Bioinformatique des Séquences: François Artiguenave, Vincent Meyer, Marc Wessner, Benjamin Noel.
37 Capture and sequencing for mutation discovery. Goal: detection of mutations (SNPs) in large genomes (typically the human genome), with parallelization of several samples at a reasonable cost. Principle: target regions of the human genome totalling several megabases; amplify these targeted regions; high-throughput sequencing. Technology: Nimblegen chips and 454 sequencing. For which projects? Rare genetic diseases; cancer research.
38 Capture and sequencing for mutation discovery Digital Light Processing technology
39 Capture and sequencing for mutation discovery
40 Capture and sequencing for mutation discovery. Pilot project: genes, exons, cumulative size of 4 Mb. 8 samples: 4 tumor samples and 4 corresponding normal samples, with 1 GS FLX run per sample (~100Mb). Selected regions: 3.97Mb (average size of 300bp). Nimblegen design: targeted regions; 5.6Mb (average size of 400bp). [Figure: selected regions vs. targeted regions]
41 Mutation detection platform. Alignment of sequencing data with the human genome
42 Capture and sequencing for mutation discovery. Fragment size: bp. Sequencing of ~225 bp. [Figure: targeted region vs. sequenced region]
43 Capture and sequencing for mutation discovery. Detection of around SNPs; extraction of 50 SNPs after selection and classification. Criteria: quality of the predicted SNP (sequencing depth and quality values); localization of the predicted SNP; comparison between samples and with known SNPs.
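The three selection criteria can be expressed as a simple filter. The sketch below is a hypothetical illustration: the `Snp` record, the thresholds (`min_depth=10`, `min_quality=20.0`) and the rule "keep tumor SNPs that are absent from the matched normal sample and from the known-SNP set" are assumptions, not the settings of the Genoscope pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    chrom: str
    pos: int
    alt: str
    depth: int       # sequencing depth at the position
    quality: float   # quality value of the call
    in_exon: bool    # localization of the predicted SNP


def select_candidates(tumor, normal, known, min_depth=10, min_quality=20.0):
    """Apply quality, localization and comparison criteria in turn."""
    normal_keys = {(s.chrom, s.pos, s.alt) for s in normal}
    kept = []
    for snp in tumor:
        key = (snp.chrom, snp.pos, snp.alt)
        if snp.depth < min_depth or snp.quality < min_quality:
            continue  # criterion 1: quality of the predicted SNP
        if not snp.in_exon:
            continue  # criterion 2: localization
        if key in normal_keys or key in known:
            continue  # criterion 3: comparison with normal sample / known SNPs
        kept.append(snp)
    return kept


# Hypothetical calls.
tumor = [
    Snp("chr1", 100, "A", 30, 40.0, True),   # passes every criterion
    Snp("chr1", 200, "T", 4, 40.0, True),    # too shallow
    Snp("chr2", 300, "G", 25, 35.0, True),   # also seen in the normal sample
    Snp("chr3", 400, "C", 25, 35.0, False),  # outside exons
    Snp("chr4", 500, "A", 25, 35.0, True),   # already a known SNP
]
normal = [Snp("chr2", 300, "G", 20, 30.0, True)]
known = {("chr4", 500, "A")}
selected = select_candidates(tumor, normal, known)
print([(s.chrom, s.pos) for s in selected])
```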
44 Current and future projects/applications. Diversification of applications and projects: de novo sequencing, genome annotation, re-sequencing, metagenomics, functional genomics, identification of mutations and structural variations. Increase in the number and size of projects.
45 Current and future projects. De novo sequencing: assembly of prokaryotic genomes with Illumina sequencing only; sequencing of large eukaryotic genomes with 454 and Sanger: banana (~500Mb; WGS with 20X X Sanger), cocoa (~400Mb; WGS with 20X Sanger BAC ends), trout (~2Gb; WGS with 454 and Sanger BAC ends), wheat chromosome 3b (~1Gb; 454); benchmarking assembly of Illumina data for large eukaryotic genomes: banana, cocoa. Re-sequencing: 100 Arabidopsis genomes (transposon mobility and methylation).
46 Current and future projects: Tara Oceans Project. Eukaryotic metagenomics and metatranscriptomics. 3-year expedition with regular sampling at different depths and different cell size fractions. Pilot project in progress: 1 sampling station, 3 different depths, 4 cell size fractions, 2 sequencing technologies (454 and Illumina). Sequencing of DNA, total RNA, messenger RNA and tags (V4 and V9 18S). Establish a collection of reference genome sequences (culture and single cell) and a gene catalogue.
47 Structural Variation. Paired-End Mapping Reveals Extensive Structural Variation in the Human Genome. Jan O. Korbel et al. Science. Oct 2007. Two individuals: NA15510 and NA18505. Recircularization of large DNA fragments (3Kb) and 454 sequencing.
48 Structural Variation. Paired-End Mapping Reveals Extensive Structural Variation in the Human Genome. Jan O. Korbel et al. Science. Oct 2007. NA15510: 10M reads (2.1X) and NA18505: 21M reads (4.3X). Identification of SV indels and 122 inversions (45% of SVs shared). Distributed across the whole genome, with a few hotspots.
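The principle behind paired-end mapping is that the two ends of a ~3Kb fragment should map about 3Kb apart and in the expected orientation; pairs that do not are evidence of a structural variant. A minimal Python sketch of that classification follows. The tolerance of 1000 bp and the function name are hypothetical, and a real analysis would require several independent pairs supporting the same event before calling it.

```python
def classify_pair(left_pos, right_pos, orientation_ok,
                  expected=3000, tolerance=1000):
    """Classify one read pair by its span on the reference genome.

    A span larger than expected suggests a deletion in the sample, a
    smaller span an insertion, and an unexpected orientation an inversion.
    """
    if not orientation_ok:
        return "inversion"
    span = right_pos - left_pos
    if span > expected + tolerance:
        return "deletion"
    if span < expected - tolerance:
        return "insertion"
    return "concordant"


# Hypothetical pairs: (left position, right position, orientation as expected)
pairs = [(1000, 4100, True), (1000, 9000, True),
         (1000, 2500, True), (1000, 4000, False)]
calls = [classify_pair(*p) for p in pairs]
print(calls)
```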
49 Targeted RNA-Seq. Targeted next-generation sequencing of a cancer transcriptome enhances detection of sequence variants and novel fusion transcripts. Levin JZ et al. Genome Biology. Oct. Combines RNA-Seq and sequence capture (in-solution capture: Agilent). Tested on 467 genes (protein tyrosine kinases, nuclear hormone receptors, Cancer Gene Census) from a cDNA library derived from a tumor (K-562). RNA-Seq is still too expensive for detecting rare events (structural rearrangements, aberrant transcripts, ...) and/or when working on weakly expressed genes.
50 Targeted RNA-Seq. Targeted next-generation sequencing of a cancer transcriptome enhances detection of sequence variants and novel fusion transcripts. Levin JZ et al. Genome Biology. Oct. In addition to BCR-ABL1 and NUP214-XKR3 (also detected without capture), 4 other fusions were detected. In every case, only one of the two genes was targeted.
51 Next-Next Generation Sequencing
52 Next-Next Generation Sequencing. Several different technologies are available today. They are flow-based methods: cycles of incorporation, reading, new incorporation. They require amplification of the starting DNA. Arrival of "single-molecule, real-time" sequencing.
53 Next-Next Generation Sequencing. HeliScope (Helicos Biosciences): single-molecule, flow sequencing. Specifications: 30 Gb per run; reads of 25 to 55 bases; 600M reads; 0.2% substitutions, 1.5% insertions and 3.0% deletions. Trouble with homopolymers, but no phasing issue. Applications: ancient DNA, expression levels.
54 Next-Next Generation Sequencing Heliscope Helicos Biosciences
55 Next-Next Generation Sequencing. Pacific Biosciences: SMRT (Single Molecule Real Time). The polymerase is attached to the bottom of a well about ten nanometers in diameter, with a volume of 20 zeptoliters (10^-21 liters). In this volume, the activity of a single nucleotide can be detected against a background of several hundred fluorescent nucleotides.
56 Next-Next Generation Sequencing. Pacific Biosciences: SMRT (Single Molecule Real Time). Speed of 4-5 bases / second and wells per chip (36Mb / hour), for reads > 1000bp. 5-year forecast: 50 bases / second, with chips of 1M wells, for a throughput of about 100Gb / hour.
57 Next-Next Generation Sequencing. Pacific Biosciences: SMRT (Single Molecule Real Time). Long reads are possible, but with a signal-loss problem. New concept: "Strobe Sequencing". Template: ATGGTGTCCTGCATAGCATACAGTATGGTGGCATAGCATACAGTATGGTGTCCTGTCCTGCATAGCATACAGT; laser on / laser off / laser on gives ATGGTGTCCTGC..CAGTATGGTGTCC.
58 Next-Next Generation Sequencing
59 Next-Next Generation Sequencing
60 Next-Next Generation Sequencing. Here's how the technology is used to call a base: if a nucleotide, for example a C, is added to a DNA template and is then incorporated into a strand of DNA, a hydrogen ion will be released. The charge from that ion will change the pH of the solution, which can be detected by our proprietary ion sensor.
61 Next-Next Generation Sequencing. If there are two identical bases on the DNA strand, the voltage will be double, and the chip will record two identical bases called. Because this is direct detection (no scanning, no cameras, no light), each nucleotide incorporation is recorded in seconds.
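The base-calling rule described on the two slides above (one nucleotide flowed at a time, signal proportional to the number of bases incorporated) can be sketched as a small decoder. This is an illustration under stated assumptions: the flow order "TACG", the normalization of signals to 1.0 per incorporated base and the function name are hypothetical, not the vendor's algorithm.

```python
def call_bases(flow_order, signals):
    """Decode a sequence from per-flow signals.

    Each flow offers one nucleotide; a signal close to n means that n
    identical bases were incorporated (0 means none).
    """
    sequence = []
    for i, signal in enumerate(signals):
        base = flow_order[i % len(flow_order)]
        count = int(signal + 0.5)  # round to the nearest homopolymer length
        sequence.append(base * count)
    return "".join(sequence)


# Hypothetical normalized signals for flows T, A, C, G, T, A, C, G.
signals = [1.02, 0.05, 2.10, 0.00, 0.00, 0.97, 0.00, 1.10]
print(call_bases("TACG", signals))
```

The rounding step also shows why long homopolymers are harder to call in flow-based methods: distinguishing a signal of 7 from 8 leaves much less margin than distinguishing 1 from 2.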
62 Next-Next Generation Sequencing. ~$50,000 per machine. Run: $250 / run. Read lengths: bases. Reads per run: . Error rate: ~1%. Portability and fast turnaround: suited to rapid tests close to the field. Throughput insufficient for massive applications (e.g. the human genome).
63 Next-Next Generation Sequencing. FRETing. FRET: Fluorescence Resonance Energy Transfer. Energy is transferred from a "quantum dot" attached to a polymerase to a nucleotide. When a nucleotide is incorporated, the proximity between the donor fluorophore on the polymerase and the acceptor fluorophore on the nucleotide triggers a FRET signal.
64 Next-Next Generation Sequencing Nanopore Sequencing
65 Next-Next Generation Sequencing Continuous base identification for single-molecule nanopore DNA sequencing. Clarke J, Wu HC, Jayasinghe L, Patel A, Reid S, Bayley H. Nature Nanotechnology 2009 Apr;4(4):
66 Next-Next Generation Sequencing. Outlook: third-generation sequencers. Advantages of real-time, single-molecule sequencing: speed (no sample preparation step); no amplification (no bias; work possible on ancient or degraded samples); long reads (easier assembly; haplotype resolution); probably much lower cost.
67 Conclusion. Price-cutting => burst of novel applications. Still at the beginning; NGS changes every month: Illumina announces 200 Gbases per run (8 days) with the HiSeq; /Roche will provide Kb reads in 2011. Next-next generation is coming: single-molecule sequencing, longer reads, shorter run times, ... An adequate IT infrastructure is needed: production, storage and analysis.
68 Conclusion Because of next-generation sequencing (NGS), the cost of sequencing a base is dropping faster than the cost of storing a byte. Scales are logarithmic and not corrected for inflation or costs of personnel, overheads and depreciation. Image is reprinted from Stein, L.D. Genome Biol. 11, 207 (2010).
69 Thank you. Laboratoire d'Analyse Bioinformatique des Séquences: François Artiguenave, Jean-Marc Aury, Christophe Battail, Maria Bernard, France Denoeud, Kamel Jabbari, Olivier Jaillon, Hamid Khalili, Amine Madoui, Vincent Meyer, Benjamin Noel, Corinne da Silva, Marc Wessner. Laboratoire de séquençage: Patrick Wincker, Corinne Cruaud, Julie Poulain, Adriana Alberti, Karine Labadie. Laboratoire finishing: Valérie Barbe, Sophie Mangenot. Laboratoire d'informatique: Claude Scarpelli, Veronique Anthouard, Arnaud Couloux, Frederic Gavory, Eric Pelletier, Gaelle Samson.
FlipFlop: Fast Lasso-based Isoform Prediction as a Flow Problem
FlipFlop: Fast Lasso-based Isoform Prediction as a Flow Problem Elsa Bernard Laurent Jacob Julien Mairal Jean-Philippe Vert September 24, 2013 Abstract FlipFlop implements a fast method for de novo transcript
Human Genome Organization: An Update. Genome Organization: An Update
Human Genome Organization: An Update Genome Organization: An Update Highlights of Human Genome Project Timetable Proposed in 1990 as 3 billion dollar joint venture between DOE and NIH with 15 year completion
Illumina Sequencing Technology
Illumina Sequencing Technology Highest data accuracy, simple workflow, and a broad range of applications. Introduction Figure 1: Illumina Flow Cell Illumina sequencing technology leverages clonal array
An example of bioinformatics application on plant breeding projects in Rijk Zwaan
An example of bioinformatics application on plant breeding projects in Rijk Zwaan Xiangyu Rao 17-08-2012 Introduction of RZ Rijk Zwaan is active worldwide as a vegetable breeding company that focuses on
Standards, Guidelines and Best Practices for RNA-Seq V1.0 (June 2011) The ENCODE Consortium
Standards, Guidelines and Best Practices for RNA-Seq V1.0 (June 2011) The ENCODE Consortium I. Introduction: Sequence based assays of transcriptomes (RNA-seq) are in wide use because of their favorable
INTERNATIONAL CONFERENCE ON HARMONISATION OF TECHNICAL REQUIREMENTS FOR REGISTRATION OF PHARMACEUTICALS FOR HUMAN USE Q5B
INTERNATIONAL CONFERENCE ON HARMONISATION OF TECHNICAL REQUIREMENTS FOR REGISTRATION OF PHARMACEUTICALS FOR HUMAN USE ICH HARMONISED TRIPARTITE GUIDELINE QUALITY OF BIOTECHNOLOGICAL PRODUCTS: ANALYSIS
Frequently Asked Questions Next Generation Sequencing
Frequently Asked Questions Next Generation Sequencing Import These Frequently Asked Questions for Next Generation Sequencing are some of the more common questions our customers ask. Questions are divided
Single-Cell DNA Sequencing with the C 1. Single-Cell Auto Prep System. Reveal hidden populations and genetic diversity within complex samples
DATA Sheet Single-Cell DNA Sequencing with the C 1 Single-Cell Auto Prep System Reveal hidden populations and genetic diversity within complex samples Single-cell sensitivity Discover and detect SNPs,
SEQUENCING. From Sample to Sequence-Ready
SEQUENCING From Sample to Sequence-Ready ACCESS ARRAY SYSTEM HIGH-QUALITY LIBRARIES, NOT ONCE, BUT EVERY TIME The highest-quality amplicons more sensitive, accurate, and specific Full support for all major
Introduction au BIM. ESEB 38170 Seyssinet-Pariset Economie de la construction email : [email protected]
Quel est l objectif? 1 La France n est pas le seul pays impliqué 2 Une démarche obligatoire 3 Une organisation plus efficace 4 Le contexte 5 Risque d erreur INTERVENANTS : - Architecte - Économiste - Contrôleur
Appendix 2 Molecular Biology Core Curriculum. Websites and Other Resources
Appendix 2 Molecular Biology Core Curriculum Websites and Other Resources Chapter 1 - The Molecular Basis of Cancer 1. Inside Cancer http://www.insidecancer.org/ From the Dolan DNA Learning Center Cold
NEXT GENERATION SEQUENCING
NEXT GENERATION SEQUENCING Dr. R. Piazza SANGER SEQUENCING + DNA NEXT GENERATION SEQUENCING Flowcell NEXT GENERATION SEQUENCING Library di DNA Genomic DNA NEXT GENERATION SEQUENCING NEXT GENERATION SEQUENCING
Welcome to Pacific Biosciences' Introduction to SMRTbell Template Preparation.
Introduction to SMRTbell Template Preparation 100 338 500 01 1. SMRTbell Template Preparation 1.1 Introduction to SMRTbell Template Preparation Welcome to Pacific Biosciences' Introduction to SMRTbell
Algorithms for Next Generation Sequencing Data Analysis
UNIVERSITÀ DEGLI STUDI DI MILANO - BICOCCA FACOLTÀ DI SCIENZE MATEMATICHE, FISICHE E NATURALI DIPARTIMENTO DI INFORMATICA, SISTEMISTICA E COMUNICAZIONE DOTTORATO DI RICERCA IN INFORMATICA - CICLO XXV Ph.D.
Keeping up with DNA technologies
Keeping up with DNA technologies Mihai Pop Department of Computer Science Center for Bioinformatics and Computational Biology University of Maryland, College Park The evolution of DNA sequencing Since
The sequence of bases on the mrna is a code that determines the sequence of amino acids in the polypeptide being synthesized:
Module 3F Protein Synthesis So far in this unit, we have examined: How genes are transmitted from one generation to the next Where genes are located What genes are made of How genes are replicated How
Targeted. sequencing solutions. Accurate, scalable, fast TARGETED
Targeted TARGETED Sequencing sequencing solutions Accurate, scalable, fast Sequencing for every lab, every budget, every application Ion Torrent semiconductor sequencing Ion Torrent technology has pioneered
Genomics Services @ GENterprise
Genomics Services @ GENterprise since 1998 Mainz University spin-off privately financed 6-10 employees since 2006 Genomics Services @ GENterprise Sequencing Service (Sanger/3730, 454) Genome Projects (Bacteria,
Genotyping by sequencing and data analysis. Ross Whetten North Carolina State University
Genotyping by sequencing and data analysis Ross Whetten North Carolina State University Stein (2010) Genome Biology 11:207 More New Technology on the Horizon Genotyping By Sequencing Timeline 2007 Complexity
School of Nursing. Presented by Yvette Conley, PhD
Presented by Yvette Conley, PhD What we will cover during this webcast: Briefly discuss the approaches introduced in the paper: Genome Sequencing Genome Wide Association Studies Epigenomics Gene Expression
Big data in cancer research : DNA sequencing and personalised medicine
Big in cancer research : DNA sequencing and personalised medicine Philippe Hupé Conférence BIGDATA 04/04/2013 1 - Titre de la présentation - nom du département émetteur et/ ou rédacteur - 00/00/2005 Deciphering
An Introduction to Next-Generation Sequencing Technology
n Introduction to Next-eneration Sequencing echnology Part II: n overview of DN Sequencing pplications Diverse pplications Next-generation sequencing (NS) platforms enable a wide variety of applications,
Structure and Function of DNA
Structure and Function of DNA DNA and RNA Structure DNA and RNA are nucleic acids. They consist of chemical units called nucleotides. The nucleotides are joined by a sugar-phosphate backbone. The four
DNA Sequence Analysis
DNA Sequence Analysis Two general kinds of analysis Screen for one of a set of known sequences Determine the sequence even if it is novel Screening for a known sequence usually involves an oligonucleotide
BIOL 3200 Spring 2015 DNA Subway and RNA-Seq Data Analysis
BIOL 3200 Spring 2015 DNA Subway and RNA-Seq Data Analysis By the end of this lab students should be able to: Describe the uses for each line of the DNA subway program (Red/Yellow/Blue/Green) Describe
Single-Cell Whole Genome Sequencing on the C1 System: a Performance Evaluation
PN 100-9879 A1 TECHNICAL NOTE Single-Cell Whole Genome Sequencing on the C1 System: a Performance Evaluation Introduction Cancer is a dynamic evolutionary process of which intratumor genetic and phenotypic
RNAseq / ChipSeq / Methylseq and personalized genomics
RNAseq / ChipSeq / Methylseq and personalized genomics 7711 Lecture Subhajyo) De, PhD Division of Biomedical Informa)cs and Personalized Biomedicine, Department of Medicine University of Colorado School
Bioruptor NGS: Unbiased DNA shearing for Next-Generation Sequencing
STGAAC STGAACT GTGCACT GTGAACT STGAAC STGAACT GTGCACT GTGAACT STGAAC STGAAC GTGCAC GTGAAC Wouter Coppieters Head of the genomics core facility GIGA center, University of Liège Bioruptor NGS: Unbiased DNA
CHALLENGES IN NEXT-GENERATION SEQUENCING
CHALLENGES IN NEXT-GENERATION SEQUENCING BASIC TENETS OF DATA AND HPC Gray s Laws of data engineering 1 : Scientific computing is very dataintensive, with no real limits. The solution is scale-out architecture
PrimePCR Assay Validation Report
Gene Information Gene Name Gene Symbol Organism Gene Summary Gene Aliases RefSeq Accession No. UniGene ID Ensembl Gene ID papillary renal cell carcinoma (translocation-associated) PRCC Human This gene
MUTATION, DNA REPAIR AND CANCER
MUTATION, DNA REPAIR AND CANCER 1 Mutation A heritable change in the genetic material Essential to the continuity of life Source of variation for natural selection New mutations are more likely to be harmful
To be able to describe polypeptide synthesis including transcription and splicing
Thursday 8th March COPY LO: To be able to describe polypeptide synthesis including transcription and splicing Starter Explain the difference between transcription and translation BATS Describe and explain
Sanger Sequencing and Quality Assurance. Zbigniew Rudzki Department of Pathology University of Melbourne
Sanger Sequencing and Quality Assurance Zbigniew Rudzki Department of Pathology University of Melbourne Sanger DNA sequencing The era of DNA sequencing essentially started with the publication of the enzymatic
Techniques in Molecular Biology (to study the function of genes)
Techniques in Molecular Biology (to study the function of genes) Analysis of nucleic acids: Polymerase chain reaction (PCR) Gel electrophoresis Blotting techniques (Northern, Southern) Gene expression
Analysis of ChIP-seq data in Galaxy
Analysis of ChIP-seq data in Galaxy November, 2012 Local copy: https://galaxy.wi.mit.edu/ Joint project between BaRC and IT Main site: http://main.g2.bx.psu.edu/ 1 Font Conventions Bold and blue refers
