GenomeView. June 2 nd, 2010, Espoo, Finland

Similar documents
Next Generation Sequencing: Technology, Mapping, and Analysis

Next Generation Sequencing

RETRIEVING SEQUENCE INFORMATION. Nucleotide sequence databases. Database search. Sequence alignment and comparison

Introduction to transcriptome analysis using High Throughput Sequencing technologies (HTS)

Next generation DNA sequencing technologies. theory & prac-ce

SeqScape Software Version 2.5 Comprehensive Analysis Solution for Resequencing Applications

New generation sequencing: current limits and future perspectives. Giorgio Valle CRIBI - Università di Padova

NGS data analysis. Bernardo J. Clavijo

Shouguo Gao Ph. D Department of Physics and Comprehensive Diabetes Center

Introduction to NGS data analysis

July 7th 2009 DNA sequencing

Bioinformatics Resources at a Glance

Introduction to Bioinformatics 3. DNA editing and contig assembly

Computational Genomics. Next generation sequencing (NGS)

Go where the biology takes you. Genome Analyzer IIx Genome Analyzer IIe

Analysis of gene expression data. Ulf Leser and Philippe Thomas

Keeping up with DNA technologies

A Primer of Genome Science THIRD

Version 5.0 Release Notes

Nazneen Aziz, PhD. Director, Molecular Medicine Transformation Program Office

Final Project Report

HENIPAVIRUS ANTIBODY ESCAPE SEQUENCING REPORT

Recombinant DNA and Biotechnology

NECC History. Karl V. Steiner 2011 Annual NECC Meeting, Orono, Maine March 15, 2011

Single Nucleotide Polymorphisms (SNPs)

Focusing on results not data comprehensive data analysis for targeted next generation sequencing

Next Generation Sequencing for DUMMIES

SMRT Analysis v2.2.0 Overview. 1. SMRT Analysis v SMRT Analysis v2.2.0 Overview. Notes:

Automated DNA sequencing 20/12/2009. Next Generation Sequencing

Genomes and SNPs in Malaria and Sickle Cell Anemia

How Sequencing Experiments Fail

Transcription and Translation of DNA

Introduction to next-generation sequencing data

Lab # 12: DNA and RNA

Tutorial for Windows and Macintosh. Preparing Your Data for NGS Alignment

Core Facility Genomics

Gene Models & Bed format: What they represent.

Systematic discovery of regulatory motifs in human promoters and 30 UTRs by comparison of several mammals

Using Illumina BaseSpace Apps to Analyze RNA Sequencing Data

Discovery and Quantification of RNA with RNASeq Roderic Guigó Serra Centre de Regulació Genòmica (CRG)

Data Analysis & Management of High-throughput Sequencing Data. Quoclinh Nguyen Research Informatics Genomics Core / Medical Research Institute

When you install Mascot, it includes a copy of the Swiss-Prot protein database. However, it is almost certain that you and your colleagues will want

Microarray Technology

Searching Nucleotide Databases

Human-Mouse Synteny in Functional Genomics Experiment

Frequently Asked Questions Next Generation Sequencing

Data search and visualization tools at the Comparative Evolutionary Genomics of Cotton Web resource

Delivering the power of the world s most successful genomics platform

How many of you have checked out the web site on protein-dna interactions?

Next Generation Sequencing; Technologies, applications and data analysis

Human Genome Organization: An Update. Genome Organization: An Update

Advances in RainDance Sequence Enrichment Technology and Applications in Cancer Research. March 17, 2011 Rendez-Vous Séquençage

Removing Sequential Bottlenecks in Analysis of Next-Generation Sequencing Data

Biotechnology and Recombinant DNA (Chapter 9) Lecture Materials for Amy Warenda Czura, Ph.D. Suffolk County Community College

Analysis of ChIP-seq data in Galaxy

LifeScope Genomic Analysis Software 2.5

Next Generation Sequencing; Technologies, applications and data analysis

Data Analysis for Ion Torrent Sequencing

Overview sequence projects

AS Replaces Page 1 of 50 ATF. Software for. DNA Sequencing. Operators Manual. Assign-ATF is intended for Research Use Only (RUO):

All your base(s) are belong to us

BIOL 3200 Spring 2015 DNA Subway and RNA-Seq Data Analysis

Organization and analysis of NGS variations. Alireza Hadj Khodabakhshi Research Investigator

Genotyping by sequencing and data analysis. Ross Whetten North Carolina State University

Data Processing of Nextera Mate Pair Reads on Illumina Sequencing Platforms

Next Generation Sequencing; Technologies, applications and data analysis

TECHNOLOGIES, PRODUCTS & SERVICES for MOLECULAR DIAGNOSTICS, MDx ABA 298

UCLA Team Sequences Cell Line, Puts Open Source Software Framework into Production

NGS Technologies for Genomics and Transcriptomics

Geospiza s Finch-Server: A Complete Data Management System for DNA Sequencing

Just the Facts: A Basic Introduction to the Science Underlying NCBI Resources

SICKLE CELL ANEMIA & THE HEMOGLOBIN GENE TEACHER S GUIDE

Managing and Conducting Biomedical Research on the Cloud Prasad Patil

Software Getting Started Guide

Algorithms in Computational Biology (236522) spring 2007 Lecture #1

Intro to Bioinformatics

Module 1. Sequence Formats and Retrieval. Charles Steward

Expression Quantification (I)

RNAseq / ChipSeq / Methylseq and personalized genomics

An FPGA Acceleration of Short Read Human Genome Mapping

escience and Post-Genome Biomedical Research

Next generation sequencing (NGS)

Biotechnology: DNA Technology & Genomics

Typing in the NGS era: The way forward!

Genetomic Promototypes

An example of bioinformatics application on plant breeding projects in Rijk Zwaan

Sequence Formats and Sequence Database Searches. Gloria Rendon SC11 Education June, 2011

Targeted. sequencing solutions. Accurate, scalable, fast TARGETED

Lecture 13: DNA Technology. DNA Sequencing. DNA Sequencing Genetic Markers - RFLPs polymerase chain reaction (PCR) products of biotechnology

Molecular Genetics: Challenges for Statistical Practice. J.K. Lindsey

PrimePCR Assay Validation Report

What is a contig? What are the contig assembly programs?

Disease gene identification with exome sequencing

Overview of Next Generation Sequencing platform technologies

Single-Cell Whole Genome Sequencing on the C1 System: a Performance Evaluation

Interaktionen von RNAs und Proteinen

G E N OM I C S S E RV I C ES

New Technologies for Sensitive, Low-Input RNA-Seq. Clontech Laboratories, Inc.

Lectures 1 and February 7, Genomics 2012: Repetitorium. Peter N Robinson. VL1: Next- Generation Sequencing. VL8 9: Variant Calling

REAL TIME PCR USING SYBR GREEN

Transcription:

Visualizing NGS data with GenomeView June 2 nd, 2010, Espoo, Finland Thomas Abeel thomas@abeel.be VIB - Ghent University, Gent, Belgium Broad Institute of MIT and Harvard, Cambridge, MA, USA

Overview Introduction to NGS What is NGS? Platforms? Applications of NGS: Assembly Mapping: RNA-seq, chip-seq, resequencing, etc. Visualizing NGS and other data Description of GenomeView 2

NGS INTRODUCTION 3

Next-gen sequencing Next-generations sequencing machine ABI Solid, Illumina(Solexa), and Roche 454 Helicos Biosciences, Pacific Biosciences Pour DNA in machine and you get reads Read = fragment of sequenced DNA Gen. 1st next next next Platform Sanger Roche 454 Illumina SOLiD Data/run 100kb 500 Mb 8 Gb 16Gb Read len. 1000 nt 500 nt 100 nt 50 nt $/genome 1,000k 25k 25k 25k 4

1 st, next, 3 rd? Discussion what s next -generation? Generation Platforms Status 1 Sanger Production 2 (next) 454 Production 2.5 (next) Illumina, Solid Production 3 Helicos, Pacbio Firstcommercial models available 3.x (not available) Nanopore,. Prototype For ease of discussion, NGS will be used for anything not Sanger seq. 5

What to do with short reads Assembly At least 40x coverage Result is large number of contigs Map to a reference genome Re-sequencing SNP discovery ChIP-seq RNA-seq 6

Re-sequencing Sequence the genome of an organism for which you already have a genomic sequence Genomic diversity: SNPs, indels, but also structural variation 7

RNA-seq Whole genome transcriptomesequencing Extract mrna by picking up the poly-a tail Sequence cdna 8

RNA-seqapplications Gene expression If a read maps to a gene, that gene is expressed Expression measure If 10 reads map to the same place in a gene, it s 10 fold expressed Condition/cell type specific expression Microarray replacement Annotation 9

ChIP-seq ChIPproduces a library of target DNA to which a protein of interest binds in-vivo -seqsequences the library Less bias than ChIP-on-chip because not limited to probes Application: Find transcription factor binding sites 10

Short read mapping Each of the proposed applications boil down to aligning the reads to a reference sequence Characteristics: Many short queries (reads) Single large reference (reference) Mismatches allowed 11

Alignment challenges Efficiency Need to align several billion reads (300 Gb) To a genome of several billion nucleotides (3 Gb) Preferably over the weekend Ambiguity Sequencing errors Genetic variation Alignment with mismatches 12

Assigning reads 36 nt with a couple of mismatches reads may align to multiple locations in the genome Unique: where it maps Non-unique: all locations random nowhere 13

Short read aligners Slide adapted from Heng Li 14

Typical NGS analysis pipeline Extracted DNA (or somebody else did it) Put it in machine, got a lot of data in return We have aligned it to a reference sequence We now have an even bigger file with even more information. 15

Analyses SNP calling Peak detection Gene prediction Transcript identification/quantification Visualization 16

Reasons for visualization Taking a look at the data Sanity check on the data Hypothesis generation Provide insights in large-scale data sets Augment ability to reason about complex data Make it easier to develop algorithms The appropriate image makes the solution obvious 17

VISUALIZING NGS DATA 18

GenomeView Interactive genome browser/editor Annotations and mapped experimental data Short read alignments (*-seq) Multiple alignments 19

GUI Overview 20

Basic features 21

Short read mappings Green = forward mapped read Blue= reverse mapped read Yellow= mismatch Black= read insertion Red= read deletion Purple = connector for paired-end reads and spliced reads 22

Quality, indelsand mismatches 23

SNP track 24

Re-sequencing demo

SNPs http://www.youtube.com/watch?v=kpgarxgbdam 26

RNA-seqdemo

RNA-seq http://www.youtube.com/watch?v=pgbu1gwckwu 28

Chip-seq 29

Other features Plug-ins, editor, integration, multiple alignments

Plug-ins Computational analyses can be plugged in Gene prediction Core promoter prediction (EP3, Prosom) Splice site prediction (SpliceMachine) Translation start site prediction (StartScan) Coding potential prediction Physical properties of DNA Blast... 31

Editor 32

Browser integration SNP in short read alignment Select polymorphism View polymorphism in GV Go to polymorphism detail page Select strain with alternative allele June 2, 2010 33

Editor integration 34

Saving back to the server 35

Synteny 36

Multiple alignments 37

Next-gen multiple alignments 38

Things to come Indexed sequence data structure Indexed feature data structure Faster and more interactive 39

Summary NGS data sets are large and daunting Visualization makes them less daunting Visualization is essential GenomeViewcan visualize a broad set of NGS data sets (and other types of data) GenomeViewis freely available @ http://genomeview.org 40

Acknowledgements Yves Van de Peer Yvan Saeys James Galagan Thomas Van Parys http://genomeview.org 41