"First generation" sequencing technologies and genome assembly. Roger Bumgarner, Associate Professor, Microbiology, UW. Rogerb@u.washington.edu





Why discuss a technology that appears to be on its way out? Next-generation technologies are great for obtaining large numbers of sequences (thousands to billions) but are not necessarily applicable to smaller projects. Most clinical sequencing is done using Sanger sequencing methods.

Overview: How to sequence any DNA. How to sequence a lot of DNA. What have we learned from 20 years of the genome project? What's next?

Intended outcomes: An understanding of the process of DNA sequencing; the types, sources and rates of errors in DNA sequence data; a historical perspective of genome sequencing; an understanding of the methods used to sequence and assemble genomes.

Automated DNA Sequencing

Goal: To read the sequence of the base pairs in a region of DNA

DNA Structure

DNA Sequencing: Process Overview. Generation of a nested set of fragments. Separation of the fragments. Detection. Analysis or base calling.
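The four steps in the process overview can be sketched as a toy simulation. This is only an illustration under stated assumptions: the function names, the number of copies, and the 5% termination chance per base are all hypothetical, and real chemistry, separation, and detection are far noisier.

```python
import random

def sanger_fragments(new_strand, copies=2000, dd_fraction=0.05, seed=0):
    """Simulate chain-terminated extension products.

    new_strand: sequence of the strand being synthesized, 5' to 3'.
    Each copy is extended base by base; at every step there is a small
    chance (dd_fraction, an illustrative value) that a dideoxy terminator
    is incorporated, which ends that copy.
    Returns a set of (fragment_length, terminal_base) pairs.
    """
    rng = random.Random(seed)
    fragments = set()
    for _ in range(copies):
        for length, base in enumerate(new_strand, start=1):
            if rng.random() < dd_fraction:
                fragments.add((length, base))
                break  # no 3' OH, so this copy cannot be extended further
    return fragments

def call_bases(fragments):
    """Separate fragments by length (shortest first) and read the label
    on the terminal base of each one."""
    return "".join(base for _, base in sorted(fragments))

strand = "GATTACAGGCTTACGATCCA"  # made-up example sequence
print(call_bases(sanger_fragments(strand)))
```

With enough copies every fragment length is represented, so reading the terminal labels in order of length reproduces the synthesized strand.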

Maxam-Gilbert Sequencing. Maxam AM, Gilbert W (February 1977). "A new method for sequencing DNA". Proc. Natl. Acad. Sci. U.S.A. 74 (2): 560-4.

DNA Replication [figure labels: single-stranded DNA binding proteins, helicase, primosome, primase, DNA polymerase III active sites, RNA primer, DNA polymerase I, ligase; strand ends marked 5' and 3']

The 3' hydroxyl group is the point of attachment of the next base. What happens if the 3' OH is not there? Without it, no further base can be added and the chain terminates.

Sanger Sequencing. Sanger F, Coulson AR (May 1975). "A rapid method for determining sequences in DNA by primed synthesis with DNA polymerase". J. Mol. Biol. 94 (3): 441-8.

An Autorad of a Sequencing Gel [figure; lanes A, C, G, T]

With 4 colors, all reactions can be run in one lane. Label each with a different color. Mix all reactions prior to loading. [figure: gel lanes] Automated DNA sequencing and analysis of the human genome. Genomics. 1987 Nov;1(3):201-12. Hood LE, Hunkapiller MW, Smith LM.

The Principle of 4-color Fluorescent DNA Sequencing

The First 4-Color Sequencing Instrument. Fluorescence detection in automated DNA sequence analysis. Smith LM, Sanders JZ, Kaiser RJ, Hughes P, Dodd C, Connell CR, Heiner C, Kent SB, Hood LE. Nature. 321(6071):674-9 (1986).

The Perkin Elmer/ABI 370/373 Fluorescence-Based DNA Sequencer. By about 1987 Applied Biosystems had developed a slab gel system capable of sequencing 16 samples to about 250 bp in a 24-hour run.

Sequencing Gel Image

Automated DNA Sequencing [figure: nested fragments of increasing length separated between the - and + electrodes]

Different labeling chemistries can be used. Dye primer: dye is attached to the 5' end of the sequencing primer. Dye terminator: dye is attached to the ddNTP, which allows all 4 reactions to be run in the same tube. Internal labeling: dye is attached to a dNTP, so signal per molecule increases with length (rarely used today).

Processed Electropherogram

Errors and error rate: Average error rate <1%. Highest error rates are at the beginning of the run (due to misalignment of the peaks and noise from unpurified fluorescent material) and at the end of the run (loss in gel resolution often results in indel errors). Also have errors due to compression and mixed samples (heterozygosity, repeats, PCR, etc.).

Higher voltages produce faster rates of electrophoresis. Speed is proportional to voltage (V). Current (I) depends on the resistance of the gel: I = V/R. Power dissipated as heat, in watts, is W = V*I. Thinner gels give higher R. Hence, thin or otherwise small gels must be used for higher voltages.
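A short worked example of the relations I = V/R and W = V*I may help show why higher resistance matters. The resistances and voltages below are hypothetical round numbers chosen only to show the trend; they are not measured values for any instrument.

```python
def gel_current_and_power(voltage_v, resistance_ohm):
    """Return (current in amperes, power in watts) for a gel treated as a
    simple resistor: I = V/R and W = V*I."""
    current_a = voltage_v / resistance_ohm
    power_w = voltage_v * current_a
    return current_a, power_w

# Hypothetical resistances, chosen only to show the trend.
for label, volts, ohms in [
    ("thick slab gel", 1000, 20_000),
    ("thick slab gel", 3000, 20_000),
    ("thin gel or capillary", 3000, 100_000),
]:
    amps, watts = gel_current_and_power(volts, ohms)
    print(f"{label}: {volts} V -> {amps * 1000:.0f} mA, {watts:.0f} W")
```

Because W = V*I = V^2/R, tripling the voltage on the same gel gives nine times the heat, while the higher-resistance gel at the same voltage dissipates much less.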

Capillaries automate loading of samples [figure labels: - electrode, + electrode, sample, buffer]

The current range of capillary sequencers

Typical Specifications: Read length 400-900 bp. Run times of 36 min to 3-4 hours. Total throughput per machine: ABI 3730, 96 capillaries: 2100 kbp/day (running 24 hours/day); ABI 310, 1 capillary: 5200 bp/day.

Large Scale Sequencing

The (Human) Genome Project. The ultimate goal of the Human Genome Project is to decode, letter by letter, the exact sequence of all 3 billion nucleotide bases that make up the human genome. Just a single misplaced letter is sufficient to cause disease. GCTTCTGGTCTGTGCTTCGT 3,400,000,000 letters total

The (Human) Genome Project. Begun in 1990 with a 15-year budget of $3.0B overall. Goals: To obtain the sequences of human and model organisms: E. coli, Drosophila (fruit fly), C. elegans (a worm), yeast, mouse. To develop the necessary technologies to obtain the above.

How do we begin to analyze a genome? We want DNA sequence for the entire genome (3.5 Bbp for human, 4 Mbp for a bacterium). Sequencing allows one to read about 750 base pairs/sample. We need a method to sequence bigger pieces.
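The scale of the problem can be made concrete with a little arithmetic. The sketch below uses the genome sizes and the 750 bp read length from this slide and the 2100 kbp/day figure quoted for a 96-capillary instrument; the function names and the 8-fold redundancy are illustrative assumptions.

```python
import math

def reads_needed(genome_bp, read_bp=750, coverage=1.0):
    """Minimum number of reads to generate `coverage` times the genome length."""
    return math.ceil(coverage * genome_bp / read_bp)

def machine_days(genome_bp, coverage=1.0, bp_per_day=2_100_000):
    """Days of continuous running on one instrument producing bp_per_day."""
    return coverage * genome_bp / bp_per_day

HUMAN_BP = 3_500_000_000
BACTERIUM_BP = 4_000_000

for name, size in [("human", HUMAN_BP), ("bacterium", BACTERIUM_BP)]:
    for cov in (1, 8):  # 8-fold redundancy is an illustrative choice
        print(name, f"{cov}x:", reads_needed(size, coverage=cov), "reads,",
              round(machine_days(size, coverage=cov), 1), "machine-days")
```

Even a single pass over the human genome needs millions of reads, which is why the rest of the talk turns to strategies for organizing them.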

Primer Walking [figure: vector containing the clone to sequence]. Prime and sequence. Design a new primer from the end of that sequence and sequence again. Repeat.
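Primer walking can be sketched as a simple loop. This is a simplified model, not a lab protocol: the function names are hypothetical, reads are assumed error-free and exactly 750 bases long, and each new read is assumed to start at the 20-base primer designed from the end of the previous read.

```python
def primer_walk(clone, read_len=750, primer_len=20):
    """Walk along a clone; return a list of (start_position, read) steps."""
    steps = []
    start = 0
    while True:
        read = clone[start:start + read_len]
        steps.append((start, read))
        if start + read_len >= len(clone):
            break  # this read reached the end of the clone
        # Design the next primer from the last primer_len bases just read;
        # the next read begins where that primer anneals.
        start = start + read_len - primer_len
    return steps

def join_reads(steps, primer_len=20):
    """Rebuild the clone: keep the first read whole, then append each later
    read minus the primer-length overlap with its predecessor."""
    sequence = steps[0][1]
    for _, read in steps[1:]:
        sequence += read[primer_len:]
    return sequence

clone = "ACGT" * 500  # 2000 bp placeholder sequence
steps = primer_walk(clone)
print(len(steps), "reads, reconstructed correctly:", join_reads(steps) == clone)
```

Each step depends on the previous read, which is why walking has little redundancy but is hard to parallelize or automate.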

Shotgun sequencing: Copy the clone to sequence. Subclone. Sequence and assemble. ...gtctcctgtctgtctgc... CCTGTCTGTCTGCTT... GTCTGTCTGCTTCG...
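The "assemble" step can be illustrated with a minimal greedy overlap merge. This is a toy sketch, not the algorithm of any production assembler: the function names and the minimum overlap of 5 bases are assumptions, and the reads are the three fragments as they appear in the slide text (uppercased), assumed error-free.

```python
def overlap(a, b, min_len=5):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=5):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(reads)
    while len(contigs) > 1:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # nothing overlaps any more; leave separate contigs
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]
    return contigs

reads = ["GTCTCCTGTCTGTCTGC", "CCTGTCTGTCTGCTT", "GTCTGTCTGCTTCG"]
print(greedy_assemble(reads))  # ['GTCTCCTGTCTGTCTGCTTCG']
```

Greedy merging works on this small example but can be misled by repeated sequence, one reason repeats are a recurring concern for shotgun approaches.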

Shotgun vs. walking
Method | Advantage | Disadvantage
Shotgun | Easy to automate | Highly redundant
Walking | Not very redundant | Harder to automate

Methods for very large scale sequencing. Hierarchical approach: map on a large scale (physical mapping), then sequence specific clones whose position in the genome is known. Shotgun sequencing: tear up the genome and sequence random fragments until it is done. Sequence tagged connectors (STC): sequence the ends of many clones and use this information to pick overlapping clones.
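For the shotgun approach, "until it is done" can be given a rough quantitative meaning. The sketch below uses an idealized model commonly attributed to Lander and Waterman, which is not part of the slides: if reads land uniformly at random, the expected fraction of the genome left uncovered at c-fold redundancy is about e^(-c). Real libraries are not uniformly random, so treat these numbers as optimistic estimates; the function names are hypothetical.

```python
import math

def expected_fraction_covered(coverage):
    """Expected fraction of bases hit by at least one read at the given
    fold-coverage, assuming reads start uniformly at random."""
    return 1.0 - math.exp(-coverage)

def expected_uncovered_bp(genome_bp, coverage):
    """Expected number of bases not covered by any read."""
    return genome_bp * math.exp(-coverage)

for c in (1, 2, 4, 8):
    print(f"{c}x: {expected_fraction_covered(c):.4f} covered, "
          f"{expected_uncovered_bp(4_000_000, c):.0f} bp missing from a 4 Mbp genome")
```

Under this model each extra fold of coverage shrinks the gaps by a constant factor, so finishing purely by random sampling requires substantial redundancy.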

Making a genomic library: Cells. Isolate DNA. Fragment DNA. Clone. Library.

Library Types. Chromosome-specific libraries: chromosomes can be sorted from one another based on size and GC content. Genomic libraries: made from the entire genome. Large insert/small insert: a combination of vector choice (YAC, BAC, plasmid, M13), fragmentation method (enzymatic, shearing, sonication), and size selection (by gel or other method).

Another view of a library: Multiple copies of the genome (stretched out). Randomly fragment and clone. Can we order these fragments relative to one another?

Restriction Enzymes - 1970 [figure; copyright 1998 Access Excellence, www.gene.com]

Physical Mapping: Digest and look for common features in clones
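One way to "look for common features" is to compare the restriction fragment sizes of two clones, since clones that overlap should share fragments. The sketch below is a hypothetical illustration: the function name, the 2% size tolerance, and the fragment sizes are all made up, and real fingerprinting methods use statistical scoring rather than a simple count.

```python
def shared_fragments(sizes_a, sizes_b, tolerance=0.02):
    """Count fragment sizes (bp) in clone A that match a not-yet-used size
    in clone B to within a relative tolerance (gel sizing is inexact)."""
    remaining = sorted(sizes_b)
    shared = 0
    for size in sorted(sizes_a):
        for k, other in enumerate(remaining):
            if abs(size - other) <= tolerance * max(size, other):
                shared += 1
                del remaining[k]  # each band in B can be matched only once
                break
    return shared

clone_a = [1200, 3400, 5100, 800]   # made-up digest fragment sizes
clone_b = [3410, 5080, 2200, 950]
clone_c = [700, 1500]

print(shared_fragments(clone_a, clone_b))  # 2 -> candidates for overlap
print(shared_fragments(clone_a, clone_c))  # 0 -> no evidence of overlap
```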

Repeat many times to construct a physical map. Pick a minimal tiling path. Sequence these mapped clones (typically by the shotgun method).
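Picking a minimal tiling path from a finished map resembles the classic interval-cover problem. The sketch below is a simplified greedy version with hypothetical clone names and map coordinates; it assumes clone positions are known exactly and ignores the minimum overlap one would want between neighbors in practice.

```python
def minimal_tiling_path(clones, region_start, region_end):
    """clones maps name -> (start, end) positions on the physical map.
    Greedy rule: among clones that cover the current position, take the
    one that reaches farthest, then continue from its end."""
    path = []
    covered = region_start
    while covered < region_end:
        candidates = [(end, name) for name, (start, end) in clones.items()
                      if start <= covered < end]
        if not candidates:
            raise ValueError(f"gap in the map at position {covered}")
        end, name = max(candidates)
        path.append(name)
        covered = end
    return path

clones = {  # hypothetical map positions in kbp
    "A": (0, 150), "B": (50, 200), "C": (100, 320),
    "D": (180, 330), "E": (300, 500),
}
print(minimal_tiling_path(clones, 0, 500))  # ['A', 'C', 'E']
```

Only the clones on the path then need to be shotgun sequenced, which is the point of building the map first.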

Path that was initially used for genome sequencing: YACs, map (Mbp). BACs or cosmids, map (200 kbp). M13 or plasmid, sequence (kbp).

Shotgun the genome: Genome to sequence. Subclone. Sequence and assemble. ...gtctcctgtctgtctgc... CCTGTCTGTCTGCTT... GTCTGTCTGCTTCG...

Sequence tagged connectors (STC): Genome to sequence. Subclone. Sequence the ends and store in a database. Sequence a clone, look for overlaps in the database.
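The STC idea of storing clone ends and then searching a finished clone for them can be sketched with a dictionary lookup. Everything here is an illustrative assumption: the function names, the 12-base end length, exact matching, and the random toy genome; real end sequences are hundreds of bases long and are matched with alignment tools that tolerate errors.

```python
import random

def build_end_db(clones, end_len=12):
    """Index each clone by the sequence at its left and right ends."""
    db = {}
    for name, seq in clones.items():
        db.setdefault(seq[:end_len], []).append((name, "left"))
        db.setdefault(seq[-end_len:], []).append((name, "right"))
    return db

def find_overlapping_clones(sequenced, db, end_len=12):
    """Scan a fully sequenced clone for stored end sequences.
    Returns (clone_name, which_end, position) for every exact hit."""
    hits = []
    for pos in range(len(sequenced) - end_len + 1):
        for name, side in db.get(sequenced[pos:pos + end_len], []):
            hits.append((name, side, pos))
    return hits

rng = random.Random(1)
genome = "".join(rng.choice("ACGT") for _ in range(400))  # toy genome
clones = {"c1": genome[0:200], "c2": genome[150:350], "c3": genome[300:400]}

db = build_end_db(clones)
for hit in find_overlapping_clones(clones["c1"], db):
    print(hit)
```

A clone whose left end falls inside the sequenced clone but whose right end does not (here c2 relative to c1) extends the sequence and is a natural choice to sequence next.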

Which method? Whole genome shotgun: Method of choice today for small genomes and genomes with good reference sequences (good implies conserved genomic structure across the species). Celera's approach to the human genome, but what about repeats? Physical mapping: Traditional method, rarely used today. STC: Hybrid method, not as difficult as physical mapping, can resolve some issues with repeats. Common method today in genomic sequencing.

Who can do sequencing for me? Visit www.iths.org/resources and search for the word "sequencing". University of Washington: CEEH Facility Core #1: Functional Genomics & Proteomics; CFRTC - Genomics Core; DERC - Virus, Molecular Biology and Cell Core; DNA Sequencing and Gene Analysis Center; High Throughput Genomics Unit. Fred Hutchinson Cancer Research Center: Genomics Resource at the Fred Hutchinson Cancer Research Center. University of Idaho: IBEST DNA Sequence Analysis Core Facility. Institute for Systems Biology: DNA Sequencing Core. Idaho State University: SU Molecular Research Core Facility. University of Montana: Murdock Molecular Biology Facility. Seattle Biomed: Sequencing Core at Seattle Biomed. University of Wyoming: The Nucleic Acid Exploration Facility.