DNA sequencing. Matt Hudson

Similar documents
- In , Allan Maxam and walter Gilbert devised the first method for sequencing DNA fragments containing up to ~ 500 nucleotides.

1/12 Dideoxy DNA Sequencing

The Biotechnology Education Company

July 7th 2009 DNA sequencing

Electrophoresis, cleaning up on spin-columns, labeling of PCR products and preparation extended products for sequencing

An Overview of DNA Sequencing

Sequencing the Human Genome

Sanger Sequencing and Quality Assurance. Zbigniew Rudzki Department of Pathology University of Melbourne

LESSON 9. Analyzing DNA Sequences and DNA Barcoding. Introduction. Learning Objectives

The Techniques of Molecular Biology: Forensic DNA Fingerprinting

Introduction to next-generation sequencing data

Lecture 13: DNA Technology. DNA Sequencing. DNA Sequencing Genetic Markers - RFLPs polymerase chain reaction (PCR) products of biotechnology

DNA sequencing is the process of determining the precise order of the nucleotide bases in a particular DNA molecule. In 1974, two methods of DNA

Biotechnology Explorer. Planning Guide

Genome and DNA Sequence Databases. BME 110/BIOL 181 CompBio Tools Todd Lowe March 31, 2009

DNA sequencing. Dideoxy-terminating sequencing or Sanger dideoxy sequencing

Concepts and methods in sequencing and genome assembly

STRUCTURES OF NUCLEIC ACIDS

DNA Sequencing Overview

1. Molecular computation uses molecules to represent information and molecular processes to implement information processing.

Nucleic Acids Research

Automated DNA sequencing 20/12/2009. Next Generation Sequencing

Next Generation Sequencing

Amazing DNA facts. Hands-on DNA: A Question of Taste Amazing facts and quiz questions

DNA Sequencing. Contents. Introduction. Maxam-Gilbert

How many of you have checked out the web site on protein-dna interactions?

First generation" sequencing technologies and genome assembly. Roger Bumgarner Associate Professor, Microbiology, UW

2. The number of different kinds of nucleotides present in any DNA molecule is A) four B) six C) two D) three

CCR Biology - Chapter 9 Practice Test - Summer 2012

Next Generation Sequencing

Polyacrylamide gels have a pore size of only a few nm

Biotechnology: DNA Technology & Genomics

DNA Sequencing & The Human Genome Project

How is genome sequencing done?

Automated DNA Sequencing. Chemistry Guide

DNA SEQUENCING (using an ABI automated sequencer)

What is a contig? What are the contig assembly programs?

Forensic DNA Testing Terminology

HiPer RT-PCR Teaching Kit

DNA Detection. Chapter 13

Biotechnology and Recombinant DNA (Chapter 9) Lecture Materials for Amy Warenda Czura, Ph.D. Suffolk County Community College

4. DNA replication Pages: Difficulty: 2 Ans: C Which one of the following statements about enzymes that interact with DNA is true?

DNA Sequencing Troubleshooting Guide.

DNA Fingerprinting. Unless they are identical twins, individuals have unique DNA

Bioinformatics I, WS 09-10, D. Huson, January 27,

The Central Dogma of Molecular Biology

Description: Molecular Biology Services and DNA Sequencing

DNA Replication in Prokaryotes

Sequencing Guidelines Adapted from ABI BigDye Terminator v3.1 Cycle Sequencing Kit and Roswell Park Cancer Institute Core Laboratory website

CUSTOM DNA SEQUENCING SERVICES

Mitochondrial DNA Analysis

Recombinant DNA & Genetic Engineering. Tools for Genetic Manipulation

New generation sequencing: current limits and future perspectives. Giorgio Valle CRIBI - Università di Padova

Introduction. Preparation of Template DNA

Bio 102 Practice Problems Chromosomes and DNA Replication

Procedures For DNA Sequencing

Lab 5: DNA Fingerprinting

Cloning GFP into Mammalian cells

Genetic Analysis. Phenotype analysis: biological-biochemical analysis. Genotype analysis: molecular and physical analysis

DNA Sequence Analysis

Next generation DNA sequencing technologies. theory & prac-ce

IDTutorial: DNA Sequencing

Universidade Estadual de Maringá

IMBB Genomic DNA purifica8on

Institutional Partnership Program

Genetics Module B, Anchor 3

Name Class Date. Figure Which nucleotide in Figure 13 1 indicates the nucleic acid above is RNA? a. uracil c. cytosine b. guanine d.

IIID 14. Biotechnology in Fish Disease Diagnostics: Application of the Polymerase Chain Reaction (PCR)

Reading DNA Sequences:

DNA Sequencing Handbook

Answer: 2. Uracil. Answer: 2. hydrogen bonds. Adenine, Cytosine and Guanine are found in both RNA and DNA.

Illumina Sequencing Technology

Name Date Period. 2. When a molecule of double-stranded DNA undergoes replication, it results in

Next Generation Sequencing for DUMMIES

ABI PRISM BigDye Primer Cycle Sequencing Ready Reaction Kit With AmpliTaq DNA Polymerase, FS

Bioinformatic Approaches for Genome Finishing

Advances in RainDance Sequence Enrichment Technology and Applications in Cancer Research. March 17, 2011 Rendez-Vous Séquençage

14.3 Studying the Human Genome

A Brief Guide to Interpreting the DNA Sequencing Electropherogram Version 3.0

Section 16.1 Producing DNA fragments

Gene Mapping Techniques

Just the Facts: A Basic Introduction to the Science Underlying NCBI Resources

Single Nucleotide Polymorphisms (SNPs)

DNA Core Facility: DNA Sequencing Guide

Genome Viewing. Module 2. Using Genome Browsers to View Annotation of the Human Genome

PCR Instruments and Consumables

SEQUENCING SERVICES. McGill University and Génome Québec Innovation Centre JANUARY 21, Version 3.4

RETRIEVING SEQUENCE INFORMATION. Nucleotide sequence databases. Database search. Sequence alignment and comparison

Handling next generation sequence data

DNA Separation Methods. Chapter 12

Appendix 2 Molecular Biology Core Curriculum. Websites and Other Resources

C A. How many high-energy phosphate bonds would be consumed during the replication of a 10-nucleotide DNA sequence (synthesis of a single-strand)?

BacReady TM Multiplex PCR System

GENOTYPING ASSAYS AT ZIRC

Troubleshooting the Single-step PCR Site-directed Mutagenesis Procedure Intended to Create a Non-functional rop Gene in the pbr322 Plasmid

HCS Exercise 1 Dr. Jones Spring Recombinant DNA (Molecular Cloning) exercise:

Sequenase Version 2.0 PCR Product Sequencing Kit

Dye structure affects Taq DNA polymerase terminator selectivity

Sanger Sequencing. Troubleshooting Guide. Failed sequence

Transcription:

DNA sequencing Matt Hudson

DNA Sequencing Dideoxy sequencing was developed by Fred Sanger at Cambridge in the 1970s. Often called Sanger sequencing. Nobel prize number 2 for Fred Sanger in 1980, shared with Walter Gilbert from Harvard (inventor of the now little-used Maxam-Gilbert sequencing method).

Sanger s Dideoxy DNA sequencing method -How it works: 1. DNA template is denatured to single strands. 2. DNA primer (with 3 end near sequence of interest) is annealed to the template DNA and extended with DNA polymerase. 3. Four reactions are set up, each containing: 1. DNA template eg a plasmid 2. Primer 3. DNA polymerase 4. dntps (datp, dttp, dctp, and dgtp) 4. Next, a different radio-labeled dideoxynucleotide (ddatp, ddttp, ddctp, or ddgtp) is added to each of the four reaction tubes at 1/100th the concentration of normal dntps

Terminators stop further elongation of a DNA deoxyribose-phosphate backbone ddntps are terminators: they possess a 3 -H instead of 3 -OH, compete in the reaction with normal dntps, and produce no phosphodiester bond. Whenever the radio-labeled ddntps are incorporated in the chain, DNA synthesis terminates.

hasta la vista

Manual Dideoxy DNA sequencing-how it works (cont.): 5. Each of the four reaction mixtures produces a population of DNA molecules with DNA chains terminating at each terminator base.. 6. Extension products in each of the four reaction mixutes also end with a different radio-labeled ddntp (depending on the base). 7. Next, each reaction mixture is electrophoresed in a separate lane (4 lanes) at high voltage on a polyacrylamide gel. 8. Pattern of bands in each of the four lanes is visualized on X-ray film. 9. Location of bands in each of the four lanes indicate the size of the fragment terminating with a respective radio-labeled ddntp. 10. DNA sequence is deduced from the pattern of bands in the 4 lanes.

Vigilant et al. 1989 PNAS 86:9350-9354

Radio-labeled ddntps (4 rxns) Sequence (5 to 3 ) G G A T A T A A C C C C T G T Short products Long products

Manual vs automatic sequencing Manual sequencing has basically died out. It needs four lanes, radioactive gels, and a technician in one day from one gel can get four sets of four lanes, with maybe 300 base pairs of data from each template. Everyone now uses automatic sequencing the downside is no one lab can afford the machine, so it is done in a central facility (eg. Keck center). Most automated DNA sequencers can load robotically and operate around the clock for weeks with minimal labor.

Dye deoxy terminators One tube. One gel lane or capilliary

DNA sequence output from ABI 377 (a gel-based sequencer) 1. Trace files (dye signals) are analyzed and bases called to create chromatograms. 2. Chromatograms from opposite strands are reconciled with software to create doublestranded sequence data.

Robotic 96 capillary machine: ABI 3730 xl

Capillary sequencing no gels

Databases DNA sequence and chromatogram data All is deposited in public repositories The analysis of DNA sequences is the basis of the field of bioinformatics.

What is bioinformatics? Interface of biology and computers Analysis of proteins, genes and genomes using computer algorithms and computer databases Genomics is the analysis of genomes. The tools of bioinformatics are used to make sense of the billions of base pairs of DNA that are sequenced by genomics projects.

Growth of GenBank Base pairs of DNA (billions) 1982 1986 1990 1994 1998 2002 Fig. 2.1 Year Page 17 Sequences (millions) Updated 8-12-04: >40b base pairs

There are three major public DNA databases EMBL Housed at EBI European Bioinformatics Institute GenBank Housed at NCBI National Center for Biotechnology Information DDBJ Housed in Japan Page 16

>100,000 species are represented in GenBank all species 128,941 viruses 6,137 bacteria 31,262 archaea 2,100 eukaryota 87,147 Table 2-1 Page 17

The most sequenced organisms in GenBank Homo sapiens 11.2 billion bases Mus musculus 7.5b Rattus norvegicus 5.7b Danio rerio 2.1b Bos taurus 1.9b Zea mays 1.4b Oryza sativa (japonica) 1.2b Xenopus tropicalis 0.9b Canis familiaris 0.8b Drosophila melanogaster 0.7b Updated 8-29-05 GenBank release 149.0 Table 2-2 Page 18

National Center for Biotechnology Information (NCBI) www.ncbi.nlm.nih.gov Page 24

www.ncbi.nlm.nih.gov Fig. 2.5 Page 25

The genome factory There are a few centers around the world that have a factory big enough to do shotgun sequence of a large eukaryotic genome: Broad Institute, MIT Baylor College of Medicine, Houston Washington University, St Louis DoE Joint Genomics Institute, Walnut Creek, CA Sanger Centre, Cambridge Beijing Genomics Institute, Chinese Academy of Sciences

Pictures from JGI

Qpix robot picks colonies

Biomek miniprep robot

PCR 384 x 4 x 48 x 3

About 150 sequencers, at $200,000 each

Sequence analysis

Armies of programmers and large supercomputers are necessary to assemble and annotate the sequence Bioinformatics

The past Most of these genome factories are dying out and being replaced by machines that can do massively-parallel sequencing 1.6 million well plates instead of 96 well or 384 well plates. The cost will go down enough to allow the differences between humans, or plants, to be detected by whole-genome resequencing

Current genomic technology Roche 454 sequencer ~$500k Equivalent to about 10,000 old sequencers Illumina GA2 sequencer ~$500k Equivalent to about 100,000 old sequencers

And then what? Fastest moving field in biology Keep up with it!