Molecular markers
What are molecular markers?
  A readily detectable sequence of DNA or a protein whose inheritance can be monitored
  Polymorphism in proteins
    Allozymes and isozymes
  Polymorphism in DNA
    Nuclear
    Cytoplasmic
Desirable properties of molecular markers
  Polymorphic
  Codominant inheritance
  Easy, fast and inexpensive to detect
  Reproducible, transferable
  No single marker meets all needs!
Isozymes and allozymes
  Around 1960
  Multiple forms of the same enzyme
    Isozyme: one enzyme, more than one gene locus
    Allozyme: one enzyme, one gene locus
  Methodology
    Macerate tissue
    Separate enzymes by electrophoresis
    Locate enzymes by histochemical staining
    Analyze banding patterns
Allozymes: methodology
Enzyme electrophoresis
  Advantages
    Robust, cheap, reproducible method for:
      Characterizing/identifying genotypes
      Studying population genetics
      Examining geographical patterns of variation
  Disadvantages
    Limited number of enzymes available
    Limited amount of variation
RFLP
  Restriction Fragment Length Polymorphism
  Around 1970
  RFLP examines differences in size of specific DNA restriction fragments
  Requires pure, high molecular weight DNA
RFLP: methodology
  Cut DNA into smaller fragments with a restriction enzyme
  Separate fragments by gel electrophoresis
  Transfer DNA fragments to a filter (Southern blotting)
  Visualise DNA fragments with labelled probes
    radioactive probes
    non-radioactive probes
RFLP: analysis of results
  Bands scored for presence/absence
  Differences in banding pattern reflect genetic differences
  The choice of restriction enzyme is crucial
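To make the link between a sequence difference and a band difference concrete, here is a minimal in-silico digest. It is only a sketch: the two allele sequences are invented for illustration, the sequence is treated as linear, and the cut is placed at the start of the recognition site rather than at the enzyme's true cleavage position.

```python
def digest(sequence, site):
    """Return fragment lengths (longest first) after cutting at every `site`.

    Simplification: the cut is placed at the first base of the recognition
    site, which shifts fragment lengths by a few bases relative to a real
    enzyme but does not change which fragments differ between alleles.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        if pos > start:
            fragments.append(pos - start)
        start = pos
        pos = sequence.find(site, pos + 1)
    fragments.append(len(sequence) - start)
    return sorted(fragments, reverse=True)


ECORI = "GAATTC"  # EcoRI recognition sequence

# Hypothetical alleles: allele_b carries a point mutation (GAATTC -> GAGTTC)
# that destroys the first restriction site.
allele_a = "ATGC" * 5 + ECORI + "ATGC" * 10 + ECORI + "ATGC" * 3
allele_b = "ATGC" * 5 + "GAGTTC" + "ATGC" * 10 + ECORI + "ATGC" * 3

print(digest(allele_a, ECORI))  # [46, 20, 18]
print(digest(allele_b, ECORI))  # [66, 18]
```

Losing one site merges the 20 and 46 base fragments into a single 66 base fragment, which is the kind of band shift scored on the filter. A different enzyme whose site is not affected by the mutation would show no difference, which is why the choice of enzyme matters.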
RFLP
  Advantages
    Reproducible
    Co-dominant markers
    Simple
  Disadvantages
    Time consuming
    Expensive
    Use of radioactive probes (not needed for mitochondrial/chloroplast DNA)
Minisatellites
  Invented in 1985 (Jeffreys et al.)
  Like RFLP, but the probe binds in a minisatellite region (tandem repeats of 10-50 base pairs)
  Multi-locus minisatellites (DNA fingerprints)
    paternity analysis; not very useful for population analysis
  Single-locus minisatellites (VNTR = variable number of tandem repeats)
    more useful for population analysis
Minisatellites: interpretation
RAPD
  Random Amplified Polymorphic DNA
  1990 (Welsh et al.)
  PCR-based method
  Procedure:
    Amplify anonymous stretches of DNA using arbitrary primers
      one primer per reaction
      generally 10 bases (depends on genome size)
    Separate fragments by agarose gel electrophoresis
    Locate fragments by total DNA staining
RAPD: interpretation
  Bands scored for presence/absence
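Presence/absence scoring gives one 0/1 vector per individual. One common way to compare such vectors is the Jaccard coefficient, which ignores shared absences; this is an illustrative choice, not necessarily the one used in this course. The band profiles below are invented.

```python
def jaccard_similarity(bands_x, bands_y):
    """Jaccard similarity between two 0/1 band profiles of equal length.

    Positions where both individuals lack the band (0/0) are ignored.
    """
    if len(bands_x) != len(bands_y):
        raise ValueError("profiles must score the same band positions")
    shared = sum(1 for x, y in zip(bands_x, bands_y) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(bands_x, bands_y) if x == 1 or y == 1)
    return shared / either if either else 0.0


# Hypothetical scoring of six band positions in three individuals.
profiles = {
    "plant_1": [1, 1, 0, 1, 0, 1],
    "plant_2": [1, 0, 0, 1, 1, 1],
    "plant_3": [0, 1, 1, 0, 0, 1],
}

print(jaccard_similarity(profiles["plant_1"], profiles["plant_2"]))  # 0.6
print(jaccard_similarity(profiles["plant_1"], profiles["plant_3"]))  # 0.4
```

Because a present band does not reveal whether the individual carries one or two copies of the amplified allele, these similarities are based on phenotypes, not genotypes.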
RAPD
  Advantages
    Fast and simple
    Inexpensive
    No radioisotopes
  Disadvantages
    Dominant markers
    Reproducibility problems
    Problems of interpretation
      Same band, same fragment?
Microsatellites
  1989 (Tautz; Litt & Luty), originally CA repeats
  Also called sequence-tagged microsatellites (STMS), simple sequence repeat polymorphisms (SSRP) or short tandem repeats (STR)
  Motif (1-6 base pairs) repeated in tandem, e.g. CGCGCGCGCGCGCGCG
  Procedure:
    Amplify by PCR with specific primers
    Separate fragments on a polyacrylamide sequencing gel (+ stain with silver nitrate) or on agarose (+ total DNA stain)
    or use automated DNA detection with fluorescence-based technology
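Microsatellite alleles differ in the number of motif copies, so the PCR product length changes in steps of the motif length. The sketch below counts tandem copies of a motif with a regular expression. The flanking sequences and both alleles are invented; real genotyping measures fragment size on a gel or capillary, not the sequence itself.

```python
import re


def longest_repeat_run(sequence, motif):
    """Return the largest number of consecutive copies of `motif` in `sequence`."""
    pattern = re.compile("(?:" + re.escape(motif) + ")+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Hypothetical flanking regions (where the specific primers would bind).
flank_left, flank_right = "TTGACCTA", "GGATCATT"
allele_1 = flank_left + "CG" * 8 + flank_right
allele_2 = flank_left + "CG" * 11 + flank_right

for name, allele in (("allele_1", allele_1), ("allele_2", allele_2)):
    print(name, longest_repeat_run(allele, "CG"), "repeats,", len(allele), "bp")
```

The two alleles differ by three repeat units, so their amplified fragments differ by 6 bp, which is the size difference that has to be resolved on the gel.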
Microsatellites
Microsatellites
  Advantages
    High amount of variation
    Codominant
    Highly reproducible
    More useful than minisatellites
    Known mutation model (?)
  Disadvantages
    Development difficult (separate sheet) and expensive
    Interpretation sometimes difficult (stutter bands)
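Codominance means both alleles of a heterozygote are visible, so genotypes can be counted directly. As a hedged illustration, the sketch below computes observed and expected heterozygosity for one locus. The genotypes (allele sizes in bp) are invented, and the expected value uses the plain 1 - sum(p_i^2) formula without small-sample correction.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one codominant locus.

    `genotypes` is a list of (allele, allele) tuples, one per diploid individual.
    """
    n = len(genotypes)
    observed = sum(1 for a, b in genotypes if a != b) / n
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    expected = 1.0 - sum((count / total) ** 2 for count in allele_counts.values())
    return observed, expected


# Hypothetical genotypes at one microsatellite locus (fragment sizes in bp).
genotypes = [(150, 154), (150, 150), (154, 158), (150, 154), (158, 158)]

h_obs, h_exp = heterozygosity(genotypes)
print(f"observed = {h_obs:.2f}, expected = {h_exp:.2f}")
```

With a dominant marker such as RAPD the heterozygotes could not be told apart from one of the homozygotes, and the observed value could not be computed this way.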
Microsatellites: examples
AFLP
  1995 (Vos et al.)
  Amplified Fragment Length Polymorphism
  AFLP is a DNA fingerprinting technique which detects DNA restriction fragments by means of PCR amplification.
AFLP: technology
  Comprises the following steps:
    Restriction of the DNA with two restriction enzymes, preferably a hexa-cutter and a tetra-cutter
    Ligation of double-stranded (ds) adapters to the ends of the restriction fragments
    Amplification of a subset of the restriction fragments using two primers complementary to the adapter and restriction site sequences, and extended at their 3' ends by "selective nucleotides"
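The selective nucleotides are what keep the fingerprint readable: each extra base at a primer's 3' end should match only about one fragment in four. The sketch below turns that into arithmetic. It assumes equal base frequencies and independent positions, and the fragment count is a made-up figure, so the numbers are only rough expectations.

```python
def amplified_fraction(selective_a, selective_b):
    """Fraction of adapter-ligated fragments expected to amplify.

    `selective_a` and `selective_b` are the numbers of selective nucleotides
    on the two primers. Assumes each selective base matches with probability 1/4.
    """
    return 0.25 ** (selective_a + selective_b)


def expected_bands(n_fragments, selective_a, selective_b):
    """Expected number of amplified fragments out of `n_fragments`."""
    return n_fragments * amplified_fraction(selective_a, selective_b)


# Hypothetical genome yielding 200,000 fragments with both adapter types.
for n_sel in (1, 2, 3):
    print(f"+{n_sel}/+{n_sel}:", round(expected_bands(200_000, n_sel, n_sel)))
```

Going from +1/+1 to +3/+3 primers cuts the expected number of bands from thousands to a few dozen, which is the range that can be scored on a sequencing gel.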
AFLP: methodology
AFLP: methodology
  Gel electrophoresis of the amplified restriction fragments on denaturing polyacrylamide gels ("sequence gels")
  Visualization of the DNA fingerprints by means of autoradiography, phosphoimaging, or other methods
AFLP: example
AFLP
  Advantages
    No prior gene or sequence information is required
    More reliable than RAPD
    The technique is very sensitive and can consequently detect low-abundance transcripts
  Disadvantages
    Mostly dominant
    Technically demanding and expensive
Amplifying specific loci
  Single nucleotide polymorphism (SNP)
  Advantages
    It is known what is amplified: known source of polymorphism
    Often more reliable than anonymous methods (specific primers)
  Disadvantages
    Formerly labour-intensive to discover sequences
    Formerly relied on relatively few loci: violation of assumptions
    Biallelic
    Mutation model not known
  Kwok P. 2001. Methods for genotyping single nucleotide polymorphisms. Annual Review of Genomics and Human Genetics 2: 235-258. DOI: 10.1146/annurev.genom.2.1.235
SNP? SNP discovery by alignment of sequence traces obtained from direct sequencing of genomic PCR products.
SNP-revolution
  SNP arrays available for model species
  NGS allows easy detection of SNPs even for non-model species
  Next de facto markers for population genetic studies?
  Example array: CanineHD BeadChip, more than 170,000 evenly spaced and validated SNPs
Genomic level markers
  With the advent of next-generation sequencing (NGS), several approaches can discover, sequence and genotype not hundreds but thousands of markers across almost any genome of interest in a single step, even in populations for which little or no genetic information is available.
  Kumar et al. 2012. SNP Discovery through Next-Generation Sequencing. Int. J. Plant Genom. (http://www.hindawi.com/journals/ijpg/2012/831460/)
  Davey et al. 2011. Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nature Reviews Genetics 12: 503
Platforms
  First Generation Sequencing (long reads, high quality, low throughput, high cost)
    Sanger sequencing
  Second Generation Sequencing (PCR needed): decreased cost, short reads, high throughput (massively parallel), clonal template amplification
    Roche: 454 (pyrosequencing)
    Life Technologies: Ion Torrent & Ion Proton
    Illumina: MiSeq & HiSeq (sequencing by synthesis)
    ABI: SOLiD
  Third Generation Sequencing (single-molecule sequencing)
    Helicos Biosciences: true single molecule sequencing (tSMS)
    Pacific Biosciences: PacBio RS II (non-terminal fluorescent sequencing by synthesis, SMRT)
    Oxford Nanopore: MinION & GridION (single nucleotides are detected as they pass through a nanopore)
  http://www.molecularecologist.com/next-gen-fieldguide-2014/
Dideoxy sequencing (Sanger)
  1977
  Selective termination of DNA synthesis
  Four reactions (lanes)
  Separate by denaturing PAGE
  Visualize by autoradiography
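The logic of the four lanes can be sketched in a few lines: each reaction yields fragments that end wherever its dideoxy base was incorporated, and reading all bands from shortest to longest recovers the sequence. This is an idealized model (every possible termination product is present, fragment length is counted from the first synthesized base) and the example sequence is arbitrary.

```python
def sanger_read(new_strand):
    """Model chain-termination sequencing of the strand being synthesized.

    Returns (lanes, read): `lanes` maps each ddNTP reaction to the lengths of
    its terminated fragments, `read` is the sequence recovered from the gel.
    """
    # One reaction per ddNTP: a fragment ends at every position of that base.
    lanes = {
        base: [i + 1 for i, b in enumerate(new_strand) if b == base]
        for base in "ACGT"
    }
    # Electrophoresis orders fragments by length across all four lanes;
    # reading the bands from shortest to longest spells out the sequence.
    bands = sorted((length, base) for base, lengths in lanes.items() for length in lengths)
    read = "".join(base for _, base in bands)
    return lanes, read


lanes, read = sanger_read("GATTACA")
print(lanes)  # {'A': [2, 5, 7], 'C': [6], 'G': [1], 'T': [3, 4]}
print(read)   # GATTACA
```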
Automatic Sanger sequencing
  Fluorescent dyes: one lane, four bases
SGS-methods: Pyrosequencing
  1998
  Ronaghi M., Uhlen M., Nyren P. (1998b) A sequencing method based on real-time pyrophosphate. Science 281: 363-365
Ion Torrent
  Based on monitoring of base incorporation during DNA synthesis, similar to 454
  1. Beads are placed in wells of a semiconductor chip that contains a pH-sensing layer
  2. Single nucleotide types are flowed across the chip one at a time
  3. Incorporation of a nucleotide by DNA polymerase releases a hydrogen ion, which is detected and recorded. Multiple incorporations lead to a larger signal
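The flow-based principle shared by 454 and Ion Torrent can be sketched as an idealized flowgram: one nucleotide type per flow, with a signal proportional to the number of bases incorporated. The repeating flow order "TACG" and the noise-free integer signals are assumptions for illustration; real instruments use their own flow orders, and the signal for long homopolymers becomes hard to resolve.

```python
def flowgram(synthesized, flow_order="TACG", max_flows=40):
    """Simulate ideal flow signals for the strand being synthesized.

    Returns a list of (nucleotide, signal) pairs, where signal is the number
    of bases incorporated during that flow (0 if the next base does not match).
    """
    signals = []
    pos = 0
    flow = 0
    while pos < len(synthesized) and flow < max_flows:
        base = flow_order[flow % len(flow_order)]
        count = 0
        # A homopolymer run is incorporated within a single flow.
        while pos < len(synthesized) and synthesized[pos] == base:
            count += 1
            pos += 1
        signals.append((base, count))
        flow += 1
    return signals


def decode(signals):
    """Turn flow signals back into a base sequence."""
    return "".join(base * count for base, count in signals)


signals = flowgram("TTAGGGC")
print(signals)
print(decode(signals))  # TTAGGGC
```

The run of three G bases is read in one flow as a signal of 3, which illustrates why homopolymer length errors are the characteristic weakness of flow-based chemistries.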
Illumina sequencing
  Metzker 2010
  https://www.youtube.com/watch?v=hmycqwhwb8e
Illumina sequencing
  Also based on observation of incorporated nucleotides during DNA synthesis
  1. Clonal amplification of template DNA on a glass flow cell
  2. Fluorescent, reversibly terminated nucleotides are added with DNA polymerase
  3. DNA synthesis stalls after each incorporated base because of the terminating nucleotides
  4. System is photographed/scanned in the A, G, C and T channels
  5. Terminator and fluorophore are removed
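The cycle above can be mimicked with a toy model in which every cluster gains exactly one base per cycle and each cycle produces one image with four colour channels. Cluster names and template sequences are invented, templates are written in the order the polymerase reads them, and errors such as phasing are ignored.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(templates, cycles):
    """Simulate idealized reversible-terminator cycles for several clusters.

    `templates` maps a cluster name to its template strand, written in the
    order the polymerase reads it. Returns (reads, images), where each image
    lists which clusters light up in each of the four channels.
    """
    reads = {name: "" for name in templates}
    images = []
    for cycle in range(cycles):
        image = {"A": [], "C": [], "G": [], "T": []}
        for name, template in templates.items():
            if cycle < len(template):
                # The terminator stalls synthesis after exactly one base.
                base = COMPLEMENT[template[cycle]]
                image[base].append(name)
                reads[name] += base
        images.append(image)
        # Terminator and fluorophore are then removed before the next cycle.
    return reads, images


reads, images = sequence_by_synthesis({"cluster_1": "AAGT", "cluster_2": "CATG"}, cycles=4)
print(reads)      # {'cluster_1': 'TTCA', 'cluster_2': 'GTAC'}
print(images[0])  # {'A': [], 'C': [], 'G': ['cluster_2'], 'T': ['cluster_1']}
```

Unlike the flow-based methods, a homopolymer here takes one cycle per base, so run length is counted rather than estimated from signal intensity.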
TGS: Oxford nanopore
Oxford nanopore
SNP-discovery
  SNP-calling
  SNP-validation
  SNP-genotyping
  Kumar et al. 2012. SNP Discovery through Next-Generation Sequencing. Int. J. Plant Genom.
SNP-discovery
  A SNP is identified when a nucleotide from an accession read differs from the reference genome at the same nucleotide position.
  Figure: graphical user interface of Tablet, an assembly visualization program, displaying the reference genome on top and the mapped reads with color-coded SNPs on the bottom.
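The comparison of mapped reads with the reference can be sketched as a tiny pileup-based caller. It is a toy under stated assumptions: reads are already aligned without gaps, the reference and reads are invented, and the depth and allele-fraction thresholds are arbitrary illustrative values that only catch homozygous differences. Real SNP callers also model base quality, mapping quality and heterozygous genotypes.

```python
from collections import Counter


def call_snps(reference, aligned_reads, min_depth=3, min_fraction=0.8):
    """Report positions where the reads consistently differ from the reference.

    `aligned_reads` is a list of (start, sequence) ungapped alignments with
    0-based start positions. Returns (position, ref_base, alt_base, depth) tuples.
    """
    pileup = [Counter() for _ in reference]
    for start, sequence in aligned_reads:
        for offset, base in enumerate(sequence):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    snps = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to trust the call
        base, n = counts.most_common(1)[0]
        if base != reference[pos] and n / depth >= min_fraction:
            snps.append((pos, reference[pos], base, depth))
    return snps


# Hypothetical reference and three overlapping reads from one accession.
reference = "ACGTACGTAC"
aligned_reads = [(0, "ACGTAAGT"), (2, "GTAAGTAC"), (1, "CGTAAGTA")]

print(call_snps(reference, aligned_reads))  # [(5, 'C', 'A', 3)]
```

All three reads show A where the reference has C at position 5, so that site is reported. Raising `min_depth` to 4 would suppress the call, which shows how coverage limits what can be discovered.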
SNP-discovery
Complexity reduction strategies
  Often use restriction enzyme digestion of target genomes to reduce the complexity
  Reduced-representation sequencing
    reduced-representation libraries (RRLs)
    complexity reduction of polymorphic sequences (CRoPS)
  Restriction-site-associated DNA sequencing (RAD-seq)
  Low-coverage genotyping
    multiplexed shotgun genotyping (MSG)
    genotyping by sequencing (GBS)
  Davey et al. 2011. Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nature Reviews Genetics 12: 503
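A back-of-the-envelope calculation shows how strongly a restriction digest can reduce complexity. The sketch assumes a random genome with equal base frequencies, so a site of length k occurs about once every 4^k bases, and a RAD-like design in which one read is sequenced on each side of every site. Genome size and read length are made-up values.

```python
def expected_cut_sites(genome_size, site_length):
    """Expected number of recognition sites in a random genome."""
    return genome_size * 0.25 ** site_length


def sequenced_fraction(genome_size, site_length, read_length):
    """Rough fraction of the genome covered by reads flanking each cut site.

    Assumes two non-overlapping reads per site (one on each side).
    """
    tags = 2 * expected_cut_sites(genome_size, site_length)
    return min(1.0, tags * read_length / genome_size)


genome_size = 1_000_000_000  # hypothetical 1 Gb genome
for site_length in (4, 6, 8):
    sites = expected_cut_sites(genome_size, site_length)
    fraction = sequenced_fraction(genome_size, site_length, read_length=100)
    print(f"{site_length}-cutter: ~{sites:,.0f} sites, {fraction:.2%} of genome sequenced")
```

Under these assumptions an 8-cutter samples well under 1% of the genome, so the same sequencing effort can be spread over many more individuals.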
Genotyping by sequencing (GBS; right panels): barcoded adaptors (yellow) and common adaptors (grey) are ligated to digested fragments, producing fragments with barcode+common, barcode+barcode and common+common adaptor combinations. Samples are pooled and amplified on the Illumina Genome Analyzer flowcell. Only short samples featuring a barcode+common adaptor combination are amplified for sequencing
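After pooled sequencing, reads have to be assigned back to samples by their barcode. A minimal demultiplexing sketch follows. The barcodes, sample names and reads are invented, matching is exact, and barcodes are assumed not to be prefixes of one another; production pipelines usually also tolerate sequencing errors in the barcode.

```python
def demultiplex(reads, barcodes):
    """Assign reads to samples by their leading barcode and trim it off.

    `barcodes` maps barcode sequence -> sample name.
    Returns (by_sample, unassigned).
    """
    by_sample = {sample: [] for sample in barcodes.values()}
    unassigned = []
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                by_sample[sample].append(read[len(barcode):])
                break
        else:
            # No barcode matched this read.
            unassigned.append(read)
    return by_sample, unassigned


# Hypothetical barcodes and pooled reads.
barcodes = {"CTCC": "sample_A", "TGCA": "sample_B"}
pooled_reads = [
    "CTCCCAGCTTGACA",
    "TGCACTGCAAGGTT",
    "CTCCCTGCGGATAA",
    "GGGGCAGCTTTTTT",
]

by_sample, unassigned = demultiplex(pooled_reads, barcodes)
print(by_sample)
print(unassigned)
```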
Cost of sequencing