Biology of STRs
Artifacts in Genotyping STRs
  A number of artifacts are possible:
    Stuttering
    Non-template additions
    Microvariants
    Three peaks
    Allele dropouts
    Mutations
  All interfere with reading a DNA profile accurately and consistently
Stuttering
  Stuttering is caused by the very structure that makes STRs good markers:
    They are repeats
    They are highly polymorphic
  A stutter product is a band with the wrong number of repeats
    Either one repeat more or one repeat less than the true allele
  Caused by strand slippage
[Figure: Strand slippage during DNA replication or PCR, drawn on a GT-repeat template with 5' and 3' ends marked. Panels: misalignment (one GT unit looped out of the duplex); elongation.]
Strand Slippage
  Occurs during the extension step of PCR
  The newly formed strand skips one repeat unit and resumes complementary base pairing at the next repeat
    This pushes a non-base-paired loop out of the template strand
  Usually causes a deletion of one repeat unit
    The band will therefore be one unit smaller than the true genotype
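The deletion case described above can be sketched in a few lines. This is only an illustration: the function names `count_repeats` and `slip_one_repeat` and the example amplicon are hypothetical, and real slippage happens on a polymerase, not on a string.

```python
def count_repeats(seq: str, unit: str) -> int:
    """Number of units in the longest uninterrupted run of `unit` in `seq`."""
    best = 0
    for start in range(len(seq)):
        run = 0
        while seq.startswith(unit, start + run * len(unit)):
            run += 1
        best = max(best, run)
    return best


def slip_one_repeat(amplicon: str, unit: str) -> str:
    """Return a stutter product: the amplicon with one repeat unit deleted."""
    n = count_repeats(amplicon, unit)
    if n == 0:
        raise ValueError("amplicon contains no copy of the repeat unit")
    start = amplicon.find(unit * n)  # where the repeat run begins
    return amplicon[:start] + amplicon[start + len(unit):]


# Illustrative amplicon (an assumption): flanking sequence around five GT repeats.
true_allele = "ATGCGGCGGC" + "GT" * 5 + "GGCGGC"
stutter = slip_one_repeat(true_allele, "GT")

print(count_repeats(true_allele, "GT"), len(true_allele))
print(count_repeats(stutter, "GT"), len(stutter))
```

The stutter product differs from the true allele by exactly one repeat unit in length, which is why it lands on the position of a neighboring real allele.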
Strand Slippage
  In nature, this is the mechanism that makes repeats polymorphic
  When it happens during PCR, it can produce a band that is not real:
    The genotype call will be wrong
    One repeat unit lower or higher than reality
  Rarer in tetranucleotides than in shorter (di- and trinucleotide) repeats, which is one reason tetranucleotides are used
Amount of Stutter Product
  Stutter is usually rare
    It therefore shows up as a small bump that can usually be told apart from a true band
  The earlier in the PCR the strand slips, the more stutter product is produced
  If the genotyping protocol doesn't work well, the true band may be very low
    It is then difficult to separate the stutter band from the true band
Stutter Products
  Call these genotypes:
  [Figure: example genotypes with peaks labeled Stutter, Stutter, and Stutter?]
Calling Alleles
  Biggest problem with stutter bands: they are the same size as a real allele!
  Especially difficult if:
    You know the DNA sample is mixed
    You are unsure whether the sample has been contaminated
  Difficult to tell apart:
    A stutter band
    A minor allele (lower because there is less DNA from that contributor)
13 CODIS STR Loci
  All produce some stutter products
  Longer alleles produce more stuttering
    Why does this make sense?
  Stutter percentages for tetranucleotides:
    From less than 1%
    Up to 15% of the true allele's peak height
  Therefore always calculate the small band's peak height as a percentage of the large band's
    Be sure it is < 15% of the large band's height
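The percentage check above is simple arithmetic, sketched here under stated assumptions: the 15% ceiling comes from the slide, while the function names, the integer allele labels, and the peak heights (in RFU) are hypothetical examples.

```python
def stutter_percent(candidate_height: float, parent_height: float) -> float:
    """Height of the small peak as a percentage of the large peak."""
    if parent_height <= 0:
        raise ValueError("parent peak height must be positive")
    return 100.0 * candidate_height / parent_height


def is_plausible_stutter(candidate_allele: int, candidate_height: float,
                         parent_allele: int, parent_height: float,
                         threshold_percent: float = 15.0) -> bool:
    """True if the small peak sits one repeat away and is under the threshold."""
    one_repeat_away = abs(parent_allele - candidate_allele) == 1
    below_threshold = stutter_percent(candidate_height, parent_height) < threshold_percent
    return one_repeat_away and below_threshold


# Hypothetical peaks: allele 13 at 120 RFU next to allele 14 at 1500 RFU.
print(stutter_percent(120, 1500))
print(is_plausible_stutter(13, 120, 14, 1500))
# A 450 RFU peak in the same position is too tall to dismiss as stutter.
print(is_plausible_stutter(13, 450, 14, 1500))
```

A peak that passes this check is only consistent with stutter; in a mixed sample it could still be a minor contributor's allele.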
Reducing Stutter Products
  Changing PCR conditions
  Faster DNA polymerase
    The faster it works, the less chance for slippage
  STRs with longer repeat units (> 4 bp)
    More difficult to skip past a repeat
  STRs with imperfect repeat units
    Complex and compound repeats
    More difficult to skip past a repeat if the next repeat unit's sequence is different
Summary of Stutter Products
  Peaks one repeat unit more or less than the real allele
  Less than 15% of the real allele's peak height
  Quantity of stutter band depends on:
    When in the PCR the first slippage occurs
    Allele size (bigger alleles, more stutter)
    PCR conditions
    Polymerase used
    Repeat length and sequence
Non-Template Additions
  Polymerase often adds an extra adenosine to the 3' end of the newly formed strand
    Not part of the template sequence
    Makes the PCR product one base longer than the actual sequence
  If your PCR forms both +A and -A products, your band will be wide
Non-Template Additions
  Want peaks to be as clear as possible
    Therefore want all PCR products to be identical
    Either all +A or all -A
  Imagine genotyping a dinucleotide, with stutter, where half the products were +A and half were -A
    It would be impossible to separate the genotypes
Non-Template Additions
  Set up PCR conditions so that every product will be +A
  Conditions:
    Final extension for 10 min
      Allows all products to be fully adenylated
    Primer ends in a guanosine
  Commercially available kits turn every allele (and ladder) into +A
Overloading Sample
  Signal on the gel is too strong and will be difficult to call
    May result in a split peak
    Or a peak that is off scale
  Caused by:
    Too much DNA sample in the PCR
    Primer concentrations too high
  This is why DNA quantification is so important
Non-Template Additions and Overloading Samples
  [Figure: Relative fluorescence (RFUs) plotted against DNA size (bp) for D3S1358, VWA, and FGA. Panels: 10 ng template (overloaded); 2 ng template (suggested level). Peak labels: -A, +A, off-scale.]
  Figure 6.5, J.M. Butler (2005) Forensic DNA Typing, 2nd Edition, Elsevier Academic Press
Microvariants
  Remember: these are variants of the repeat that are not a full repeat unit
    Example: TH01 9.3 allele
  Unlike a stutter product, a microvariant is not the same size as an expected allele
  The problem is determining whether:
    There is a true microvariant in the person
    Or a normal band is being shifted for some genotyping reason
Microvariants
  1. True microvariants must be validated in many samples
       Even if a variant is rare, it must show up in more than one individual to be considered a true microvariant
  2. The exact distance in base pairs should be calculated
       9.3 means 9 repeats plus 3 bases
       Always calculate exactly how far off, in bases, the microvariant is
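The "9 repeats plus 3 bases" rule can be written out as a small calculation. This sketch assumes a 4-base repeat unit (as for a tetranucleotide marker); the function name `repeat_region_length` is hypothetical, and it counts only the repeat region, not the flanking sequence or primers.

```python
def repeat_region_length(allele: str, unit_len: int = 4) -> int:
    """Bases in the repeat region for an allele label such as '9.3' or '10'."""
    whole, _, extra = allele.partition(".")
    extra_bases = int(extra) if extra else 0
    if not 0 <= extra_bases < unit_len:
        raise ValueError("partial repeat must be shorter than one full unit")
    return int(whole) * unit_len + extra_bases


# How far is 9.3 from its full-repeat neighbors, in bases?
for label in ("9", "9.3", "10"):
    print(label, repeat_region_length(label))

print(repeat_region_length("10") - repeat_region_length("9.3"))
```

The label is parsed as text rather than as a decimal number because "9.3" is not nine and three tenths of a repeat; the digit after the point is a count of extra bases.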
Sequence Microvariants
  Sometimes these polymorphisms carry sequence differences as well as length differences
  The only way to genotype a sequence variant is to sequence the PCR product
  Not necessary for forensics, because you are simply matching genotypes
    These variants are not important for forensic analysis
Peaks Outside of the Ladder
  Sometimes you will see a peak that is outside of the expected range for any marker (between markers?)
  What could cause this?
    Unsuccessful PCR product
      Primer dimers, etc.
    The person really has a new allele
      Check with a different set of primers
      Sequence the new allele and region
Three Peaks
  Sometimes three bands may be seen
  What could cause three bands?
    Stuttering
    Mixed or contaminated samples
    Genotyping error
    True duplication or extra chromosome in the individual
  Need to validate what is seen in the gel
Three Peaks
  1. Check other markers in the panel:
       Is there evidence of mixed or contaminated samples in any other markers?
  2. Check database information for this marker:
       More than 50 tri-allelic patterns have been reported across the 13 CODIS loci
  3. Sequence or genotype this region:
       Is there truly a duplication or extra chromosome in this person?
Allele Dropout
  Most worrisome problem
    May call a person homozygous when they are really heterozygous
  What can cause this?
    Larger allele is not amplified successfully
    Primer site mutation
  Rare with the chosen tetranucleotides:
    Alleles are very similar in size
    Primers have been optimized and chosen in regions that are very stable
Avoiding Allele Dropout
  Choose primers carefully
  Work with polymorphisms that have alleles of similar size
  Always check genotypes with the Hardy-Weinberg equation
    Make sure you see the expected number of heterozygotes population-wide
  Most commercial kits have taken care of all these issues
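The Hardy-Weinberg check above compares the heterozygotes you observe with the number expected from the allele frequencies, where the expected heterozygote frequency is 1 minus the sum of squared allele frequencies. A minimal sketch follows; the function name and the two sample populations are made up for illustration, and no significance test is included.

```python
from collections import Counter


def heterozygote_counts(genotypes):
    """Return (observed, expected) heterozygote counts for a list of genotypes."""
    n = len(genotypes)
    allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
    total_alleles = 2 * n
    # Hardy-Weinberg expected heterozygote frequency: 1 - sum(p_i ** 2)
    expected_freq = 1.0 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    observed = sum(1 for a, b in genotypes if a != b)
    return observed, expected_freq * n


# Hypothetical population in Hardy-Weinberg proportions (alleles 6 and 9).
balanced = [(6, 6)] * 25 + [(6, 9)] * 50 + [(9, 9)] * 25
# Hypothetical population with too few heterozygotes, as dropout would produce.
deficit = [(6, 6)] * 40 + [(6, 9)] * 20 + [(9, 9)] * 40

print(heterozygote_counts(balanced))
print(heterozygote_counts(deficit))
```

A shortfall of heterozygotes is a warning sign rather than proof of dropout, since population structure or inbreeding can produce the same pattern.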
Fixing Allele Dropouts
  Add a degenerate primer
    An extra primer carrying the known polymorphism
    Three primers total will be added
  Lower the annealing temperature
    Reduces the stringency of primer binding
  Remember that in forensics what matters is matching genotypes
    As long as the allele always drops out, you don't have a problem
Mutations
  STRs do mutate, at an expected mutation rate, over time
  A mutation may cause:
    New alleles
    Changes in primer binding regions
    Sequence changes (less important)
  Very rare events
  Can be validated by examining families
Mendelization of Alleles
  Using family members to determine which alleles are possible
  If you know the parents' alleles, then only so many genotypes are possible for the children
    Mendel's law of segregation
  All STRs have been genotyped on CEPH families: huge family sets from Utah
Mendelization of Alleles
  [Figure: pedigree with genotypes 8/12, 3/14, 2/9, 5/11, 3/8, 8/14, 3/12, 2/11, 10/11]
  As always, must validate a mutation
    By sequencing or regenotyping
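The segregation rule can be checked mechanically: a child receives one allele from each parent, so the possible genotypes are the pairings of one maternal and one paternal allele. The sketch below borrows genotype values from the pedigree slide, but which individuals are parents and which are children is an assumption made for illustration; the function names are hypothetical.

```python
from itertools import product


def possible_child_genotypes(mother, father):
    """All genotypes formed from one maternal and one paternal allele."""
    return {tuple(sorted(pair)) for pair in product(mother, father)}


def is_mendelian(child, mother, father):
    """True if the child's genotype is consistent with both parents."""
    return tuple(sorted(child)) in possible_child_genotypes(mother, father)


# Assumed parental pair 8/12 x 3/14: which children are possible?
print(sorted(possible_child_genotypes((8, 12), (3, 14))))

# Assumed parental pair 2/9 x 5/11: 2/11 fits, 10/11 does not.
print(is_mendelian((2, 11), (2, 9), (5, 11)))
print(is_mendelian((10, 11), (2, 9), (5, 11)))
```

A genotype that fails this check is a candidate mutation (or a genotyping error) and still has to be confirmed by sequencing or regenotyping.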
Mutation Rates
  Mutation rates of the 13 CODIS loci have been calculated over thousands of meioses
  All 13 are between 1 and 5 per 1,000 generational events
  Highest mutation rates: markers that are most polymorphic
  Lowest mutation rates: markers that are least informative
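To get a feel for what these rates mean across a whole profile, here is a rough calculation of the chance that at least one of the 13 loci mutates in a single parent-to-child transmission. It assumes, purely for simplicity, that every locus has the same rate and that loci mutate independently; the function name is hypothetical.

```python
def prob_any_mutation(rate_per_locus: float, n_loci: int = 13) -> float:
    """Probability that at least one of n_loci mutates in one transmission."""
    # Complement of "no locus mutates", assuming independent loci.
    return 1.0 - (1.0 - rate_per_locus) ** n_loci


# Lower and upper ends of the range quoted above (1 to 5 per 1,000).
low = prob_any_mutation(0.001)
high = prob_any_mutation(0.005)
print(f"{low:.4f} to {high:.4f}")
```

Under these assumptions the chance is roughly 1% to 6% per transmission, small for any one locus but not negligible across a full panel.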
Impact of Mutations
  Paternity testing
    Can cause problems
      The father may not match his true child if the genotype has changed in the child
    Compare many STR loci
  Identity matching
    Will not cause a problem
      The mutation will be consistent over a person's lifetime and in all tissues
Genotyping Errors
  All the previous were artifacts that can be explained
  However, the problems you really worry about are unexplained errors
    Especially if the sample may be:
      Contaminated
      Mixed
  Always validate any artifact
    Be sure it's not a genotyping error
Any Questions?
  Review Chapters 1-6
  Email me at least 2 questions you have about the first 6 chapters
  Next class will be review for the exam
  Exam One: February 5th