Kinship in the era of genome-wide data: what does it mean and what use is it?


1 Kinship in the era of genome-wide data: what does it mean and what use is it? David Balding (with much help from Doug Speed; funding: MRC), Institute of Genetics, University College London. From 1/11/14: Departments of MathStats and Genetics, University of Melbourne. Statistical and computational methods for relatedness and relationship inference from genetic marker data, ICMS Edinburgh, 23 September 2014.

2 Kinship, heritability and prediction. For over a century we've tried to measure the extent to which phenotypic similarity between pairs of individuals can be explained by their relatedness. The underlying mathematical model involves latent random effects, expressed in terms of components of variance, e.g. $\mathrm{Var}[Y] = \sigma_a^2 A + \sigma_d^2 D + \sigma_c^2 C + \sigma_i^2 I$, where Y is the phenotype, A and D are matrices of kinship coefficients corresponding to Additive and Dominance genetic effects, and C and I represent Common and Individual environmental effects (I is the identity matrix). Narrow-sense heritability is $h^2 = \sigma_a^2 / (\sigma_a^2 + \sigma_d^2 + \sigma_c^2 + \sigma_i^2)$. The same model underlies prediction using BLUP.
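As a hedged illustration of the variance-component model on this slide, the following Python sketch computes the narrow-sense heritability and one entry of the modelled phenotypic covariance matrix. The function names and the numerical variance components are hypothetical, chosen only for illustration; they do not come from the talk.

```python
def narrow_sense_heritability(var_a, var_d, var_c, var_i):
    """h^2 = var_a / (var_a + var_d + var_c + var_i)."""
    total = var_a + var_d + var_c + var_i
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_a / total


def phenotypic_covariance(var_a, var_d, var_c, var_i,
                          a_jk, d_jk, c_jk, same_individual):
    """Entry (j, k) of Var[Y] = var_a*A + var_d*D + var_c*C + var_i*I.

    a_jk, d_jk and c_jk are the (j, k) entries of A, D and C;
    the identity matrix I contributes only when j == k.
    """
    i_jk = 1.0 if same_individual else 0.0
    return var_a * a_jk + var_d * d_jk + var_c * c_jk + var_i * i_jk


# Made-up variance components, for illustration only.
h2 = narrow_sense_heritability(2.0, 1.0, 1.0, 4.0)
print(h2)  # 0.25
```

Only the additive component appears in the numerator, which is why the choice of the A matrix (the kinship definition) feeds directly into the heritability estimate.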

3 Relatedness: what is it? how do we measure it? [Figure: diagram with individuals labelled A, B and C.] $\theta(A, B) = \sum_X (1 + f_X)\, 2^{-g_X}$, where the sum is over common ancestors X of A and B within the pedigree and $f_X = \theta(M(X), F(X))$. Traditionally, relatedness (a general concept) has been measured by kinship coefficients (numerical measures of relatedness) computed from Identity by Descent (IBD) probabilities under Mendelian inheritance in known pedigrees. The most important is the coancestry θ(A, B), the probability that a random allele from A is IBD with one from B.
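A minimal sketch of computing pedigree coancestry follows. It uses the standard recursive formulation (average over the parents of the younger individual) rather than the sum over common ancestors shown on the slide; the two should agree for a given pedigree. The pedigree encoding (a dict from individual to a mother/father pair, with None for unknown parents treated as unrelated founders) and all names are assumptions made for this example.

```python
from functools import lru_cache


def make_coancestry(pedigree):
    """Return a function theta(a, b) for the given pedigree.

    pedigree maps each individual to (mother, father); founders map to
    (None, None). Unknown parents are treated as unrelated founders.
    """

    @lru_cache(maxsize=None)
    def depth(x):
        # Number of generations below the founders (founders have depth 0).
        parents = [p for p in pedigree[x] if p is not None]
        if not parents:
            return 0
        return 1 + max(depth(p) for p in parents)

    @lru_cache(maxsize=None)
    def theta(a, b):
        if a is None or b is None:
            return 0.0
        if a == b:
            mother, father = pedigree[a]
            # Self-coancestry is (1 + f_a) / 2 with f_a = theta(mother, father).
            return 0.5 * (1.0 + theta(mother, father))
        # Expand the deeper individual, which cannot be an ancestor of the other.
        if depth(a) < depth(b):
            a, b = b, a
        mother, father = pedigree[a]
        return 0.5 * (theta(mother, b) + theta(father, b))

    return theta


# Hypothetical pedigree: two half-sibs sharing a father.
pedigree = {
    "M1": (None, None), "M2": (None, None), "F": (None, None),
    "H1": ("M1", "F"), "H2": ("M2", "F"),
}
theta = make_coancestry(pedigree)
print(theta("H1", "H2"))  # 0.125
```

Adding further ancestors to the dict can only raise the values returned, which is the pedigree-dependence discussed later in the talk.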

4 Additive kinship coefficient based on pedigrees. Actually, there are 16 possible IBD states among the 4 alleles of 2 diploid individuals; this reduces to 9 ignoring within-individual ordering. Also ignoring inbreeding: 3 IBD states (IBD = 0, 1, 2). Also ignoring dominance: 1 additive kinship (coancestry) coefficient θ = E[IBD]/4 = P[IBD=1]/4 + P[IBD=2]/2. [Figure: the 9 IBD states for Individual 1 and Individual 2; circles = alleles, arcs = IBD.] θ is used in the A matrix of additive genetic correlations. It works well in some applications, but serious problems have been overlooked in the past, because there wasn't much choice.
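The relation θ = P[IBD=1]/4 + P[IBD=2]/2 can be checked with a short Python sketch. The IBD probabilities used below are the usual textbook values for outbred relatives, not figures taken from the slide, and the function name is hypothetical.

```python
def coancestry_from_ibd(p_ibd1, p_ibd2):
    """theta = E[IBD]/4 = P[IBD=1]/4 + P[IBD=2]/2 (no inbreeding)."""
    if p_ibd1 < 0 or p_ibd2 < 0 or p_ibd1 + p_ibd2 > 1:
        raise ValueError("IBD probabilities must be non-negative and sum to at most 1")
    return p_ibd1 / 4 + p_ibd2 / 2


# (P[IBD=1], P[IBD=2]) for some outbred relationships (textbook values).
relationships = {
    "full siblings": (0.5, 0.25),
    "half siblings": (0.5, 0.0),
    "first cousins": (0.25, 0.0),
}
thetas = {name: coancestry_from_ibd(*probs) for name, probs in relationships.items()}
print(thetas)
# {'full siblings': 0.25, 'half siblings': 0.125, 'first cousins': 0.0625}
```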

5 Problem 1: θ depends on the pedigree you happen to have available. For diploids, there is no such thing as a complete pedigree. As more ancestors are added, θ among the original pedigree members can only increase and eventually converges to one; so if a complete pedigree were possible, it would be useless. There is also no ideal pedigree in any other sense. Similarly for inbreeding (θ between parents): an inbreeding coefficient depends on the available pedigree, and always increases with increasing pedigree information. This didn't matter much in the past because we could only make use of close relatedness, but with genome-wide data we can now see relatives separated by 10 or more meioses.

6 Problem 2: θ only captures expected, and not realised, genome-sharing. θ for half-sibs is 0.125, but the 95% CI is (0.092, 0.158). Just 6 parent-child transmissions can result in no DNA remaining from the first parent. Two children may share no DNA from their common great-grandparent. Conversely, θ = 0 for many pairs of individuals, yet the levels of genome-sharing among unrelated individuals can vary substantially; this has been exploited e.g. for prediction or to estimate SNP heritability.

7 Statistics of IBD sharing (update of Donnelly 1983). Table columns: Relationship; G; A; θ(A, B) = E[IBD]/4; 95% CI; P[IBD > 0]; E[# sr]; E[rl] (Mb). G: generations; A: ancestors; sr = shared regions; rl = region length. Surviving entries (relationship: 95% CI):
Sibling: (0.204, 0.296)
1/2-sib: (0.092, 0.158)
Cousin: (0.039, 0.089)
1/2-cuz: (0.012, 0.055)
2nd-cuz: (0.004, 0.031)
1/2-2nd-cuz: (0.001, 0.020)
3rd-cuz: (0.000, 0.012)
…: (0.000, 0.005)
… (θ = (1/2)^14): (0.000, 0.001)
… (θ = (1/2)^…): …

8 Kinships based on unobserved pedigrees. [Figure: gene pool with allele fractions p and 1 − p; alleles labelled A and C.] Many pop gen models incorporate $p_{aa} = \theta p_a + (1 - \theta) p_a^2$, i.e. 2 alleles are either IBD or random draws from a gene pool. This leads to $\theta = (p_{aa} - p_a^2) / (p_a (1 - p_a))$, so θ is a correlation coefficient, and only +ve correlations are possible. If individuals come from a finite pedigree with unrelated founders, and if allele probabilities in founders are known, then the average allelic correlation of markers gives an unbiased and efficient estimator of θ, without knowing the pedigree.
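A small Python sketch of the gene-pool model on this slide: the forward formula for the probability that two alleles are both of type a, and its inversion to recover θ. Function names and the example values are hypothetical.

```python
def prob_both_a(theta, p_a):
    """p_aa = theta * p_a + (1 - theta) * p_a**2.

    With probability theta the two alleles are IBD (both a with probability
    p_a); otherwise they are independent draws from the gene pool.
    """
    return theta * p_a + (1 - theta) * p_a ** 2


def theta_from_probs(p_aa, p_a):
    """Invert the model: theta = (p_aa - p_a**2) / (p_a * (1 - p_a))."""
    if not 0 < p_a < 1:
        raise ValueError("p_a must lie strictly between 0 and 1")
    return (p_aa - p_a ** 2) / (p_a * (1 - p_a))


# Round trip with illustrative values.
p_aa = prob_both_a(theta=0.125, p_a=0.25)
recovered = theta_from_probs(p_aa, p_a=0.25)
```

The inversion has the form of a correlation coefficient: a covariance between allele indicators, p_aa − p_a², divided by the indicator variance p_a(1 − p_a).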

9 Interpretation problems. In natural populations (finite pedigree + unrelated founders). What use is an MVUE if it is unbiased for a meaningless parameter? Not only do we not know allele fractions in founders, we can't usually estimate them, and we generally use estimates from the current population, resulting in: downward bias for θ estimates; negative estimates are frequent, yet θ is positive by definition.

10 IBD genome segments. Homologous segments from two haploid genomes are (recombination-sense) IBD if there has been no recombination within the segment since their MRCA (mutation is ignored). Advantages: no need for an explicit pedigree and no founder population. Problems: recombinations cannot always be inferred. IBD is easy to identify if the shared segment is large, almost impossible if it is short; most shared segments are short, even for close relatives. Limited use as a measure of relatedness: any two haploid genomes are entirely IBD, and relatedness is reflected in the distribution of IBD fragment lengths, which is hard to infer. Inferred IBD is used for inferences of demographic parameters: but is it needed? Or an optimal approach?

11 Fragment lengths IBD from 1, 5, 9 and 11 generations ago. [Figure: four histograms of frequency against chunk length (Mb). Mean lengths: generation 1, 30.3; generation 5, 7.6; generation 9, 4.4; generation 11, 3.6.]

12 Distribution of TMRCA given IBD fragment length. [Figure: distribution of TMRCA in generations against region length (Mb); legend entries G>20, G=6, G=5, G=4, G=3, G=2, G=…]

13 Consumer genetics and IBD. Large consumer genetics companies have 10^6 customers genotyped at 10^6 SNPs. They want to identify IBD segments in order to infer (remote) pedigree relationships. The relationship is usually expressed in terms of the shortest lineage path (e.g. 3rd cousin, path length = 8), but this cannot be distinguished from many other relationships, e.g. those involving multiple lineage paths. Why should a customer prefer a poorly-inferred pedigree relationship to a direct measure of genome similarity?

14 Conclusion so far. Is it time to ditch pedigrees and related concepts from most scientific discussions? Felsenstein's dismal theorem: given full genomes, history is bunk. Corollary: only actual genome similarity matters, not pedigree relationships and not IBD status. We should formulate our conservation/evolutionary/demographic/disease models and analyses in terms of genome similarity: this allows us to exploit genome-sharing from all common ancestors.

15 How to measure genomic similarity? There are many ways to measure the genetic similarity of two individuals from genome-wide genetic markers (SNPs), but no obvious canonical SNP-based alternative to θ. One difficulty in humans is that we are all closely related: any two haploid human genomes share over 99.9% sequence identity due to shared ancestry. This isn't evident for SNPs because they are highly polymorphic, but measures of similarity can depend sensitively on the Minor Allele Fraction (MAF) spectrum: more low-MAF sites ⇒ more similarity.

16 SNP-based kinships: two approaches. (1) Genome-wide average of a single-SNP measure. Easiest approach to implement. Ignores information in the lengths of shared DNA segments. Better for remote relatedness. (2) Average haplotype sharing. Identify (recombination-sense) IBD segments using e.g. fastIBD, or copied haplotypes using ChromoPainter or the positional Burrows-Wheeler transform (PBWT). The kinship coefficient is the fraction of the genome in these segments/haplotypes. Ascertainment bias: most IBD is in short segments that are inferred poorly; the relatively few longer segments are inferred well. OK if close relatedness (one or more short lineage paths) is of primary interest.

17 SNP-based kinship coefficient 1: Average allele-sharing. Define in the same way as θ: the fraction of random alleles from A that match a random allele from B. Code SNP genotypes as 0, 1 and 2. Then the per-SNP score is: (0, 0) or (2, 2) → 1; (0, 1), (1, 1) and (1, 2) → 1/2; (0, 2) → 0. Two heterozygotes (1, 1) are consistent with both IBD = 2 and IBD = 0, but because the former is often of greater interest some authors (and the highly influential software PLINK) code (1, 1) as 1 rather than 0.5. Most published values do not state which coding is used.
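The scoring rule on this slide can be sketched in Python as follows. The function and parameter names are hypothetical; `double_het_score` switches between the 0.5 coding and the coding in which two heterozygotes score 1. This is an illustration of the rule as stated here, not a reimplementation of any particular software.

```python
def allele_sharing_kinship(genotypes_a, genotypes_b, double_het_score=0.5):
    """Genome-wide average of the per-SNP allele-sharing score.

    Genotypes are coded 0, 1, 2. Scores: (0,0) or (2,2) -> 1;
    (0,1) and (1,2) -> 1/2; (0,2) -> 0; (1,1) -> double_het_score.
    """
    if len(genotypes_a) != len(genotypes_b) or not genotypes_a:
        raise ValueError("need two non-empty genotype vectors of equal length")
    total = 0.0
    for ga, gb in zip(genotypes_a, genotypes_b):
        if ga not in (0, 1, 2) or gb not in (0, 1, 2):
            raise ValueError("genotypes must be coded 0, 1 or 2")
        if ga == 1 and gb == 1:
            total += double_het_score
        else:
            # Any other pair: 1 minus half the difference in allele counts.
            total += 1.0 - abs(ga - gb) / 2.0
    return total / len(genotypes_a)


# Toy genotypes at four SNPs (made-up data).
a = [0, 1, 2, 1]
b = [0, 1, 0, 2]
k_half = allele_sharing_kinship(a, b)                       # (1, 1) scored 0.5
k_one = allele_sharing_kinship(a, b, double_het_score=1.0)  # (1, 1) scored 1
print(k_half, k_one)  # 0.5 0.625
```

Even on four SNPs the two codings give different values, which is why the coding needs to be reported alongside any published allele-sharing kinship.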

18 SNP kinships 2: Average allelic correlation. Suggested by the correlation form of θ. Upweights sharing of rare alleles: more evidence for a recent common ancestor. Write $G_{Ai}$ for the genotype of A at the ith SNP; then $\frac{1}{m} \sum_{i=1}^{m} \frac{(G_{Ai} - 2p_i)(G_{Bi} - 2p_i)}{4 p_i (1 - p_i)}$ is a genome-wide average of single-SNP, sample-size-1 correlation estimates. The $p_i$ are in practice the sample fractions from the same individuals ⇒ downward bias, and estimates are often negative. −ve kinships are the work of the devil to those trained on pedigree ideas; interpretation: the two individuals have less allele sharing than expected if alleles were randomly assigned with probabilities given by the $p_i$.
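A minimal Python sketch of the allelic-correlation kinship, with the p_i taken from the same sample, shows where negative values come from. The toy genotypes and function names are made up for illustration, and skipping monomorphic SNPs (where the denominator is zero) is an assumption of this sketch rather than something stated in the talk.

```python
def sample_allele_fractions(genotypes):
    """p_i = (sum of 0/1/2 genotypes at SNP i) / (2 * number of individuals)."""
    n = len(genotypes)
    m = len(genotypes[0])
    return [sum(g[i] for g in genotypes) / (2 * n) for i in range(m)]


def allelic_correlation_kinship(ga, gb, p):
    """Average over SNPs of (G_Ai - 2p_i)(G_Bi - 2p_i) / (4 p_i (1 - p_i)).

    SNPs with p_i equal to 0 or 1 are skipped (an assumption of this sketch).
    """
    terms = [
        (a - 2 * pi) * (b - 2 * pi) / (4 * pi * (1 - pi))
        for a, b, pi in zip(ga, gb, p)
        if 0 < pi < 1
    ]
    if not terms:
        raise ValueError("no polymorphic SNPs")
    return sum(terms) / len(terms)


# Three individuals, three SNPs (made-up data).
genotypes = [[0, 1, 2], [2, 1, 0], [1, 1, 1]]
p = sample_allele_fractions(genotypes)
row0 = [allelic_correlation_kinship(genotypes[0], g, p) for g in genotypes]
print(row0)
```

Because the centred genotypes sum to zero over the sample at every SNP, each individual's kinships with all sample members (itself included) sum to zero, so positive self-kinship forces some negative off-diagonal values.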

19 More general SNP-based kinships. Researchers still regard pedigree kinships as the gold standard, but they aren't always very good; pedigrees were useful when we didn't have genome data, and should now be consigned to the dustbin of history. The unbiased and efficient properties of allelic-correlation kinships are not meaningful in practice. There is no true measure of kinship between two individuals, and there seems no reason in principle to prefer, e.g., allele-sharing kinships to allelic-correlation kinships or haplotype-sharing kinships. We are free to invent new ways to measure genome similarity that best fit the application: to explain the most variance (i.e. maximise $\hat{h}^2$)? Or to provide the best predictive performance (using BLUP)?

20 MAF and effect sizes. A possible 1-parameter family of kinship coefficients is given by $\frac{1}{m} \sum_{i=1}^{m} (G_{Ai} - 2p_i)(G_{Bi} - 2p_i)\,[p_i (1 - p_i)]^{\alpha}$. α = 0: centred genotypes; popular in plant & animal breeding. α = −1: allelic correlation; upweights low-MAF SNPs; popular for SNP-based h² estimation in human genetics. Different values of α correspond to different assumptions about the genome-wide relationship between MAF and effect size. Which α is best now becomes an empirical question; the answer is likely to be trait-specific, reflecting the true effect sizes for that trait.
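The α family can be sketched in a few lines of Python. The function name and toy inputs are hypothetical, the allele fractions are passed in rather than estimated, and SNPs with p_i of 0 or 1 are assumed to have been removed beforehand.

```python
def alpha_kinship(ga, gb, p, alpha):
    """(1/m) * sum_i (G_Ai - 2p_i)(G_Bi - 2p_i) * [p_i (1 - p_i)]**alpha."""
    if not (len(ga) == len(gb) == len(p)) or not p:
        raise ValueError("need non-empty inputs of equal length")
    total = 0.0
    for a, b, pi in zip(ga, gb, p):
        if not 0 < pi < 1:
            raise ValueError("allele fractions must lie strictly between 0 and 1")
        total += (a - 2 * pi) * (b - 2 * pi) * (pi * (1 - pi)) ** alpha
    return total / len(p)


# One rare-allele SNP (p = 0.1) and one common SNP (p = 0.5); both
# individuals are heterozygous at the rare SNP only (made-up data).
ga = [1, 0]
gb = [1, 0]
p = [0.1, 0.5]
for alpha in (-2, -1, 0, 1):
    print(alpha, alpha_kinship(ga, gb, p, alpha))
```

Lower α puts more weight on the rare-allele SNP. With α = −1 the result equals four times the allelic-correlation kinship defined with the 4p_i(1 − p_i) denominator, so the two differ only by a constant scale.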

21 In the following two slides we consider: allelic-correlation-like kinships for α = −2, −1, 0, 1; allele-sharing kinships (PLINK); and average IBD sharing as computed by fastIBD with a liberal significance threshold.

22 Heritability of 139 mouse traits, various kinship matrices. [Figure: heritability plotted by phenotype for each kinship matrix; legend: pow −2 (0.29), pow −1 (0.29), pow 0 (0.29), pow 1 (0.29), PLINK (0.32), IBD (0.31).]

23 Prediction of 139 mouse traits, various kinship matrices. [Figure: prediction (r^2) plotted by phenotype for each kinship matrix; legend: pow −2 (0.16), pow −1 (0.17), pow 0 (0.17), pow 1 (0.17), PLINK (0.17), IBD (0.14).]

24 Conclusions. Many definitions of kinship and hence of heritability. Genome similarity is the key concept; there is no need for pedigree ideas (they don't work for most natural populations) or IBD. Is there a useful canonical definition of relatedness? Likely candidates are based on coalescent concepts, such as the genome-wide distribution of times since the most recent common ancestor (TMRCA). Rousset (2002) has an interesting idea based on coalescent ideas: an excess of TMRCA density at short times, where the excess is based on an asymptotic fit; but there is no marker-based estimator, so it is of no use in practice. We can choose whatever measure of genome similarity best suits our purpose, e.g. choose to optimise model likelihood or predictive accuracy. There is a huge space of possible genomic similarity matrices, so overfitting is potentially a serious problem.


Paternity Testing. Chapter 23

Paternity Testing. Chapter 23 Paternity Testing Chapter 23 Kinship and Paternity DNA analysis can also be used for: Kinship testing determining whether individuals are related Paternity testing determining the father of a child Missing

More information

GENOMIC SELECTION: THE FUTURE OF MARKER ASSISTED SELECTION AND ANIMAL BREEDING

GENOMIC SELECTION: THE FUTURE OF MARKER ASSISTED SELECTION AND ANIMAL BREEDING GENOMIC SELECTION: THE FUTURE OF MARKER ASSISTED SELECTION AND ANIMAL BREEDING Theo Meuwissen Institute for Animal Science and Aquaculture, Box 5025, 1432 Ås, Norway, theo.meuwissen@ihf.nlh.no Summary

More information

Population Genetics and Multifactorial Inheritance 2002

Population Genetics and Multifactorial Inheritance 2002 Population Genetics and Multifactorial Inheritance 2002 Consanguinity Genetic drift Founder effect Selection Mutation rate Polymorphism Balanced polymorphism Hardy-Weinberg Equilibrium Hardy-Weinberg Equilibrium

More information

Basics of Marker Assisted Selection

Basics of Marker Assisted Selection asics of Marker ssisted Selection Chapter 15 asics of Marker ssisted Selection Julius van der Werf, Department of nimal Science rian Kinghorn, Twynam Chair of nimal reeding Technologies University of New

More information

PRINCIPLES OF POPULATION GENETICS

PRINCIPLES OF POPULATION GENETICS PRINCIPLES OF POPULATION GENETICS FOURTH EDITION Daniel L. Hartl Harvard University Andrew G. Clark Cornell University UniversitSts- und Landesbibliothek Darmstadt Bibliothek Biologie Sinauer Associates,

More information

Basic Principles of Forensic Molecular Biology and Genetics. Population Genetics

Basic Principles of Forensic Molecular Biology and Genetics. Population Genetics Basic Principles of Forensic Molecular Biology and Genetics Population Genetics Significance of a Match What is the significance of: a fiber match? a hair match? a glass match? a DNA match? Meaning of

More information

The Human Genome. Genetics and Personality. The Human Genome. The Human Genome 2/19/2009. Chapter 6. Controversy About Genes and Personality

The Human Genome. Genetics and Personality. The Human Genome. The Human Genome 2/19/2009. Chapter 6. Controversy About Genes and Personality The Human Genome Chapter 6 Genetics and Personality Genome refers to the complete set of genes that an organism possesses Human genome contains 30,000 80,000 genes on 23 pairs of chromosomes The Human

More information

Investigating the genetic basis for intelligence

Investigating the genetic basis for intelligence Investigating the genetic basis for intelligence Steve Hsu University of Oregon and BGI www.cog-genomics.org Outline: a multidisciplinary subject 1. What is intelligence? Psychometrics 2. g and GWAS: a

More information

Approximating the Coalescent with Recombination. Niall Cardin Corpus Christi College, University of Oxford April 2, 2007

Approximating the Coalescent with Recombination. Niall Cardin Corpus Christi College, University of Oxford April 2, 2007 Approximating the Coalescent with Recombination A Thesis submitted for the Degree of Doctor of Philosophy Niall Cardin Corpus Christi College, University of Oxford April 2, 2007 Approximating the Coalescent

More information

Genetics and Evolution: An ios Application to Supplement Introductory Courses in. Transmission and Evolutionary Genetics

Genetics and Evolution: An ios Application to Supplement Introductory Courses in. Transmission and Evolutionary Genetics G3: Genes Genomes Genetics Early Online, published on April 11, 2014 as doi:10.1534/g3.114.010215 Genetics and Evolution: An ios Application to Supplement Introductory Courses in Transmission and Evolutionary

More information

GAW 15 Problem 3: Simulated Rheumatoid Arthritis Data Full Model and Simulation Parameters

GAW 15 Problem 3: Simulated Rheumatoid Arthritis Data Full Model and Simulation Parameters GAW 15 Problem 3: Simulated Rheumatoid Arthritis Data Full Model and Simulation Parameters Michael B Miller , Michael Li , Gregg Lind , Soon-Young

More information

Heritability: Twin Studies. Twin studies are often used to assess genetic effects on variation in a trait

Heritability: Twin Studies. Twin studies are often used to assess genetic effects on variation in a trait TWINS AND GENETICS TWINS Heritability: Twin Studies Twin studies are often used to assess genetic effects on variation in a trait Comparing MZ/DZ twins can give evidence for genetic and/or environmental

More information

Biology 1406 - Notes for exam 5 - Population genetics Ch 13, 14, 15

Biology 1406 - Notes for exam 5 - Population genetics Ch 13, 14, 15 Biology 1406 - Notes for exam 5 - Population genetics Ch 13, 14, 15 Species - group of individuals that are capable of interbreeding and producing fertile offspring; genetically similar 13.7, 14.2 Population

More information

Genomic Selection in. Applied Training Workshop, Sterling. Hans Daetwyler, The Roslin Institute and R(D)SVS

Genomic Selection in. Applied Training Workshop, Sterling. Hans Daetwyler, The Roslin Institute and R(D)SVS Genomic Selection in Dairy Cattle AQUAGENOME Applied Training Workshop, Sterling Hans Daetwyler, The Roslin Institute and R(D)SVS Dairy introduction Overview Traditional breeding Genomic selection Advantages

More information

Summary. 16 1 Genes and Variation. 16 2 Evolution as Genetic Change. Name Class Date

Summary. 16 1 Genes and Variation. 16 2 Evolution as Genetic Change. Name Class Date Chapter 16 Summary Evolution of Populations 16 1 Genes and Variation Darwin s original ideas can now be understood in genetic terms. Beginning with variation, we now know that traits are controlled by

More information

Globally, about 9.7% of cancers in men are prostate cancers, and the risk of developing the

Globally, about 9.7% of cancers in men are prostate cancers, and the risk of developing the Chapter 5 Analysis of Prostate Cancer Association Study Data 5.1 Risk factors for Prostate Cancer Globally, about 9.7% of cancers in men are prostate cancers, and the risk of developing the disease has

More information

SNP Essentials The same SNP story

SNP Essentials The same SNP story HOW SNPS HELP RESEARCHERS FIND THE GENETIC CAUSES OF DISEASE SNP Essentials One of the findings of the Human Genome Project is that the DNA of any two people, all 3.1 billion molecules of it, is more than

More information

MAGIC design. and other topics. Karl Broman. Biostatistics & Medical Informatics University of Wisconsin Madison

MAGIC design. and other topics. Karl Broman. Biostatistics & Medical Informatics University of Wisconsin Madison MAGIC design and other topics Karl Broman Biostatistics & Medical Informatics University of Wisconsin Madison biostat.wisc.edu/ kbroman github.com/kbroman kbroman.wordpress.com @kwbroman CC founders compgen.unc.edu

More information

Genetics Lecture Notes 7.03 2005. Lectures 1 2

Genetics Lecture Notes 7.03 2005. Lectures 1 2 Genetics Lecture Notes 7.03 2005 Lectures 1 2 Lecture 1 We will begin this course with the question: What is a gene? This question will take us four lectures to answer because there are actually several

More information

Gene Mapping Techniques

Gene Mapping Techniques Gene Mapping Techniques OBJECTIVES By the end of this session the student should be able to: Define genetic linkage and recombinant frequency State how genetic distance may be estimated State how restriction

More information

Pedigree-free descent-based gene mapping from population samples

Pedigree-free descent-based gene mapping from population samples Pedigree-free descent-based gene mapping from population samples Chris Glazner and Elizabeth Thompson Department of Statistics Technical Report # 632 University of Washington, Seattle, WA, USA January,

More information

Name: Class: Date: ID: A

Name: Class: Date: ID: A Name: Class: _ Date: _ Meiosis Quiz 1. (1 point) A kidney cell is an example of which type of cell? a. sex cell b. germ cell c. somatic cell d. haploid cell 2. (1 point) How many chromosomes are in a human

More information

Lecture 6: Single nucleotide polymorphisms (SNPs) and Restriction Fragment Length Polymorphisms (RFLPs)

Lecture 6: Single nucleotide polymorphisms (SNPs) and Restriction Fragment Length Polymorphisms (RFLPs) Lecture 6: Single nucleotide polymorphisms (SNPs) and Restriction Fragment Length Polymorphisms (RFLPs) Single nucleotide polymorphisms or SNPs (pronounced "snips") are DNA sequence variations that occur

More information

Package forensic. February 19, 2015

Package forensic. February 19, 2015 Type Package Title Statistical Methods in Forensic Genetics Version 0.2 Date 2007-06-10 Package forensic February 19, 2015 Author Miriam Marusiakova (Centre of Biomedical Informatics, Institute of Computer

More information

Marker-Assisted Backcrossing. Marker-Assisted Selection. 1. Select donor alleles at markers flanking target gene. Losing the target allele

Marker-Assisted Backcrossing. Marker-Assisted Selection. 1. Select donor alleles at markers flanking target gene. Losing the target allele Marker-Assisted Backcrossing Marker-Assisted Selection CS74 009 Jim Holland Target gene = Recurrent parent allele = Donor parent allele. Select donor allele at markers linked to target gene.. Select recurrent

More information

Forensic DNA Testing Terminology

Forensic DNA Testing Terminology Forensic DNA Testing Terminology ABI 310 Genetic Analyzer a capillary electrophoresis instrument used by forensic DNA laboratories to separate short tandem repeat (STR) loci on the basis of their size.

More information

Chapter 9 Patterns of Inheritance

Chapter 9 Patterns of Inheritance Bio 100 Patterns of Inheritance 1 Chapter 9 Patterns of Inheritance Modern genetics began with Gregor Mendel s quantitative experiments with pea plants History of Heredity Blending theory of heredity -

More information

Evolution (18%) 11 Items Sample Test Prep Questions

Evolution (18%) 11 Items Sample Test Prep Questions Evolution (18%) 11 Items Sample Test Prep Questions Grade 7 (Evolution) 3.a Students know both genetic variation and environmental factors are causes of evolution and diversity of organisms. (pg. 109 Science

More information

Matthew Kaplan and Taylor Edwards. University of Arizona Tucson, Arizona

Matthew Kaplan and Taylor Edwards. University of Arizona Tucson, Arizona Matthew Kaplan and Taylor Edwards University of Arizona Tucson, Arizona Unresolved paternity Consent for testing Ownership of Samples Uncovering genetic disorders SRY Reversal / Klinefelter s Syndrome

More information

Tests in a case control design including relatives

Tests in a case control design including relatives Tests in a case control design including relatives Stefanie Biedermann i, Eva Nagel i, Axel Munk ii, Hajo Holzmann ii, Ansgar Steland i Abstract We present a new approach to handle dependent data arising

More information

Mendelian and Non-Mendelian Heredity Grade Ten

Mendelian and Non-Mendelian Heredity Grade Ten Ohio Standards Connection: Life Sciences Benchmark C Explain the genetic mechanisms and molecular basis of inheritance. Indicator 6 Explain that a unit of hereditary information is called a gene, and genes

More information

BioSci 2200 General Genetics Problem Set 1 Answer Key Introduction and Mitosis/ Meiosis

BioSci 2200 General Genetics Problem Set 1 Answer Key Introduction and Mitosis/ Meiosis BioSci 2200 General Genetics Problem Set 1 Answer Key Introduction and Mitosis/ Meiosis Introduction - Fields of Genetics To answer the following question, review the three traditional subdivisions of

More information

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

More information

FAQs: Gene drives - - What is a gene drive?

FAQs: Gene drives - - What is a gene drive? FAQs: Gene drives - - What is a gene drive? During normal sexual reproduction, each of the two versions of a given gene has a 50 percent chance of being inherited by a particular offspring (Fig 1A). Gene

More information

Worksheet - COMPARATIVE MAPPING 1

Worksheet - COMPARATIVE MAPPING 1 Worksheet - COMPARATIVE MAPPING 1 The arrangement of genes and other DNA markers is compared between species in Comparative genome mapping. As early as 1915, the geneticist J.B.S Haldane reported that

More information

Pedigree Based Analysis using FlexQTL TM software

Pedigree Based Analysis using FlexQTL TM software Pedigree Based Analysis using FlexQTL TM software Marco Bink Eric van de Weg Roeland Voorrips Hans Jansen Outline Current Status: QTL mapping in pedigreed populations IBD probability of founder alleles

More information

Deterministic computer simulations were performed to evaluate the effect of maternallytransmitted

Deterministic computer simulations were performed to evaluate the effect of maternallytransmitted Supporting Information 3. Host-parasite simulations Deterministic computer simulations were performed to evaluate the effect of maternallytransmitted parasites on the evolution of sex. Briefly, the simulations

More information

The Concept of Inclusive Fitness 1 Ethology and Behavioral Ecology Spring 2008

The Concept of Inclusive Fitness 1 Ethology and Behavioral Ecology Spring 2008 The Concept of Inclusive Fitness 1 Ethology and Behavioral Ecology Spring 2008 I. The components of Fitness A. Direct fitness W d, darwinian fitness, W gained by increasing ones own reproduction relative

More information

Combining Data from Different Genotyping Platforms. Gonçalo Abecasis Center for Statistical Genetics University of Michigan

Combining Data from Different Genotyping Platforms. Gonçalo Abecasis Center for Statistical Genetics University of Michigan Combining Data from Different Genotyping Platforms Gonçalo Abecasis Center for Statistical Genetics University of Michigan The Challenge Detecting small effects requires very large sample sizes Combined

More information

7A The Origin of Modern Genetics

7A The Origin of Modern Genetics Life Science Chapter 7 Genetics of Organisms 7A The Origin of Modern Genetics Genetics the study of inheritance (the study of how traits are inherited through the interactions of alleles) Heredity: the

More information

A and B are not absolutely linked. They could be far enough apart on the chromosome that they assort independently.

A and B are not absolutely linked. They could be far enough apart on the chromosome that they assort independently. Name Section 7.014 Problem Set 5 Please print out this problem set and record your answers on the printed copy. Answers to this problem set are to be turned in to the box outside 68-120 by 5:00pm on Friday

More information

Logistic Regression (1/24/13)

Logistic Regression (1/24/13) STA63/CBB540: Statistical methods in computational biology Logistic Regression (/24/3) Lecturer: Barbara Engelhardt Scribe: Dinesh Manandhar Introduction Logistic regression is model for regression used

More information

Introductory genetics for veterinary students

Introductory genetics for veterinary students Introductory genetics for veterinary students Michel Georges Introduction 1 References Genetics Analysis of Genes and Genomes 7 th edition. Hartl & Jones Molecular Biology of the Cell 5 th edition. Alberts

More information

Evolution, Natural Selection, and Adaptation

Evolution, Natural Selection, and Adaptation Evolution, Natural Selection, and Adaptation Nothing in biology makes sense except in the light of evolution. (Theodosius Dobzhansky) Charles Darwin (1809-1882) Voyage of HMS Beagle (1831-1836) Thinking

More information

Biology Final Exam Study Guide: Semester 2

Biology Final Exam Study Guide: Semester 2 Biology Final Exam Study Guide: Semester 2 Questions 1. Scientific method: What does each of these entail? Investigation and Experimentation Problem Hypothesis Methods Results/Data Discussion/Conclusion

More information

Chapter 4. Quantitative genetics: measuring heritability

Chapter 4. Quantitative genetics: measuring heritability Chapter 4 Quantitative genetics: measuring heritability Quantitative genetics: measuring heritability Introduction 4.1 The field of quantitative genetics originated around 1920, following statistical

More information

SeattleSNPs Interactive Tutorial: Web Tools for Site Selection, Linkage Disequilibrium and Haplotype Analysis

SeattleSNPs Interactive Tutorial: Web Tools for Site Selection, Linkage Disequilibrium and Haplotype Analysis SeattleSNPs Interactive Tutorial: Web Tools for Site Selection, Linkage Disequilibrium and Haplotype Analysis Goal: This tutorial introduces several websites and tools useful for determining linkage disequilibrium

More information

Mendelian inheritance and the

Mendelian inheritance and the Mendelian inheritance and the most common genetic diseases Cornelia Schubert, MD, University of Goettingen, Dept. Human Genetics EUPRIM-Net course Genetics, Immunology and Breeding Mangement German Primate

More information

Biology Behind the Crime Scene Week 4: Lab #4 Genetics Exercise (Meiosis) and RFLP Analysis of DNA

Biology Behind the Crime Scene Week 4: Lab #4 Genetics Exercise (Meiosis) and RFLP Analysis of DNA Page 1 of 5 Biology Behind the Crime Scene Week 4: Lab #4 Genetics Exercise (Meiosis) and RFLP Analysis of DNA Genetics Exercise: Understanding how meiosis affects genetic inheritance and DNA patterns

More information

GOBII. Genomic & Open-source Breeding Informatics Initiative
