Introduction to Genetic Epidemiology. Erwin Schurr McGill International TB Centre McGill University


Methods of investigation in humans

Phenotype     Rare (very severe forms)        Common (infection/affection status)
Sample        Small                           Large
Causality     Monogenic                       Complex
Main tools    Mendelian genetics              Genetic epidemiology
              Rare mutation, strong effect    Common polymorphism, modest effect

Complex phenotypes (in contrast to monogenic disease)

Complex trait:
- Environmental factors
- Genetic factors (major gene, other genes)
- Gene*environment interactions
- Gene*gene interactions

Examples: cancers, cardiovascular diseases, neurological diseases, infectious diseases

Genetic epidemiology: objectives / tools

Do genetic factors play a role?         Epidemiological observations / Experimental model
What is their nature?                   Segregation analysis
What is their chromosomal location?     Linkage analysis
Which allelic variant is implicated?    Association studies
What is its function?                   Functional studies

Genetic epidemiology: overview

                        Sample            # affected sibs    DNA    Markers          Main goal
Segregation analysis    Families          0 to n             No     -                Genetic model
Linkage analysis        Families          2 to n             Yes    Microsat/SNPs    Candidate regions
Association studies     Families          1 to n             Yes    SNPs             Candidate alleles
                        Cases/controls    -                  Yes    SNPs             Candidate alleles

Spectrum of genetic predisposition (Casanova, Abel, EMBO J 2007)

Interplay of age and genetics
[Figure: annual mortality rate by age class in years (<1, 1-4, 5-14, 15-24, 25-34, 35-44, 45-54, 55-64, 65-74)]

Do genetic factors play a role?        Epidemiological observations
What is their nature?                  Segregation analysis
What is their chromosomal location?    Linkage analysis
What is the causal variant?            Association studies
What is the function?

Family level: twin studies

DZ twins    2 fertilizations    Share 50% of genetic background
MZ twins    1 fertilization     Share 100% of genetic background

Twin concordance (same table layout for MZ twins and for DZ twins)

            Twin 1 +    Twin 1 -
Twin 2 +    A           B
Twin 2 -    C           D

Concordance rate = 2A / (2A + B + C)
Genetic contribution: C_MZ vs. C_DZ
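The concordance formula can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `concordance_rate` and all pair counts are hypothetical, chosen only to show the comparison of MZ and DZ concordance.

```python
def concordance_rate(a: int, b: int, c: int) -> float:
    """Concordance rate 2A / (2A + B + C) from twin-pair counts.

    a    -- pairs in which both twins are affected (cell A)
    b, c -- pairs in which exactly one twin is affected (cells B and C)
    Cell D (neither twin affected) does not enter the formula.
    """
    denominator = 2 * a + b + c
    if denominator == 0:
        raise ValueError("no affected twins in the sample")
    return 2 * a / denominator


# Hypothetical counts, for illustration only
c_mz = concordance_rate(a=30, b=10, c=10)  # 60 / 80
c_dz = concordance_rate(a=10, b=20, c=20)  # 20 / 60

print(f"C_MZ = {c_mz:.2f}, C_DZ = {c_dz:.2f}")
```

With these made-up counts C_MZ exceeds C_DZ, which is the pattern read as evidence for a genetic contribution.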

[Figure: concordance and discordance in MZ and DZ twins; labels G and E]
Genetic contribution: C_MZ > C_DZ

Model-based linkage analysis
- Need to specify the relation between the phenotype and the genotype:
    - frequency of the disease allele
    - probability of being affected given genotype and risk factors
- Most powerful method IF the genetic model is correct
- Estimation of the recombination fraction θ between the phenotype locus (to be located) and the marker locus (known location)
- Linkage test: is θ < 0.5?
- Example: schistosomiasis (infection intensities, severe hepatic fibrosis)

Model-free linkage analysis

Parents: 1,2 x 3,4
First sib: 1,3

Second sib    Alleles shared    Probability
2,4           0                 ¼
2,3           1                 ¼
1,4           1                 ¼
1,3           2                 ¼

RESOLUTION FOR GENE LOCALIZATION: ~ 10 million base pairs
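The sharing probabilities for a sib pair can be checked by brute-force enumeration. The sketch below assumes the mating shown on the slide (parents 1,2 and 3,4, all four alleles distinct, so alleles identical by state are also identical by descent); the function name `sharing_distribution` is an illustrative choice.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

FATHER = (1, 2)
MOTHER = (3, 4)

# Each child receives one paternal and one maternal allele, each with
# probability 1/2, so the four genotypes below are equally likely.
CHILD_GENOTYPES = list(product(FATHER, MOTHER))  # (1,3), (1,4), (2,3), (2,4)


def sharing_distribution() -> dict:
    """Probability that two sibs share 0, 1 or 2 parental alleles."""
    counts = Counter()
    for sib1, sib2 in product(CHILD_GENOTYPES, repeat=2):
        # Compare the paternal alleles, then the maternal alleles.
        shared = int(sib1[0] == sib2[0]) + int(sib1[1] == sib2[1])
        counts[shared] += 1
    total = sum(counts.values())
    return {k: Fraction(v, total) for k, v in sorted(counts.items())}


for n_shared, prob in sharing_distribution().items():
    print(n_shared, prob)
```

The enumeration returns 1/4, 1/2 and 1/4 for 0, 1 and 2 shared alleles, matching the table once the two ways of sharing one allele are added together. Model-free linkage methods look for sib pairs that deviate from this expectation at a marker.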

Linkage analysis only looks at a few meioses. Can we look at more? Can we "see dead people"? Yes, by studying linkage disequilibrium.

Single Nucleotide Polymorphism

In a population, the same building block (= nucleotide) of DNA can occur in two alternative forms, i.e. at a given DNA position two different nucleotides (= alleles) can occur. If the less frequent allele occurs in the population at a frequency >2%, we call it common variation.

Common SNP    94%      C T T A G C T T
              6%       C T T A G T T T

Rare SNP      99.9%    C T T A G C T T
              0.1%     C T T A G T T T

Univariate analysis
- One binary phenotype
- One candidate SNP in a candidate gene
- One association study

Genotypic analysis

      cases    controls
AA    c0       t0
AB    c1       t1
BB    c2       t2

Goodness-of-fit test = chi-square, 2 df

Hypothesis testing: general strategy
1. Formulate the null (H0) and alternative (H1) hypotheses
2. Build a test statistic suited to the data to be collected
3. Identify the distribution of the test statistic under H0
4. Define a decision rule (i.e. type I error)
5. Run the experiment and compute the test statistic
6. Conclude, i.e. reject H0 or not, and report the p-value
7. Interpret the conclusion

Hypothesis testing: genetic association
1. H0: cases = controls; H1: cases ≠ controls
2. χ² = Σ (observed - expected)² / expected, summed over the cells of the genotype table
3. Under H0, χ² is distributed as a chi-square with 2 df
4. Type I error 5%: reject H0 if χ² > 5.99
5. χ² ≈ 348.6
6. χ² > 5.99, therefore we reject H0 (p-value < 0.001)
7. The genotypic distribution is significantly different in cases and in controls

Genotypic analysis

         cases    controls    total
AA       200      500         700
AB       200      300         500
BB       600      200         800
total    1,000    1,000       2,000

Expected AA cases = 1,000 * 700 / 2,000 = 350
Expected AB cases = 1,000 * 500 / 2,000 = 250
Etc.

Goodness-of-fit test = chi-square, 2 df

X² = [(200-350)²/350] + [(200-250)²/250] + [(600-400)²/400]
   + [(500-350)²/350] + [(300-250)²/250] + [(200-400)²/400]

X² ≈ 348.6 with 2 df
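The same computation can be reproduced with a short standard-library script. This is a sketch under stated assumptions: the function name `chi_square_genotypic` is illustrative, and the p-value shortcut `exp(-x/2)` is valid only for a chi-square with exactly 2 df (three genotypes by two groups).

```python
import math


def chi_square_genotypic(cases, controls):
    """Pearson chi-square for a genotype-by-status table.

    cases, controls -- counts per genotype, in the same order (AA, AB, BB).
    Expected counts are row total * column total / grand total.
    """
    n_cases, n_controls = sum(cases), sum(controls)
    n_total = n_cases + n_controls
    statistic = 0.0
    for c, t in zip(cases, controls):
        row_total = c + t
        for observed, column_total in ((c, n_cases), (t, n_controls)):
            expected = row_total * column_total / n_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


chi2 = chi_square_genotypic(cases=[200, 200, 600], controls=[500, 300, 200])

# For 2 df only, the chi-square survival function is exp(-x / 2).
p_value = math.exp(-chi2 / 2)

print(f"chi-square = {chi2:.1f}, reject H0 at 5%: {chi2 > 5.99}")
```

For tables with other dimensions, a statistics library would be needed for the p-value; the statistic itself is computed the same way.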

Genotypic analysis

      cases    controls
AA    c0       t0
AB    c1       t1
BB    c2       t2

Odds ratio AB vs. AA = (c1 * t0) / (c0 * t1)

Genotypic analysis

      cases    controls
AA    200      500
AB    200      300
BB    600      200

Odds ratio AB vs. AA = (200 * 500) / (200 * 300) = 1.67
Odds ratio BB vs. AA = (600 * 500) / (200 * 200) = 7.5
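As a quick check of the arithmetic, the genotype-specific odds ratios can be computed directly from the table. The helper name `odds_ratio` and its argument names are illustrative, with AA taken as the reference genotype as on the slide.

```python
def odds_ratio(cases_genotype, controls_genotype, cases_ref, controls_ref):
    """Odds ratio of a genotype vs. the reference genotype.

    OR = (cases_genotype * controls_ref) / (cases_ref * controls_genotype)
    """
    return (cases_genotype * controls_ref) / (cases_ref * controls_genotype)


# Counts from the table above; AA (200 cases, 500 controls) is the reference.
or_ab = odds_ratio(200, 300, cases_ref=200, controls_ref=500)
or_bb = odds_ratio(600, 200, cases_ref=200, controls_ref=500)

print(f"OR AB vs. AA = {or_ab:.2f}")
print(f"OR BB vs. AA = {or_bb:.2f}")
```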

Genotypic analysis: general strategy

      cases    controls
AA    c0       t0
AB    c1       t1          Estimate OR (AB vs. AA)
BB    c2       t2          Estimate OR (BB vs. AA)

Then optimize the coding scheme.

Genotypic analysis: dominance effect

B dominant
            cases      controls
AA          c0         t0
AB or BB    c1 + c2    t1 + t2

B recessive
            cases      controls
AA + AB     c0 + c1    t0 + t1
BB          c2         t2

Goodness-of-fit test = chi-square, 1 df

Example

           cases    controls    OR      P-value
AA         100      200         1.00
AB + BB    200      100         4.00    <0.001
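The dominant and recessive codings can be sketched as a collapse of the three genotype counts into a 2x2 table, followed by a 1 df chi-square test. Everything named here is illustrative: `collapse` and `association_2x2` are hypothetical helpers, and the split of the 200 cases and 100 controls between AB and BB is invented so that the collapsed table matches the example.

```python
import math


def collapse(counts, model):
    """Collapse (AA, AB, BB) counts into (reference group, risk group)."""
    aa, ab, bb = counts
    if model == "dominant":   # B dominant: AA vs. AB or BB
        return aa, ab + bb
    if model == "recessive":  # B recessive: AA or AB vs. BB
        return aa + ab, bb
    raise ValueError(f"unknown model: {model}")


def association_2x2(cases, controls):
    """Odds ratio, Pearson chi-square (1 df, no continuity correction), p-value."""
    c_ref, c_risk = cases
    t_ref, t_risk = controls
    n = c_ref + c_risk + t_ref + t_risk
    odds = (c_risk * t_ref) / (c_ref * t_risk)
    chi2 = n * (c_ref * t_risk - c_risk * t_ref) ** 2 / (
        (c_ref + c_risk) * (t_ref + t_risk) * (c_ref + t_ref) * (c_risk + t_risk)
    )
    # For 1 df only, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds, chi2, p_value


# Hypothetical genotype counts (AA, AB, BB)
cases = collapse((100, 120, 80), "dominant")     # -> (100, 200)
controls = collapse((200, 70, 30), "dominant")   # -> (200, 100)

odds, chi2, p_value = association_2x2(cases, controls)
print(f"OR = {odds:.2f}, chi-square = {chi2:.1f}, p < 0.001: {p_value < 0.001}")
```

Running the recessive coding on the same counts gives a different 2x2 table, which is why the choice of coding scheme is treated as something to optimize.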

Interpretation?
- Type I error
- Allele B → phenotype = B is the causal allele
- Allele B is in linkage disequilibrium with the causal allele = genetic linkage between disease locus and marker locus PLUS allele B is preferentially associated with the causal allele

Linkage is a relation between loci.
Linkage disequilibrium is a relation between alleles.

Descriptors of Linkage Disequilibrium

D

Linkage equilibrium (expected for distant loci)    Linkage disequilibrium (expected for nearby loci)
$P_{AB} = P_A P_B$                                 $P_{AB} \neq P_A P_B$
$P_{Ab} = P_A P_b$                                 $P_{Ab} \neq P_A P_b$
$P_{aB} = P_a P_B$                                 $P_{aB} \neq P_a P_B$
$P_{ab} = P_a P_b$                                 $P_{ab} \neq P_a P_b$

$D_{AB} = P_{AB} - P_A P_B$

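$D_{AB} = P_{AB} - P_A P_B$
- sign is arbitrary
- range depends on allele frequencies
- hardly allows comparisons

Scaled version: $D'_{AB} = D_{AB} / D_{max}$, with $D_{max} = \min(P_A P_b,\ P_a P_B)$ (this form of $D_{max}$ applies when $D_{AB} > 0$)

$r^2_{AB} = D^2_{AB} / (P_A P_B P_a P_b)$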
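The three descriptors can be computed together from one haplotype frequency and two allele frequencies. The sketch below uses an illustrative function name, `ld_measures`, and made-up frequencies. For negative D it uses the standard alternative bound min(P_A P_B, P_a P_b), which is not on the slide.

```python
def ld_measures(p_hap_ab: float, p_a: float, p_b: float):
    """Return (D, D', r^2) for alleles A and B at two loci.

    p_hap_ab -- frequency of the AB haplotype
    p_a, p_b -- frequencies of allele A and allele B
    """
    q_a, q_b = 1 - p_a, 1 - p_b  # frequencies of alleles a and b
    if min(p_a, q_a, p_b, q_b) <= 0:
        raise ValueError("both loci must be polymorphic")

    d = p_hap_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * q_b, q_a * p_b)   # bound given on the slide
    else:
        d_max = min(p_a * p_b, q_a * q_b)   # bound for negative D (assumption)
    d_prime = d / d_max                     # keeps the sign of D
    r_squared = d ** 2 / (p_a * p_b * q_a * q_b)
    return d, d_prime, r_squared


# Hypothetical frequencies: P_AB = 0.40, P_A = 0.50, P_B = 0.50
d, d_prime, r_squared = ld_measures(0.40, 0.50, 0.50)
print(f"D = {d:.3f}, D' = {d_prime:.2f}, r^2 = {r_squared:.2f}")
```

Here D' is returned with the sign of D; many reports quote its absolute value instead, precisely because the sign depends on arbitrary allele labeling.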

Inflated type I error: stratification
- Replicate the results in an independent sample/population
- Incorporate genomic information in the analysis
    - the genome records demographic history
    - use the genome to discover hidden structure by looking at null markers
- Use familial controls = family-based association studies

Allelic controls: TDT
Transmitted alleles vs. non-transmitted alleles
[Pedigree figure with genotypes M1M2, M2M2 and M1M2]

Transmitted alleles vs. non-transmitted alleles

                      Non-transmitted allele
Transmitted allele    M1     M2
M1                    n11    n12
M2                    n21    n22

$TDT = \frac{(n_{12} - n_{21})^2}{n_{12} + n_{21}} \sim \chi^2$ (1 df)
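A minimal sketch of the TDT statistic follows. The function name `tdt` and the transmission counts are hypothetical. Only the off-diagonal cells n12 and n21 (transmissions from heterozygous parents) enter the statistic.

```python
import math


def tdt(n12: int, n21: int):
    """Transmission disequilibrium test.

    n12 -- M1 transmitted, M2 not transmitted
    n21 -- M2 transmitted, M1 not transmitted
    Returns the statistic (n12 - n21)^2 / (n12 + n21) and its p-value.
    """
    informative = n12 + n21
    if informative == 0:
        raise ValueError("no transmissions from heterozygous parents")
    statistic = (n12 - n21) ** 2 / informative
    # For 1 df only, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: M1 transmitted 60 times, M2 transmitted 40 times
statistic, p_value = tdt(n12=60, n21=40)
print(f"TDT = {statistic:.2f}, p = {p_value:.4f}")
```

Because each heterozygous parent serves as its own control, the comparison is made within families and is not distorted by population stratification.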

Genome-wide association studies (GWAS)
Study of the common genetic variation across the entire human genome, designed to identify genetic association with observable traits (NIH 2006)

GWAS in leprosy
[Figure: NLP (0 to 20) plotted by chromosome (1 to 22)]
Zhang et al. N Engl J Med 2009;361:2609-18.

GWAS in Infectious Diseases