A heterozygote advantage analysis for tuberculosis resistance in dairy cattle
Smaragda Tsairidou, 19.6.14
Tsairidou S. (1), Woolliams J.A. (1), Allen A.R. (2), Skuce R.A. (2,3), McBride S.H. (2), Wright D.M. (3), Bermingham M.L. (1), Pong-Wong R. (1), Matika O. (1), Pooley C.M. (1), McDowell S.W.J. (2), Glass E.J. (1), Bishop S.C. (1)
(1) The Roslin Institute and RDVS, University of Edinburgh
(2) Agri-Food and Biosciences Institute, Belfast
(3) Queen's University of Belfast, Belfast

bTB status: it's complicated
Genetic Resistance

Part 1: Genomic Selection using Genetic Markers
Genomic Selection
Genomic Selection
» selection of animals based on their genotypes
» genotyping using dense SNP chip data
Genetic Markers
» SNPs (Single Nucleotide Polymorphisms) across the genome, likely to be linked with QTLs (Quantitative Trait Loci)
We can identify more resistant individuals using markers
» Marker Assisted Selection (MAS)
» genetic selection of bTB-resistant individuals may offer a complementary control strategy for bTB

Marker Based Selection: previous examples
MAS for Infectious Pancreatic Necrosis (IPN) in Atlantic salmon
Selection for scrapie resistance in sheep based on the PrP genotype

bTB infection: a complex trait
Susceptibility to bTB is a complex trait:
» environment + many genetic variants
The principles of genetic selection apply in disease control
» requirement: presence of heritable genetic variation in host susceptibility
Different loci have different properties, and not all loci affect the trait in the same way
» additive and non-additive genetic variation

Genetic variation
AA, AB and BB denote the mean trait values of the three genotypes
Additive: a = (AA − BB) / 2
Dominance: d = AB − (AA + BB) / 2
Complete dominance: d = AB − (AA + BB) / 2, with |d| = |a|
Genetic variation - The heterozygote advantage hypothesis
Heterozygote advantage and hybrid vigour
Disease resistance: both alleles present more options to respond to infection
Overdominance: d = AB − (AA + BB) / 2, with d > |a|
Genetic variation - Heterozygote disadvantage
Underdominance: d = AB − (AA + BB) / 2, with d < −|a|

Part 2: Genome Wide Association Studies (GWAS) and Heterozygote (dis)advantage Analysis
Aims of the study
Study the genetic architecture of resistance to bTB: is it a single QTL affecting bTB resistance?
Investigate the hypothesis that loci showing heterozygote (dis)advantage are associated with resistance to bTB
Compare such loci to those obtained from standard GWAS

Phenotypes and Genotypes
Holstein-Friesian cows (n = 1,151)
Commercial herds in Northern Ireland
Covariates: Age, Year, Season, Reason, Breed
Illumina BovineHD BeadChip (617,885 SNPs)
CASES: positive SICCT (single intradermal comparative cervical tuberculin) test, i.e. a reaction >4 mm greater than the reaction to M. avium after 72 h, plus lesions confirmed by post-mortem examination
CONTROLS: multiple negative test results (higher prevalence herds)

Datasets
» Dataset 1: all animals included
» Dataset 2: animals clustering separately in CMDS (Classical Multidimensional Scaling) removed
» Dataset 3: herds contributing no controls removed (n = 222)
Quality Control: MAF > 0.05, call rate > 95%, HWE p < 0.000001

Dataset   n animals   n SNPs   n animals after QC   n SNPs after QC
1         1151        617885   1150                 549687
2         1111        617885   1110                 549835
3         929         617885   929                  550108

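The call-rate and MAF filters listed above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: genotypes are assumed to be coded as 0, 1 or 2 copies of one allele, `None` marks a missing call, the toy SNPs are hypothetical, and the HWE filter is left out for brevity.

```python
def call_rate(genotypes):
    """Fraction of animals with a non-missing genotype (None = missing)."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF for genotypes coded as 0, 1 or 2 copies of one allele."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)


def passes_qc(genotypes, min_maf=0.05, min_call_rate=0.95):
    """Apply the two thresholds quoted on the slide (MAF > 0.05, call rate > 95%)."""
    return (call_rate(genotypes) > min_call_rate
            and minor_allele_frequency(genotypes) > min_maf)


# Hypothetical toy SNPs, 20 animals each (not study data)
common_snp = [0, 1, 2, 1, 0, 1, 2, 1, 0, 1] * 2   # MAF 0.45, fully called
rare_snp = [0] * 19 + [1]                          # MAF 0.025
patchy_snp = [0, 1, None, None] * 5                # call rate 0.5

for name, snp in [("common", common_snp), ("rare", rare_snp), ("patchy", patchy_snp)]:
    print(name, passes_qc(snp))
```

Only the first toy SNP passes both filters; the other two fail on MAF and call rate respectively.
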
Standard Genome Scans
Genome Wide Association Study (GWAS)
» method for identifying SNPs across the genome having an effect on the trait under study based on their p-values
» assumption: they reside within or are linked to a QTL

SNP           Chr   Additive model p-value
rs42494357    13    8.6 x 10^-7
rs110465273   13    6.1 x 10^-7
rs42494342    13    5.9 x 10^-7
rs109809949   13    1.2 x 10^-6
rs109042660   13    5.2 x 10^-7
rs137562332   13    6.1 x 10^-7
rs132841890   13    5.9 x 10^-7
(Bermingham et al. 2014)

GWAS for Heterozygote Advantage
Heterozygote advantage GWAS
» assumption: underlying non-additive genetic variation
» genotypes recoded into heterozygotes (A1A2) and homozygotes (A1A1 & A2A2)
» GWAS repeated for each of the three datasets (GenABEL package, R/2.15.2)
Accounting for relatedness
» genomic kinship (K) matrix calculated using SNP information
Binary phenotype (bTB status): case = 1, control = 0
$Y_{ijkmpq} = a_i + D_j + S_k + R_m + B_p + u_i + e_i$
» a: Age, D: Year, S: Season, R: Reason, B: Breed, u: Genetic effects, e: residual

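The recoding step described above (heterozygotes versus both homozygote classes) can be illustrated with a short sketch. It shows only the recoding idea; how GenABEL stores genotypes internally is not shown in the slides, and the two-letter genotype strings and function name are assumptions made for illustration.

```python
def recode_heterozygosity(genotype):
    """Return 1 for a heterozygote (A1A2), 0 for either homozygote, None if missing.

    `genotype` is assumed to be a two-character string such as "AG".
    """
    if genotype is None:
        return None
    allele_1, allele_2 = genotype
    return 1 if allele_1 != allele_2 else 0


# Hypothetical genotypes for five animals at one SNP
genotypes = ["AA", "AG", "GG", None, "GA"]
recoded = [recode_heterozygosity(g) for g in genotypes]
print(recoded)  # [0, 1, 0, None, 1]
```

With this coding, a single-SNP association test compares heterozygotes against the pooled homozygotes instead of testing an additive allele-count effect.
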
Results - Dataset 1
GWAS for heterozygote advantage: SNP on BTA6 (position: 10,245,091) significant at the suggestive level
Further Q.C. for the SNP
Significance thresholds (Bonferroni correction for multiple testing), where N is the number of SNPs
» −log10(0.05 / N): genome-wide threshold
» −log10(1 / N): suggestive threshold

[Figure: Manhattan plot, −log10(p) by chromosome]
Genome-wide threshold   7.09
Suggestive threshold    5.79
rs43032684 (Chr 6)      6.29
rs109960101 (Chr 25)    4.94

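As a quick check of the two threshold formulas, the sketch below evaluates them for N = 617,885, the number of SNPs on the chip before QC. Using the pre-QC count is an assumption, made because it reproduces the plotted values of 7.09 and 5.79; the function name is illustrative.

```python
import math


def bonferroni_thresholds(n_snps, alpha=0.05):
    """Return (genome-wide, suggestive) thresholds on the -log10(p) scale."""
    genome_wide = -math.log10(alpha / n_snps)
    suggestive = -math.log10(1 / n_snps)
    return genome_wide, suggestive


genome_wide, suggestive = bonferroni_thresholds(617885)
print(f"{genome_wide:.2f} {suggestive:.2f}")  # 7.09 5.79
```
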
CMDS - Dataset 2
Classical Multidimensional Scaling (CMDS)
» visualising the level of dissimilarity between individuals based on their genome-wide IBS pairwise distance matrix (GenABEL, R/2.15.2)
» Calculate Principal Components (PCs)
» Identify individuals clustering separately
Individuals forming a distinct cluster (n = 40) originated from the same herd (possibly crossbreeding)

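The pairwise IBS (identity-by-state) distance matrix that feeds the scaling step can be sketched as below. This uses one common definition, the mean allele-count difference divided by two, with genotypes assumed to be coded 0/1/2; the exact computation in GenABEL may differ, and the scaling step itself is omitted. The toy animals are hypothetical.

```python
def ibs_distance(genotypes_1, genotypes_2):
    """IBS distance between two animals: 0 = identical genotypes, 1 = opposite homozygotes everywhere."""
    pairs = [(a, b) for a, b in zip(genotypes_1, genotypes_2)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no SNPs called in both animals")
    return sum(abs(a - b) for a, b in pairs) / (2 * len(pairs))


def distance_matrix(animals):
    """Symmetric matrix of pairwise IBS distances."""
    return [[ibs_distance(x, y) for y in animals] for x in animals]


# Hypothetical toy data: three animals genotyped at four SNPs
animals = [
    [0, 1, 2, 1],
    [0, 1, 2, 1],
    [2, 1, 0, 1],
]
matrix = distance_matrix(animals)
print(matrix[0][1], matrix[0][2])  # 0.0 0.5
```

Animals from a genetically distinct group have large distances to the rest of the sample, which is what makes them separate in the scaled plot.
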
CMDS - Dataset 2
Approaches to control for possible genetic substructure
1. Adjust for stratification by principal components
2. Remove the 40 individuals clustering separately

Results - Dataset 2
CMDS analysis: the SNP identified was consistently present across all approaches used to control for substructure

[Figure: Manhattan plots, −log10(p) by chromosome]
Genome-wide threshold                                7.09
Suggestive threshold                                 5.79
rs43032684 chr 6 (1. PCs)                            6.43
rs43032684 chr 6 (2. minor cluster removed)          6.09
rs109960101 chr 25 (3. minor cluster removed)        5.05

Results - Dataset 3
Analysis of the reduced dataset after removing herds contributing no controls: pattern on chromosome 6 still visible

[Figure: Manhattan plot, −log10(p) by chromosome]
Genome-wide threshold   7.09
Suggestive threshold    5.79
rs43032684 chr 6        5.25
rs109682541 chr 17      6.12

Results
Genotypic frequencies and HWE test
Hardy-Weinberg Equilibrium (HWE) test
» genotypic frequencies are calculated for each of the genotypic classes
» χ² test: significant departure of the observed genotypic frequencies from the expectations under HWE
» (a) full dataset, (b) separately for cases and controls and (c) pairwise between cases and controls

rs43032684   CASES   CONTROLS
A/A          239     286
A/G          254     154
G/G          62      63
NA           36      56
TOTALS       555     503

CONTROLS   AA    AG    GG   Total
Observed   286   154   63   503
Expected   262   203   39

Significant departure from HWE expectations for the controls (χ² = 28.6, p < 0.001) but not for the cases
Significantly fewer heterozygotes in the controls (χ² = 11.5, p < 0.001)

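The HWE test can be reproduced approximately from the genotype counts for rs43032684. The sketch below is a standard 1-degree-of-freedom χ² goodness-of-fit calculation and not necessarily the exact routine used in the study, so small rounding differences from the slide values (e.g. χ² = 28.6 for controls) are expected. The function name is illustrative.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of observed genotype counts against HWE expectations."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return expected, chi2, p_value


# Counts taken from the slide: (A/A, A/G, G/G)
controls_expected, controls_chi2, controls_p = hwe_chi_square(286, 154, 63)
cases_expected, cases_chi2, cases_p = hwe_chi_square(239, 254, 62)

print([round(e) for e in controls_expected], round(controls_chi2, 1), controls_p < 0.001)
print(round(cases_chi2, 1), cases_p > 0.05)
```

The controls show far fewer heterozygotes than expected and a χ² of roughly 28.5, while the cases give a χ² of about 0.2, in line with the statement that only the controls depart from HWE.
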
ASReml Analysis
1. Predicted genotypic means in ASReml (Gilmour et al. 2002)
» LM: Linear mixed model (genotypes for the SNP of interest as a fixed effect)
$Y_{ijkmp\beta} = \mu + a_i + D_j + S_k + R_m + B_p + X\beta + u_i + e_i$
» $X\beta$: SNP genotypes
2. Linear mixed model with an interaction component with a significant SNP identified by standard GWAS
3. Generalized Linear (Mixed) Model for binomial analysis (case-control)
» Threshold Models (TMs), link functions: PROBIT, LOGIT, LOGLOG

ASReml analysis - Estimated Genotypic Effects
Predicted value (standard error) per genotype

Model    F       P         GG            AG            AA
LM       13.45   < 0.001   0.59 (0.05)   0.72 (0.03)   0.55 (0.03)
PROBIT   13.74   < 0.001   0.61 (0.05)   0.73 (0.03)   0.56 (0.03)
LOGIT    12.64   < 0.001   0.60 (0.05)   0.73 (0.03)   0.56 (0.03)
LOGLOG   15.22   < 0.001   0.62 (0.05)   0.76 (0.03)   0.56 (0.03)

Interaction was not significant (P > 0.1)
The SNP identified explains 1.7% of the total phenotypic variance
Heterozygotes are more likely to have a diseased phenotype, displaying a heterozygote disadvantage pattern

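To see why these predicted values point to a dominance effect and not an additive one, the additive effect a = (AA − GG) / 2 and the dominance deviation d = AG − (AA + GG) / 2 can be computed from the first set of predicted values above (GG 0.59, AG 0.72, AA 0.55). This is a back-of-the-envelope sketch that ignores the standard errors; the function names and labels are illustrative.

```python
def genetic_effects(mean_aa, mean_ab, mean_bb):
    """Additive effect a and dominance deviation d from three genotypic means."""
    additive = (mean_aa - mean_bb) / 2
    dominance = mean_ab - (mean_aa + mean_bb) / 2
    return additive, dominance


def classify(additive, dominance):
    """Locate the heterozygote relative to the two homozygotes."""
    if dominance > abs(additive):
        return "heterozygote above both homozygotes"
    if dominance < -abs(additive):
        return "heterozygote below both homozygotes"
    return "heterozygote within the homozygote range"


# Predicted values from the slide, on the scale case = 1, control = 0
a, d = genetic_effects(mean_aa=0.55, mean_ab=0.72, mean_bb=0.59)
print(round(a, 3), round(d, 3), classify(a, d))
```

Here a is about −0.02 and d is about 0.15, so the heterozygote lies above both homozygotes. Because the phenotype is coded case = 1, a higher predicted value means a higher probability of disease, which is the heterozygote disadvantage pattern stated on the slide.
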
Discussion
SNP resides within the peroxiredoxin-6-like pseudogene
» pseudogene contains two exons and the SNP lies downstream of exon 1
Pseudogenes
» non-functional, but sometimes retain some functionality (e.g. gene expression regulation)
» derive from the parental gene through (a) reverse transcription or (b) gene duplication
Parental gene (PRDX6)
» peroxiredoxin-6: enzyme involved in lipid internalisation and degradation in alveolar macrophages
[Figure: pseudogene and its functional homologue]

Discussion
SNP adjacent to candidate genes and CNV regions
» Ensembl genome browser, region: 6:8,300,000-10,300,000 bp
» contains three previously identified Copy Number Variation (CNV) regions associated with gastrointestinal nematode resistance in cattle (Hou et al. 2012)

Conclusions
It is feasible to use markers across the genome to select individuals more resistant to bovine tuberculosis
A SNP was identified that suggests an association between locus heterozygosity and increased susceptibility to bTB in cattle
» significantly fewer heterozygotes than expected in the controls, i.e. heterozygotes are more likely to be diseased
» possibly a fitness disadvantage for the heterozygotes
» as documented for the Rhesus blood group system and chromosomal rearrangements
Comparison with results from standard genome scans
» no interaction was observed with the significant marker from GWAS
Further studies are needed to confirm these findings
» effects of the sample size on the power of the GWAS

Practical Implementation
Breeding for resistance to bTB
Implementing results
» Calculating Breeding Values for bTB resistance in dairy cattle
Stage 1: Using the skin test and abattoir-confirmed results and linking the animals by pedigree
Stage 2: Expand to Genomic Selection with SNPs
» SNP chip genotypes on widely used sires are already available
Stage 3: Incorporating results from our studies
» after validation

Thank you!
Bishop S.C., Woolliams J.A., Glass E.J., Pong-Wong R., Matika O., Pooley C.M., Bermingham M.L., Allen A.R., Skuce R.A., McBride S.H., Wright D.M., McDowell S.W.J.
We acknowledge financial support from the Roslin Institute and the University of Edinburgh through the Principal's Career Development PhD, the Greek State Scholarship Foundation and the Biotechnology and Biological Sciences Research Council.