Fully powered polygenic prediction using summary statistics

1 Fully powered polygenic prediction using summary statistics. Alkes L. Price, Harvard T.H. Chan School of Public Health. October 7, 2015. To download slides of this talk: google Alkes HSPH

2 Summary statistics are widely available. Nat Genet editorial, July 2012

3 Outline 1. A brief history of summary statistic genetics 2. Introduction to polygenic prediction using summary statistics 3. LDpred method for polygenic prediction using summary statistics 4. Application of LDpred to real data sets

4 Outline 1. A brief history of summary statistic genetics 2. Introduction to polygenic prediction using summary statistics 3. LDpred method for polygenic prediction using summary statistics 4. Application of LDpred to real data sets

5 Definition of summary statistics
Definition: Summary statistics consist of GWAS association z-scores for each typed or imputed SNP, plus the sample sizes on which the z-scores were computed (may vary by SNP).
Note: Many applications also require LD information computed from a reference panel (e.g. 1000 Genomes or UK10K) using a population very similar to the target sample.

6 Meta-analysis can be performed using summary statistics. Evangelou & Ioannidis 2013 Nat Rev Genet

7 Joint and conditional analysis can be performed using summary statistics. Yang et al. 2012 Nat Genet

8 Imputation can be performed using summary statistics. Lee et al. 2013 Bioinformatics; Pasaniuc et al. 2014 Bioinformatics; also see Park et al. 2015 Bioinformatics, Lee et al. 2015 Bioinformatics

9 Rare variant meta-analysis can be performed using summary statistics. Lee et al. 2013 AJHG; Hu et al. 2013 AJHG; Liu et al. 2014 Nat Genet; also see Clarke et al. 2013 PLoS Genet, Tang & Lin 2015 AJHG

10 Genetic variance and covariance can be inferred using summary statistics. Palla & Dudbridge 2015 AJHG; Bulik-Sullivan et al. 2015 Nat Genet

11 Functional enrichment can be inferred using summary statistics. Pickrell 2014 AJHG; Kichaev & Pasaniuc 2015 AJHG; Finucane et al. 2015 Nat Genet

12 Many projects at ASHG 2015 using summary statistics
Invited talks: Pickrell, Pasaniuc, Im (this session)
Platform talks: 11 Gusev, 77 Cichonska, 0 Golan, 7 Park
Posters: 791 Kichaev, 797 Shi, 807 Roytman, 860 Salem, 868 Pare, 1301 Wu, 1334 Zhu, 1357 Chatterjee, 1477 Brown, 1618 Li, 1668 Khawaja, 1686 Lee, 1687 Zhao, 178 Torres, 1867 O'Connor

13 Outline 1. A brief history of summary statistic genetics 2. Introduction to polygenic prediction using summary statistics 3. LDpred method for polygenic prediction using summary statistics 4. Application of LDpred to real data sets

14 Genetic prediction: why care? Erbe et al. 2012 J Dairy Sci; Goss et al. 2011 New Engl J Med

15 Using only genome-wide significant SNPs is a Stone Age genetic prediction method
"How should we conduct genetic prediction, Fred?"
$\hat{\phi}_k = \sum_{i \in \text{published SNPs}} \hat{\beta}_i x_{ik}$
$\phi_k$ = phenotype for sample k; $\beta_i$ = effect size for SNP i; $x_{ik}$ = genotype for SNP i, sample k
Prediction $r^2$ is less than half the $r^2$ attained by polygenic prediction.
PGC-SCZ 2014 Nature; Vilhjalmsson et al. 2015 AJHG

16 Polygenic prediction can be performed using genome-wide summary statistics
$\hat{\phi}_k = \sum_{i \in \text{all GWAS SNPs}} \hat{\beta}_i x_{ik}$
$\phi_k$ = phenotype for sample k; $\beta_i$ = effect size for SNP i; $x_{ik}$ = genotype for SNP i, sample k
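The sum on this slide is a matrix-vector product over samples and SNPs. A minimal sketch follows, assuming NumPy and a made-up toy genotype matrix; the function name `polygenic_scores` and all input values are hypothetical illustrations, not part of the talk.

```python
import numpy as np

def polygenic_scores(genotypes, beta_hat):
    """Return one score per sample k: sum over SNPs i of beta_hat[i] * x[k, i].

    genotypes: shape (n_samples, n_snps), e.g. allele counts (toy input here;
               real analyses typically standardize genotypes first).
    beta_hat:  shape (n_snps,), marginal GWAS effect size estimates.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    beta_hat = np.asarray(beta_hat, dtype=float)
    if genotypes.shape[1] != beta_hat.shape[0]:
        raise ValueError("genotypes and beta_hat must cover the same SNPs")
    return genotypes @ beta_hat

# Hypothetical toy data: 2 samples, 3 SNPs.
toy_genotypes = [[0, 1, 2],
                 [2, 0, 1]]
toy_beta_hat = [0.10, -0.20, 0.05]
toy_scores = polygenic_scores(toy_genotypes, toy_beta_hat)
print(toy_scores)
```

The same function covers the "published SNPs" variant by setting the effect estimates of all other SNPs to zero.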

17 Is polygenic prediction using raw genotypes more accurate than using summary statistics? Answer: slightly.
Using summary statistics (fit each SNP individually): $r^2 = h_g^2 \cdot \frac{h_g^2}{h_g^2 + M/N}$
Using raw genotypes (fit all SNPs simultaneously; BLUP prediction, Henderson 1975 Biometrics): $r^2 = h_g^2 \cdot \frac{h_g^2}{h_g^2 + (1 - r^2) M/N}$
The summary-statistic $r^2$ is smaller than the raw-genotype $r^2$.
$h_g^2$ = heritability explained by SNPs; M = number of (unlinked) SNPs; N = number of training samples
Daetwyler et al. 2008 PLoS ONE; Wray et al. 2013 Nat Rev Genet; also see Speed & Balding 2014 Genome Res (MultiBLUP)
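To get a feel for how small the gap is, the two expected-accuracy expressions, r^2 = h_g^2 * h_g^2 / (h_g^2 + M/N) for summary statistics and r^2 = h_g^2 * h_g^2 / (h_g^2 + (1 - r^2) M/N) for raw genotypes, can be evaluated numerically. The sketch below solves the implicit raw-genotype equation by fixed-point iteration, which is an assumption made here for simplicity; the parameter values are invented for illustration.

```python
def expected_r2_summary(h2_g, n_snps, n_train):
    """Expected r^2 when each SNP is fitted individually."""
    return h2_g * h2_g / (h2_g + n_snps / n_train)

def expected_r2_blup(h2_g, n_snps, n_train, n_iter=200):
    """Expected r^2 when all SNPs are fitted simultaneously.

    r^2 appears on both sides of the equation, so iterate from the
    summary-statistic value until the update stops changing it.
    """
    r2 = expected_r2_summary(h2_g, n_snps, n_train)
    for _ in range(n_iter):
        r2 = h2_g * h2_g / (h2_g + (1.0 - r2) * n_snps / n_train)
    return r2

# Hypothetical values: h_g^2 = 0.5, M = 60,000 unlinked SNPs, N = 10,000 samples.
r2_sum = expected_r2_summary(0.5, 60000, 10000)
r2_blup = expected_r2_blup(0.5, 60000, 10000)
print(r2_sum, r2_blup)
```

With M/N well above 1, the factor (1 - r^2) stays close to 1, which is why the two accuracies differ only slightly in this regime.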

18 Accounting for non-infinitesimal architectures can improve polygenic prediction
Infinitesimal (Gaussian) architecture: $\beta_i \sim N(0, h_g^2/M)$, $\hat{\beta}_i \sim \beta_i + N(0, 1/N)$ $\Rightarrow$ $E(\beta_i \mid \hat{\beta}_i) = \frac{h_g^2}{h_g^2 + M/N} \hat{\beta}_i$
Uniform shrink on estimated effect sizes $\hat{\beta}_i$ is appropriate.

19 Accounting for non-infinitesimal architectures can improve polygenic prediction
Non-infinitesimal architecture (e.g. point-normal mixture, mixture of normals, etc.): non-uniform shrink on estimated effect sizes $\hat{\beta}_i$ is appropriate.

20 Accounting for non-infinitesimal architectures can improve polygenic prediction
Infinitesimal (Gaussian) architecture: $\beta_i \sim N(0, h_g^2/M)$, $\hat{\beta}_i \sim \beta_i + N(0, 1/N)$ $\Rightarrow$ $E(\beta_i \mid \hat{\beta}_i) = \frac{h_g^2}{h_g^2 + M/N} \hat{\beta}_i$. Uniform shrink on estimated effect sizes $\hat{\beta}_i$ is appropriate.
Non-infinitesimal architecture (e.g. point-normal mixture, mixture of normals, etc.): non-uniform shrink on estimated effect sizes $\hat{\beta}_i$ is appropriate.
Standard heuristic approach: P-value thresholding, $\hat{\phi}_k = \sum_{i:\, P\text{-value} < P_T} \hat{\beta}_i x_{ik}$ (Note: requires optimization of the $P_T$ threshold in validation samples)
Purcell et al. 2009 Nature; Chatterjee et al. 2013 Nat Genet; Dudbridge 2013 PLoS Genet
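The two shrinkage styles contrasted on this slide can be sketched side by side: a uniform shrink multiplies every estimate by h_g^2 / (h_g^2 + M/N), while P-value thresholding keeps an estimate unchanged or sets it to zero. The code below is an illustration under stated assumptions: two-sided P-values are derived from z-scores via the normal distribution, and the function names and toy numbers are hypothetical.

```python
import math

def two_sided_p(z):
    """Two-sided normal P-value for a z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def uniform_shrink(beta_hat, h2_g, n_snps, n_train):
    """Infinitesimal prior, no LD: same shrink factor for every SNP."""
    factor = h2_g / (h2_g + n_snps / n_train)
    return [factor * b for b in beta_hat]

def threshold_effects(beta_hat, z_scores, p_threshold):
    """P-value thresholding: keep an estimate only if its P-value < P_T."""
    return [b if two_sided_p(z) < p_threshold else 0.0
            for b, z in zip(beta_hat, z_scores)]

# Hypothetical toy summary statistics for three SNPs.
toy_beta_hat = [0.05, 0.005, -0.03]
toy_z = [5.0, 0.5, -3.0]
kept = threshold_effects(toy_beta_hat, toy_z, p_threshold=0.01)
shrunk = uniform_shrink(toy_beta_hat, h2_g=0.5, n_snps=1000, n_train=1000)
print(kept, shrunk)
```

In practice the threshold P_T would be chosen by scoring validation samples over a grid of candidate values, as the note on the slide indicates.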

21 Accounting for linkage disequilibrium can improve polygenic prediction
Problem: $\hat{\phi}_k = \sum_{i:\, P\text{-value} < P_T} \hat{\beta}_i x_{ik}$ does not account for LD between SNPs.
Standard heuristic approaches:
Random LD-pruning: prune SNPs (e.g. $r^2 < 0.2$), removing one of each pair of linked SNPs (decide randomly which SNP to remove).
Informed LD-pruning (LD-clumping): prune SNPs, removing one of each pair of linked SNPs (remove the SNP with the less significant P-value in training data).
Purcell et al. 2009 Nature; Stahl et al. 2012 Nat Genet; also see Rietveld et al. 2013 Science (COJO)
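Informed LD-pruning is often implemented greedily: visit SNPs from most to least significant and keep a SNP only if it is not in LD with any SNP already kept. The sketch below follows that common greedy scheme, which is an assumption rather than the exact procedure of any cited paper; it also ignores the physical-distance window that real tools usually apply. The function name, the r^2 matrix, and the P-values are hypothetical.

```python
def ld_clump(p_values, r2, r2_threshold=0.2):
    """Greedy LD-clumping.

    p_values: training-data P-value per SNP.
    r2:       square matrix (list of lists) of pairwise LD r^2.
    Returns the sorted indices of retained SNPs.
    """
    # Most significant SNPs get the first chance to be kept.
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    kept = []
    for i in order:
        # Drop SNP i if it is linked to any more significant kept SNP.
        if all(r2[i][j] < r2_threshold for j in kept):
            kept.append(i)
    return sorted(kept)

# Hypothetical toy data: SNPs 0 and 1 are linked, SNPs 2 and 3 are linked.
toy_p = [1e-8, 1e-5, 0.03, 0.001]
toy_r2 = [[1.0, 0.8, 0.0, 0.0],
          [0.8, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.5],
          [0.0, 0.0, 0.5, 1.0]]
toy_kept = ld_clump(toy_p, toy_r2, r2_threshold=0.2)
print(toy_kept)
```

Random LD-pruning differs only in the visiting order: shuffling `order` instead of sorting by P-value gives that variant.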

22 Pruning + Thresholding is widely used. Purcell et al. 2009 Nature; Lango Allen et al. 2010 Nature; Ripke et al. 2011 Nat Genet; Stahl et al. 2012 Nat Genet; Deloukas et al. 2013 Nat Genet; Ripke et al. 2013 Nat Genet; Chatterjee et al. 2013 Nat Genet; Dudbridge 2013 PLoS Genet; PGC-SCZ 2014 Nature

23 Pruning + Thresholding is widely used, but does not attain maximum prediction accuracy
[Figure: simulations at different proportions p of causal SNPs, ranging from non-infinitesimal to infinitesimal architectures; reference level $h_g^2$]
Vilhjalmsson et al. 2015 AJHG

24 Outline 1. A brief history of summary statistic genetics 2. Introduction to polygenic prediction using summary statistics 3. LDpred method for polygenic prediction using summary statistics 4. Application of LDpred to real data sets

25 LDpred computes posterior means under a point-normal prior, accounting for LD
$\hat{\phi}_k = \sum_{i \in \text{all GWAS SNPs}} E(\beta_i \mid \hat{\beta}_i) x_{ik}$, where $E(\beta_i \mid \hat{\beta}_i)$ are posterior mean effect sizes
$\phi_k$ = phenotype for sample k; $\beta_i$ = effect size for SNP i; $x_{ik}$ = genotype for SNP i, sample k
Vilhjalmsson et al. 2015 AJHG

26 LDpred computes posterior means under a point-normal prior, accounting for LD
$\hat{\phi}_k = \sum_{i \in \text{all GWAS SNPs}} E(\beta_i \mid \hat{\beta}_i) x_{ik}$
$\phi_k$ = phenotype for sample k; $\beta_i$ = effect size for SNP i; $x_{ik}$ = genotype for SNP i, sample k
where $E(\beta_i \mid \hat{\beta}_i)$ are posterior mean effect sizes based on a point-normal prior with parameters:
$h_g^2$ = heritability explained by SNPs (estimated from training data)
p = proportion of causal SNPs (optimized in validation samples)
LD from a reference panel: use validation samples as LD reference (restrict to SNPs with validation data)
Vilhjalmsson et al. 2015 AJHG

27 In the special case of no LD between SNPs, posterior means can be computed analytically
$E(\beta_i \mid \hat{\beta}_i) = \frac{h_g^2}{h_g^2 + Mp/N} \, p_i \, \hat{\beta}_i$
where
$p_i = \frac{\frac{p}{\sqrt{h_g^2/(Mp) + 1/N}} \, e^{-\hat{\beta}_i^2 / \left(2 (h_g^2/(Mp) + 1/N)\right)}}{\frac{p}{\sqrt{h_g^2/(Mp) + 1/N}} \, e^{-\hat{\beta}_i^2 / \left(2 (h_g^2/(Mp) + 1/N)\right)} + \frac{1-p}{\sqrt{1/N}} \, e^{-\hat{\beta}_i^2 / \left(2 (1/N)\right)}}$
is the posterior probability that $\beta_i \neq 0$, i.e. SNP i is causal
(generalizes uniform shrink when p = 1: infinitesimal prior, no LD)
$h_g^2$ = heritability explained by SNPs; p = proportion of causal SNPs; M = number of (unlinked) SNPs; N = number of training samples
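The no-LD posterior mean is a product of three terms: the shrink factor h_g^2 / (h_g^2 + Mp/N), the posterior probability that the SNP is causal, and the marginal estimate. The sketch below evaluates it for a single SNP; the function names and toy values are hypothetical, and no safeguard against floating-point underflow is included.

```python
import math

def normal_density(x, variance):
    """Density of N(0, variance) at x."""
    return math.exp(-0.5 * x * x / variance) / math.sqrt(2.0 * math.pi * variance)

def posterior_mean_no_ld(beta_hat_i, h2_g, p, n_snps, n_train):
    """Posterior mean and posterior causal probability for one SNP.

    Point-normal prior: effect is N(0, h2_g / (M p)) with probability p,
    exactly zero otherwise; the estimate carries N(0, 1/N) noise.
    """
    var_if_causal = h2_g / (n_snps * p) + 1.0 / n_train  # variance of beta_hat if causal
    var_if_null = 1.0 / n_train                          # variance of beta_hat if null
    w_causal = p * normal_density(beta_hat_i, var_if_causal)
    w_null = (1.0 - p) * normal_density(beta_hat_i, var_if_null)
    post_causal = w_causal / (w_causal + w_null)
    shrink = h2_g / (h2_g + n_snps * p / n_train)
    return post_causal * shrink * beta_hat_i, post_causal

# Hypothetical toy values: a large and a small estimate under the same prior.
mean_large, prob_large = posterior_mean_no_ld(0.2, 0.5, 0.01, 1000, 1000)
mean_small, prob_small = posterior_mean_no_ld(0.01, 0.5, 0.01, 1000, 1000)
print(prob_large, prob_small)
```

Large estimates are kept almost intact while small ones are shrunk close to zero, which is the non-uniform shrink that P-value thresholding approximates with a hard cutoff.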

28 In the special case of infinitesimal prior (with LD), posterior means can be computed analytically
$E(\beta \mid \hat{\beta}) = \left(D + \frac{M}{N h_g^2} I\right)^{-1} \hat{\beta}$ (vector form; entry i is the posterior mean for SNP i)
where D is an LD matrix from a reference panel
(generalizes uniform shrink when D = I: infinitesimal prior, no LD)
$h_g^2$ = heritability explained by SNPs; M = number of (unlinked) SNPs; N = number of training samples
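The infinitesimal-prior posterior mean is a ridge-like linear system, (D + M / (N h_g^2) I) x = beta_hat, which can be solved directly without forming the inverse. A minimal sketch assuming NumPy follows; the function name, the 2-SNP LD matrix, and all parameter values are made up for illustration.

```python
import numpy as np

def posterior_mean_infinitesimal(beta_hat, ld_matrix, h2_g, n_snps, n_train):
    """Solve (D + M / (N * h2_g) * I) x = beta_hat for the posterior means.

    n_snps is the genome-wide M used in the prior; it is passed explicitly
    because beta_hat may cover only one LD region.
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    ld_matrix = np.asarray(ld_matrix, dtype=float)
    ridge = n_snps / (n_train * h2_g)
    system = ld_matrix + ridge * np.eye(beta_hat.shape[0])
    return np.linalg.solve(system, beta_hat)

# Hypothetical toy region: two SNPs in strong LD tagging one signal.
toy_ld = [[1.0, 0.9],
          [0.9, 1.0]]
toy_beta_hat = [0.10, 0.09]
toy_post = posterior_mean_infinitesimal(toy_beta_hat, toy_ld,
                                        h2_g=0.5, n_snps=1000, n_train=1000)
print(toy_post)
```

Setting `ld_matrix` to the identity reduces every entry to the uniform shrink h_g^2 / (h_g^2 + M/N) times the marginal estimate, matching the parenthetical remark on the slide.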

29 General case of non-infinitesimal prior with LD: posterior means cannot be computed analytically

30 General case of non-infinitesimal prior with LD: posterior means cannot be computed analytically Possible solutions: Assume 1 causal variant per locus

31 General case of non-infinitesimal prior with LD: posterior means cannot be computed analytically Possible solutions: Assume 1 causal variant per locus Iterative approach

32 General case of non-infinitesimal prior with LD: posterior means cannot be computed analytically Possible solutions: Assume 1 causal variant per locus Iterative approach MCMC

33 General case of non-infinitesimal prior with LD: posterior means cannot be computed analytically
Solution: use MCMC.
Initialize $\beta_i = 0$.
At each big iteration, for each SNP i: re-sample $\beta_i$ based on the point-normal prior on $\beta_i$ and the observed $\hat{\beta} \sim N(D\beta, D/N)$:
$f(\beta_i \mid \hat{\beta}) \propto f(\beta_i) \, e^{-\frac{N}{2} (\hat{\beta} - D\beta)^T D^{-1} (\hat{\beta} - D\beta)}$, where $f(\beta_i)$ reflects the point-normal prior (based on $h_g^2$ and p)

34 General case of non-infinitesimal prior with LD: posterior means cannot be computed analytically
Solution: use MCMC.
Initialize $\beta_i = 0$.
At each big iteration, for each SNP i: re-sample $\beta_i$ based on the point-normal prior on $\beta_i$ and the observed $\hat{\beta} \sim N(D\beta, D/N)$.
100 big iterations generally suffice for convergence.
Rao-Blackwellization: average the posterior means sampled.
Related MCMC methods for prediction from raw genotypes are described in Erbe et al. 2012 J Dairy Sci, Zhou et al. 2013 PLoS Genet, Moser et al. 2015 PLoS Genet
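The sampler outlined on this slide can be sketched as a Gibbs sweep. Under the likelihood beta_hat ~ N(D beta, D/N) with unit diagonal in D, the conditional for one SNP given all others depends on the data only through the residual beta_hat_i minus the LD-weighted effects of the other SNPs, so the no-LD point-normal formulas apply to that residual. The code below is an illustrative sketch in that spirit, not the LDpred implementation; the burn-in length, function name, and toy inputs are assumptions, and it requires 0 < p < 1.

```python
import numpy as np

def gibbs_posterior_means(beta_hat, ld_matrix, h2_g, p, n_snps, n_train,
                          n_iter=100, n_burn_in=10, seed=0):
    """Rao-Blackwellized posterior means under a point-normal prior with LD."""
    rng = np.random.default_rng(seed)
    beta_hat = np.asarray(beta_hat, dtype=float)
    ld = np.asarray(ld_matrix, dtype=float)
    m = beta_hat.shape[0]
    var_causal = h2_g / (n_snps * p)   # prior variance of a causal effect
    var_noise = 1.0 / n_train          # sampling variance of an estimate
    shrink = var_causal / (var_causal + var_noise)
    beta = np.zeros(m)                 # initialize beta_i = 0
    mean_sum = np.zeros(m)
    for it in range(n_iter):           # one "big iteration" = one sweep over SNPs
        for i in range(m):
            # Estimate for SNP i with the other SNPs' LD contributions removed.
            resid = beta_hat[i] - ld[i] @ beta + ld[i, i] * beta[i]
            log_causal = (np.log(p) - 0.5 * np.log(var_causal + var_noise)
                          - 0.5 * resid ** 2 / (var_causal + var_noise))
            log_null = (np.log1p(-p) - 0.5 * np.log(var_noise)
                        - 0.5 * resid ** 2 / var_noise)
            post_causal = 1.0 / (1.0 + np.exp(log_null - log_causal))
            cond_mean = shrink * resid
            if it >= n_burn_in:
                # Rao-Blackwellization: accumulate the conditional posterior mean.
                mean_sum[i] += post_causal * cond_mean
            # Re-sample beta_i from its conditional (point mass at 0 or normal).
            if rng.random() < post_causal:
                beta[i] = rng.normal(cond_mean, np.sqrt(shrink * var_noise))
            else:
                beta[i] = 0.0
    return mean_sum / (n_iter - n_burn_in)

# Hypothetical toy region: two SNPs in LD.
toy_est = gibbs_posterior_means([0.05, 0.04], [[1.0, 0.6], [0.6, 1.0]],
                                h2_g=0.5, p=0.01, n_snps=1000, n_train=1000)
print(toy_est)
```

Averaging the conditional posterior means, rather than the sampled effects themselves, reduces Monte Carlo noise; with an identity LD matrix the result coincides with the analytic no-LD posterior mean.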

35 LDpred performs well in simulations Simulations with real genotypes, 1% of SNPs causal

36 Understanding polygenic prediction. "Let's hide away and dance." -- Freddie K. "Let's hide away with data." -- Alkes

37 Outline 1. A brief history of summary statistic genetics 2. Introduction to polygenic prediction using summary statistics 3. LDpred method for polygenic prediction using summary statistics 4. Application of LDpred to real data sets

38 LDpred performs well on within-cohort prediction of WTCCC traits
Data from WTCCC 2007 Nature. Results are similar to MCMC-based methods that require raw genotypes: Zhou et al. 2013 PLoS Genet, Moser et al. 2015 PLoS Genet

39 LDpred performs well on within-cohort prediction of WTCCC traits
Data from WTCCC 2007 Nature. Results are similar to MCMC-based methods that require raw genotypes: Zhou et al. 2013 PLoS Genet, Moser et al. 2015 PLoS Genet
$R^2_{\text{nag}}$, $R^2_{\text{obs}}$, $R^2_{\text{liab}}$ (see Lee et al. 2012 Genet Epidemiol)

40 LDpred performs well on within-cohort prediction of WTCCC traits
Data from WTCCC 2007 Nature. Results are similar to MCMC-based methods that require raw genotypes: Zhou et al. 2013 PLoS Genet, Moser et al. 2015 PLoS Genet
Dominated by HLA

41 LDpred performs well on within-cohort prediction of WTCCC traits
Data from WTCCC 2007 Nature. Results are similar to MCMC-based methods that require raw genotypes: Zhou et al. 2013 PLoS Genet, Moser et al. 2015 PLoS Genet
Do not validate in new cohort

42 ... but within-cohort prediction accuracy may be too good to be true
[Figure: $R^2_{\text{nag}}$ for CAD and T2D; training: WTCCC, validation: WTCCC versus training: WTCCC, validation: WGHS]
Results presented for LDpred; similar relative results for other methods.
Cryptic relatedness? Population structure? (Wray et al. 2013 Nat Rev Genet)

43 LDpred performs well on summary statistics with independent validation cohorts. Training N=70K. PGC-SCZ 2014 Nature; MGS replication sample

44 LDpred performs well on summary statistics with independent validation cohorts Training N=70K Training N=30K Training N=60K

45 LDpred performs well on summary statistics with independent validation cohorts Training N=70K Training N=30K Training N=60K Training N=70K Training N=90K

46 LDpred performs well on summary statistics with independent validation cohorts
Height: complexities due to population stratification. Including PCs can improve prediction accuracy (Chen et al. 2015 Genet Epidemiol). Training N=130K (Lango Allen et al. 2010 Nature)

47 Conclusions
Explicitly modeling both LD and non-infinitesimal architectures improves polygenic prediction from summary statistics.
Polygenic prediction should be evaluated using independent validation cohorts.
Although polygenic predictions are not yet clinically useful, prediction accuracies will increase as sample sizes increase (bounded by heritability explained by SNPs, $h_g^2$).

48 ... and Future directions
Polygenic prediction in non-European samples is challenging. How to combine training data from Europeans (large sample size) with training data from the target population (small sample size)? (cross-population genetic correlation; Poster 1477 Brown)
Enrichment of heritability in functional annotation classes could potentially be used to improve polygenic prediction (Poster 1357 Chatterjee).
Methods for large raw genotype data sets (e.g. UK Biobank) should be developed in parallel with summary statistic methods (Platform talk 38 Loh; Platform talk 170 Young).

49 Acknowledgements
Bjarni Vilhjalmsson + Vilhjalmsson et al. 2015 AJHG co-authors. Everyone in alkesgrp.
Please check out our other ASHG 2015 talks:
Platform talk 11 Gusev: Large-scale transcriptome-wide association study
Platform talk 38 Loh: Contrasting regional architectures of schizophrenia
Platform talk 196 Bhatia: Haplotypes of common SNPs explain a large
Platform talk 35 Galinsky: Population differentiation analysis of 54,734
Platform talk 346 Hayeck: Mixed model association with family-biased
Platform talk 354 Palamara: Leveraging distant relatedness to quantify

Online Supplement to Polygenic Influence on Educational Attainment. Genotyping was conducted with the Illumina HumanOmni1-Quad v1 platform using

Online Supplement to Polygenic Influence on Educational Attainment. Genotyping was conducted with the Illumina HumanOmni1-Quad v1 platform using Online Supplement to Polygenic Influence on Educational Attainment Construction of Polygenic Score for Educational Attainment Genotyping was conducted with the Illumina HumanOmni1-Quad v1 platform using

More information

Factors for success in big data science

Factors for success in big data science Factors for success in big data science Damjan Vukcevic Data Science Murdoch Childrens Research Institute 16 October 2014 Big Data Reading Group (Department of Mathematics & Statistics, University of Melbourne)

More information

EHRs and large scale comparative effectiveness research

EHRs and large scale comparative effectiveness research EHRs and large scale comparative effectiveness research September 16, 2014 Dana C. Crawford, PhD Associate Professor Epidemiology and Biostatistics Institute for Computational Biology Single Nucleotide

More information

Logistic Regression (1/24/13)

Logistic Regression (1/24/13) STA63/CBB540: Statistical methods in computational biology Logistic Regression (/24/3) Lecturer: Barbara Engelhardt Scribe: Dinesh Manandhar Introduction Logistic regression is model for regression used

More information

GENOMIC SELECTION: THE FUTURE OF MARKER ASSISTED SELECTION AND ANIMAL BREEDING

GENOMIC SELECTION: THE FUTURE OF MARKER ASSISTED SELECTION AND ANIMAL BREEDING GENOMIC SELECTION: THE FUTURE OF MARKER ASSISTED SELECTION AND ANIMAL BREEDING Theo Meuwissen Institute for Animal Science and Aquaculture, Box 5025, 1432 Ås, Norway, theo.meuwissen@ihf.nlh.no Summary

More information

Investigating the genetic basis for intelligence

Investigating the genetic basis for intelligence Investigating the genetic basis for intelligence Steve Hsu University of Oregon and BGI www.cog-genomics.org Outline: a multidisciplinary subject 1. What is intelligence? Psychometrics 2. g and GWAS: a

More information

GOBII. Genomic & Open-source Breeding Informatics Initiative

GOBII. Genomic & Open-source Breeding Informatics Initiative GOBII Genomic & Open-source Breeding Informatics Initiative My Background BS Animal Science, University of Tennessee MS Animal Breeding, University of Georgia Random regression models for longitudinal

More information

Marker-Assisted Backcrossing. Marker-Assisted Selection. 1. Select donor alleles at markers flanking target gene. Losing the target allele

Marker-Assisted Backcrossing. Marker-Assisted Selection. 1. Select donor alleles at markers flanking target gene. Losing the target allele Marker-Assisted Backcrossing Marker-Assisted Selection CS74 009 Jim Holland Target gene = Recurrent parent allele = Donor parent allele. Select donor allele at markers linked to target gene.. Select recurrent

More information

From Disease Association to Risk Assessment: An Optimistic View from Genome-Wide Association Studies on Type 1 Diabetes

From Disease Association to Risk Assessment: An Optimistic View from Genome-Wide Association Studies on Type 1 Diabetes From Disease Association to Risk Assessment: An Optimistic View from Genome-Wide Association Studies on Type 1 Diabetes Zhi Wei 1., Kai Wang 2., Hui-Qi Qu 3, Haitao Zhang 2, Jonathan Bradfield 2, Cecilia

More information

GEMMA User Manual. Xiang Zhou. May 18, 2016

GEMMA User Manual. Xiang Zhou. May 18, 2016 GEMMA User Manual Xiang Zhou May 18, 2016 Contents 1 Introduction 4 1.1 What is GEMMA...................................... 4 1.2 How to Cite GEMMA................................... 4 1.3 Models............................................

More information

Towards running complex models on big data

Towards running complex models on big data Towards running complex models on big data Working with all the genomes in the world without changing the model (too much) Daniel Lawson Heilbronn Institute, University of Bristol 2013 1 / 17 Motivation

More information

Publication List. Chen Zehua Department of Statistics & Applied Probability National University of Singapore

Publication List. Chen Zehua Department of Statistics & Applied Probability National University of Singapore Publication List Chen Zehua Department of Statistics & Applied Probability National University of Singapore Publications Journal Papers 1. Y. He and Z. Chen (2014). A sequential procedure for feature selection

More information

Combining Data from Different Genotyping Platforms. Gonçalo Abecasis Center for Statistical Genetics University of Michigan

Combining Data from Different Genotyping Platforms. Gonçalo Abecasis Center for Statistical Genetics University of Michigan Combining Data from Different Genotyping Platforms Gonçalo Abecasis Center for Statistical Genetics University of Michigan The Challenge Detecting small effects requires very large sample sizes Combined

More information

GAW 15 Problem 3: Simulated Rheumatoid Arthritis Data Full Model and Simulation Parameters

GAW 15 Problem 3: Simulated Rheumatoid Arthritis Data Full Model and Simulation Parameters GAW 15 Problem 3: Simulated Rheumatoid Arthritis Data Full Model and Simulation Parameters Michael B Miller , Michael Li , Gregg Lind , Soon-Young

More information

Core Facility Genomics

Core Facility Genomics Core Facility Genomics versatile genome or transcriptome analyses based on quantifiable highthroughput data ascertainment 1 Topics Collaboration with Harald Binder and Clemens Kreutz Project: Microarray

More information

Global Alliance. Ewan Birney Associate Director EMBL-EBI

Global Alliance. Ewan Birney Associate Director EMBL-EBI Global Alliance Ewan Birney Associate Director EMBL-EBI Our world is changing Research to Medical Research English as language Lightweight legal Identical/similar systems Open data Publications Grant-funding

More information

Building risk prediction models - with a focus on Genome-Wide Association Studies. Charles Kooperberg

Building risk prediction models - with a focus on Genome-Wide Association Studies. Charles Kooperberg Building risk prediction models - with a focus on Genome-Wide Association Studies Risk prediction models Based on data: (D i, X i1,..., X ip ) i = 1,..., n we like to fit a model P(D = 1 X 1,..., X p )

More information

C-Reactive Protein and Diabetes: proving a negative, for a change?

C-Reactive Protein and Diabetes: proving a negative, for a change? C-Reactive Protein and Diabetes: proving a negative, for a change? Eric Brunner PhD FFPH Reader in Epidemiology and Public Health MRC Centre for Causal Analyses in Translational Epidemiology 2 March 2009

More information

VISUAL INTEGRATION OF RESULTS FROM A LARGE DNA BIOBANK (BIOVU) USING SYNTHESIS-VIEW *

VISUAL INTEGRATION OF RESULTS FROM A LARGE DNA BIOBANK (BIOVU) USING SYNTHESIS-VIEW * VISUAL INTEGRATION OF RESULTS FROM A LARGE DNA BIOBANK (BIOVU) USING SYNTHESIS-VIEW * SARAH PENDERGRASS Center for Human Genetics Research, Department of Molecular Physiology and Biophysics, Vanderbilt

More information

Data Science - A Glossary of Downloadabytes

Data Science - A Glossary of Downloadabytes Our future in big data science Damjan Vukcevic http://damjan.vukcevic.net/ 13 October 2015 SSA Canberra, Young Statisticians Workshop What is big data? You know it when you see it? Tell-tale signs: Need

More information

Genomic Selection in. Applied Training Workshop, Sterling. Hans Daetwyler, The Roslin Institute and R(D)SVS

Genomic Selection in. Applied Training Workshop, Sterling. Hans Daetwyler, The Roslin Institute and R(D)SVS Genomic Selection in Dairy Cattle AQUAGENOME Applied Training Workshop, Sterling Hans Daetwyler, The Roslin Institute and R(D)SVS Dairy introduction Overview Traditional breeding Genomic selection Advantages

More information

Globally, about 9.7% of cancers in men are prostate cancers, and the risk of developing the

Globally, about 9.7% of cancers in men are prostate cancers, and the risk of developing the Chapter 5 Analysis of Prostate Cancer Association Study Data 5.1 Risk factors for Prostate Cancer Globally, about 9.7% of cancers in men are prostate cancers, and the risk of developing the disease has

More information

SeattleSNPs Interactive Tutorial: Web Tools for Site Selection, Linkage Disequilibrium and Haplotype Analysis

SeattleSNPs Interactive Tutorial: Web Tools for Site Selection, Linkage Disequilibrium and Haplotype Analysis SeattleSNPs Interactive Tutorial: Web Tools for Site Selection, Linkage Disequilibrium and Haplotype Analysis Goal: This tutorial introduces several websites and tools useful for determining linkage disequilibrium

More information

Seeing Faces and History through Human Genome Sequences

Seeing Faces and History through Human Genome Sequences Seeing Faces and History through Human Genome Sequences CAS/MPG Partner Group on the Human Functional Genetic Variations Shanghai-Leipzig, 2011.2.1 2016.1.31 Prof. Dr. TANG Kun (middle) with his cooperator,

More information

Work Package 13.5: Authors: Paul Flicek and Ilkka Lappalainen. 1. Introduction

Work Package 13.5: Authors: Paul Flicek and Ilkka Lappalainen. 1. Introduction Work Package 13.5: Report summarising the technical feasibility of the European Genotype Archive to collect, store, and use genotype data stored in European biobanks in a manner that complies with all

More information

NGS and complex genetics

NGS and complex genetics NGS and complex genetics Robert Kraaij Genetic Laboratory Department of Internal Medicine r.kraaij@erasmusmc.nl Gene Hunting Rotterdam Study and GWAS Next Generation Sequencing Gene Hunting Mendelian gene

More information

False Discovery Rates

False Discovery Rates False Discovery Rates John D. Storey Princeton University, Princeton, USA January 2010 Multiple Hypothesis Testing In hypothesis testing, statistical significance is typically based on calculations involving

More information

Basics of Marker Assisted Selection

Basics of Marker Assisted Selection asics of Marker ssisted Selection Chapter 15 asics of Marker ssisted Selection Julius van der Werf, Department of nimal Science rian Kinghorn, Twynam Chair of nimal reeding Technologies University of New

More information

SNPbrowser Software v3.5

SNPbrowser Software v3.5 Product Bulletin SNP Genotyping SNPbrowser Software v3.5 A Free Software Tool for the Knowledge-Driven Selection of SNP Genotyping Assays Easily visualize SNPs integrated with a physical map, linkage disequilibrium

More information

School of Nursing. Presented by Yvette Conley, PhD

School of Nursing. Presented by Yvette Conley, PhD Presented by Yvette Conley, PhD What we will cover during this webcast: Briefly discuss the approaches introduced in the paper: Genome Sequencing Genome Wide Association Studies Epigenomics Gene Expression

More information

CASSI: Genome-Wide Interaction Analysis Software

CASSI: Genome-Wide Interaction Analysis Software CASSI: Genome-Wide Interaction Analysis Software 1 Contents 1 Introduction 3 2 Installation 3 3 Using CASSI 3 3.1 Input Files................................... 4 3.2 Options....................................

More information

Computational Requirements

Computational Requirements Workshop on Establishing a Central Resource of Data from Genome Sequencing Projects Computational Requirements Steve Sherry, Lisa Brooks, Paul Flicek, Anton Nekrutenko, Kenna Shaw, Heidi Sofia High-density

More information

Comparative genomic hybridization Because arrays are more than just a tool for expression analysis

Comparative genomic hybridization Because arrays are more than just a tool for expression analysis Microarray Data Analysis Workshop MedVetNet Workshop, DTU 2008 Comparative genomic hybridization Because arrays are more than just a tool for expression analysis Carsten Friis ( with several slides from

More information

Predicting The Risk Of Rheumatoid Arthritis

Predicting The Risk Of Rheumatoid Arthritis Predicting The Risk Of Rheumatoid Arthritis Modelling Genetic And Environmental Risk Factors Ian Scott Arthritis Research UK Clinical Research Fellow Declaration Of Interests: No Competing Interests Describe

More information

Epigenetic variation and complex disease risk

Epigenetic variation and complex disease risk Epigenetic variation and complex disease risk Caroline Relton Institute of Human Genetics Newcastle University ALSPAC Research Symposium 2 & 3 March 2009 Missing heritability Even when dozens of genes

More information

UKB_WCSGAX: UK Biobank 500K Samples Genotyping Data Generation by the Affymetrix Research Services Laboratory. April, 2015

UKB_WCSGAX: UK Biobank 500K Samples Genotyping Data Generation by the Affymetrix Research Services Laboratory. April, 2015 UKB_WCSGAX: UK Biobank 500K Samples Genotyping Data Generation by the Affymetrix Research Services Laboratory April, 2015 1 Contents Overview... 3 Rare Variants... 3 Observation... 3 Approach... 3 ApoE

More information

Delivering the power of the world s most successful genomics platform

Delivering the power of the world s most successful genomics platform Delivering the power of the world s most successful genomics platform NextCODE Health is bringing the full power of the world s largest and most successful genomics platform to everyday clinical care NextCODE

More information

Deterministic computer simulations were performed to evaluate the effect of maternallytransmitted

Deterministic computer simulations were performed to evaluate the effect of maternallytransmitted Supporting Information 3. Host-parasite simulations Deterministic computer simulations were performed to evaluate the effect of maternallytransmitted parasites on the evolution of sex. Briefly, the simulations

More information

High-Order Interactions in Rheumatoid Arthritis Detected by Bayesian Method using Genome-Wide Association Studies Data

High-Order Interactions in Rheumatoid Arthritis Detected by Bayesian Method using Genome-Wide Association Studies Data American Medical Journal 3 (1): 56-66, 2012 ISSN 1949-0070 2012 Science Publications High-Order Interactions in Rheumatoid Arthritis Detected by Bayesian Method using Genome-Wide Association Studies Data

More information

One essential problem for population genetics is to characterize

One essential problem for population genetics is to characterize Posterior predictive checks to quantify lack-of-fit in admixture models of latent population structure David Mimno a, David M. Blei b, and Barbara E. Engelhardt c,1 a Department of Information Science,

More information

Tutorial on Markov Chain Monte Carlo

Tutorial on Markov Chain Monte Carlo Tutorial on Markov Chain Monte Carlo Kenneth M. Hanson Los Alamos National Laboratory Presented at the 29 th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Technology,

More information

Heritability: Twin Studies. Twin studies are often used to assess genetic effects on variation in a trait

Heritability: Twin Studies. Twin studies are often used to assess genetic effects on variation in a trait TWINS AND GENETICS TWINS Heritability: Twin Studies Twin studies are often used to assess genetic effects on variation in a trait Comparing MZ/DZ twins can give evidence for genetic and/or environmental

More information

Robust procedures for Canadian Test Day Model final report for the Holstein breed

Robust procedures for Canadian Test Day Model final report for the Holstein breed Robust procedures for Canadian Test Day Model final report for the Holstein breed J. Jamrozik, J. Fatehi and L.R. Schaeffer Centre for Genetic Improvement of Livestock, University of Guelph Introduction

More information

European Educational Programme in Epidemiology

European Educational Programme in Epidemiology European Educational Programme in Epidemiology 29 th RESIDENTIAL SUMMER COURSE FLORENCE, ITALY Pre-courses 13 17 JUNE 2016 1/13 European Educational Programme in Epidemiology Pre-Course: Introduction to

More information

Genetic Epidemiology Core Laboratory

Genetic Epidemiology Core Laboratory 2012 CGM Report Genetic Epidemiology Core Laboratory 卓 越 成 員 Remarkable member Wei J. Chen 陳 為 堅 Professor/ / EDUCATION AND POSITION HELD Bachelor of Medicine, College of Medicine, National Taiwan University,

More information

Big Data for Population Health
