Online Supplement to Polygenic Influence on Educational Attainment


Construction of Polygenic Score for Educational Attainment

Genotyping was conducted with the Illumina HumanOmni1-Quad v1 platform using DNA extracted (via Oragene saliva collection) from individuals at Wave 4. Complete details of the QC process that resulted in the data used here are available in McQueen et al. (2014). We removed 18,665 SNPs from the original panel of genetic data (based on a missingness threshold of 5%, computed in a sample of individuals who had information on at least 90% of all available SNPs) to arrive at a genetic database consisting of 940,862 SNPs. In the original QC analysis, 74 genetic samples (some of which may have been duplicates included as part of the QA process) were dropped due to missingness concerns.

SNPs in the Add Health Sibling Pairs genetic database were matched to SNPs with reported results in the educational attainment GWAS (Rietveld et al., 2013).[1] Over 2/3 of the SNPs in the Add Health genetic database were included in the GWAS results (642,627 SNPs). For each of these SNPs, a loading was calculated as the number of education-associated alleles multiplied by the effect size estimated in the original GWAS. SNPs with relatively large p-values will have small effects (and thus be down-weighted in creating the composite), so we do not impose a p-value threshold.

Research has suggested that accounting for linkage disequilibrium (LD) structure can improve the predictive performance of polygenic scores (Vilhjalmsson et al., 2015). To test the sensitivity of the score, we computed a secondary score based on a randomly chosen sample of unrelated EA respondents (N=507). The matched set of SNPs was pruned to account for LD using the clumping procedure (which considers the level of association between the SNP and the phenotype, not simply LD) in the second-generation PLINK software (Chang et al., 2014). Clumping takes place in two steps.[2] The first pass is done in fairly narrow windows (250kb) for all SNPs (the p-value significance thresholds for both index and secondary SNPs are set to 1) with a liberal LD threshold ($r^2 = 0.5$). In a second pass, SNPs remaining after the first prune are again pruned in broader windows (5000kb) but with a more conservative LD threshold ($r^2 = 0.2$). SNPs are then weighted based on effect sizes reported in the GWAS, as discussed above.

Amongst the EA respondents, the clumped score was correlated with the original score at . Again amongst the EA respondents, the correlation with educational attainment was slightly higher for the un-clumped score (r=0.18; see Table 2 of main text) than for the clumped score (r=0.15). Because the clumped score did not improve predictive performance, and because clumping depends upon LD patterns that may vary across samples, we report results for the un-clumped score in the main text.

[1] Results are publicly available,
[2] We used thresholds suggested by Sarah Medland. Please see

Construction of Neighborhood Disadvantage Index

We constructed a measure of neighborhood disadvantage using data from an individual's census block group at the baseline interview.[3] Construction of this variable was performed as follows. In a first step, we identified those contextual variables associated with educational attainment in the full Add Health sample (p<0.05 when regressed on educational attainment). We then conducted factor analysis (using the algorithm of Stacklies et al., 2007) of this subset of contextual variables to generate a neighborhood disadvantage factor score based on the first principal component. Loadings for the first principal component, which explained 19.2% of the variance, can be found in Table S1. Table 1 shows that, amongst both EA and AA respondents, our analytic sample contains individuals from more disadvantaged neighborhoods than those in the full subsample of the AH cohort.

[3] In particular, we used the 29 variables described here:

Calculation of Power for sibling-based analyses

We conducted a power analysis of our ability to detect an effect in the sibling analyses in the following manner. We first restricted our sample to those EA respondents in sibling pairs (N=772). We then residualized educational attainment on birth year so as to simplify the subsequent analyses. For residualized attainment, we estimated $\beta_W$ from model 3 to be a coefficient of . We then standardized this using the SD of the residualized attainment and obtained . Finally, we simulated data accounting for the clustering of both scores and attainment within families. For individual i in family j, we generate data via the following two equations:

$$y_{ij} = b \cdot \mathrm{score}_{ij} + \varepsilon_{ij}$$
$$\mathrm{score}_{ij} = \mu_j + \epsilon_{ij}$$

where the distributions of $\mu_j$, $\varepsilon_{ij}$, and $\epsilon_{ij}$ are based on the empirical data. We generated 500 datasets for different values of b. Results are shown in Figure S2. The observed coefficient of 0.28 corresponds to a power of approximately
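The simulation described above can be sketched roughly as follows. This is a minimal illustration and not the authors' code: it assumes normal distributions with made-up standard deviations in place of the empirical distributions, assumes all 772 respondents form 386 two-sibling families, and uses a regression of sibling differences as a stand-in for the within-family estimator. The function name `simulate_power` and all of its parameters are hypothetical.

```python
import numpy as np

def simulate_power(b, n_families=386, n_sims=500,
                   sd_family=0.7, sd_within=0.7, sd_resid=1.0, seed=0):
    """Share of simulated datasets in which the within-family slope is
    significant at the 5% level. All SDs are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        # score_ij = mu_j + e_ij: a family component shared by both siblings
        mu = rng.normal(0.0, sd_family, size=(n_families, 1))
        score = mu + rng.normal(0.0, sd_within, size=(n_families, 2))
        # y_ij = b * score_ij + eps_ij
        y = b * score + rng.normal(0.0, sd_resid, size=(n_families, 2))
        # Sibling differences remove anything shared within a family
        dx = score[:, 0] - score[:, 1]
        dy = y[:, 0] - y[:, 1]
        # OLS through the origin on the differences
        b_hat = (dx @ dy) / (dx @ dx)
        resid = dy - b_hat * dx
        se = np.sqrt((resid @ resid) / (n_families - 1) / (dx @ dx))
        # Normal approximation to the two-sided 5% critical value
        if abs(b_hat / se) > 1.96:
            rejections += 1
    return rejections / n_sims

if __name__ == "__main__":
    for b in (0.0, 0.1, 0.2, 0.28, 0.4):
        print(f"b={b:.2f}  power={simulate_power(b):.2f}")
```

Evaluating `simulate_power` over a grid of `b` values traces out a power curve of the kind shown in Figure S2; the values it produces depend entirely on the assumed variances and should not be read as the supplement's results.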

References

Chang, C. C., Chow, C. C., Tellier, L. C., Vattikuti, S., Purcell, S. M., & Lee, J. J. (2014). Second-generation PLINK: rising to the challenge of larger and richer datasets. arXiv preprint arXiv:

Domingue, B. W., Belsky, D. W., Harris, K. M., Smolen, A., McQueen, M. B., & Boardman, J. D. (2014). Polygenic risk predicts obesity in both white and black young adults. PLoS ONE, 9(7), e

McQueen, M. B., Boardman, J. D., Domingue, B. W., Smolen, A., Tabor, J., Killeya-Jones, L., ... & Harris, K. M. (2014). The National Longitudinal Study of Adolescent to Adult Health (Add Health) Sibling Pairs Genome-Wide Data. Behavior Genetics,

Rietveld, C. A., Medland, S. E., Derringer, J., Yang, J., Esko, T., Martin, N. W., ... & McMahon, G. (2013). GWAS of 126,559 individuals identifies genetic variants associated with educational attainment. Science, 340(6139),

Stacklies, W., Redestig, H., Scholz, M., Walther, D., & Selbig, J. (2007). pcaMethods: a Bioconductor package providing PCA methods for incomplete data. Bioinformatics, 23(9),

Vilhjalmsson, B., Yang, J., Finucane, H. K., Gusev, A., Lindstrom, S., Ripke, S., ... & Schizophrenia Working Group of the Psychiatric Genomics Consortium. (2015). Modeling Linkage Disequilibrium Increases Accuracy of Polygenic Risk Scores. bioRxiv,

Table S1. Factor loadings for contextual variables (taken from the respondent's census block group) used to create the neighborhood disadvantage score. In a first stage, contextual variables unassociated with educational attainment were screened out. The second block shows variables which were transformed from their original coding. All variables were standardized to have a mean of 0 and an SD of 1.

Variable   Description                                                      Loading
BST90P04   Proportion Hispanic                                              0.11
BST90P06   Median age
BST90P07   Dispersion in age distribution
BST90P10   Proportion population < 5 years old                              0.12
BST90P12   Dispersion in migration status                                   0.02
BST90P15   Median household income
BST90P16   Dispersion in household income                                   0.08
BST90P17   Median family income
BST90P18   Dispersion in family income                                      0.18
BST90P19   Proportion persons with income < poverty level                   0.37
BST90P20   Modal educational attainment
BST90P21   Dispersion in educational attainment
BST90P22   Proportion females in labor force
BST90P23   Unemployment rate                                                0.29
BST90P26   Tenure of occupied housing units
BST90P27   Proportion occupied housing units moved into between 1985 and
BST90P28   Median value of housing units
BST90P29   Dispersion in value of housing units

BST90P02   Modal race = black                                               0.22
BST90P02   Modal race = other                                               0.00
BST90P08   Modal marital status = married
BST90P08   Modal marital status = divorced                                  0.05
BST90P13   Modal household type = other                                     0.20
BST90P13   Modal household type = non-family                                0.09
BST90P24   Modal occupation type = technical
BST90P24   Modal occupation type = service                                  0.15
BST90P24   Modal occupation type = production                               0.04
BST90P24   Modal occupation type = laborers
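Loadings and a factor score of the kind reported in Table S1 can be illustrated with a plain principal component analysis. The sketch below is an assumption-laden stand-in: it uses an SVD on complete, standardized data with NumPy, whereas the supplement cites the pcaMethods algorithm of Stacklies et al. (2007), which also handles incomplete data. The function name `first_component` and the synthetic data are hypothetical.

```python
import numpy as np

def first_component(X):
    """Return (loadings, scores, explained) for the first principal
    component of the column-standardized matrix X (rows = respondents)."""
    # Standardize each variable to mean 0 and SD 1, as in Table S1
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal axes; S holds the singular values
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[0]                      # unit-length weight per variable
    scores = Z @ loadings                 # one factor score per respondent
    explained = S[0] ** 2 / np.sum(S ** 2)  # share of total variance
    return loadings, scores, explained

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic stand-in: 5 variables sharing one common factor plus noise
    common = rng.normal(size=(200, 1))
    X = common @ rng.normal(size=(1, 5)) + rng.normal(size=(200, 5))
    loadings, scores, explained = first_component(X)
    print(np.round(loadings, 2), round(float(explained), 3))
```

The sign of a principal component is arbitrary, so in practice the component would be oriented so that higher scores indicate more disadvantage (for example, a positive loading on the poverty rate).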

Figure S1. Top row: Density plots of polygenic scores for EA and AA respondents. Middle row: Histograms of birth years and educational attainment at Wave 4 as well as a scatterplot (r=-0.08) comparing the two for the genotyped EA subsample of respondents. Bottom row: Histograms of birth years and educational attainment at Wave 4 as well as a scatterplot (r=0.03) comparing the two for the genotyped AA subsample of respondents.
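The polygenic scores plotted in Figure S1 are built from per-SNP loadings, each being the count of education-associated alleles multiplied by the GWAS effect size. The sketch below shows one way such a score might be computed. It assumes that the loadings are summed across matched SNPs and that the result is then standardized; neither step is spelled out in this supplement. The function names and the toy numbers are hypothetical.

```python
import numpy as np

def polygenic_score(allele_counts, effect_sizes):
    """allele_counts: (n_individuals, n_snps) array with 0, 1 or 2 copies of
    the education-associated allele; effect_sizes: (n_snps,) GWAS estimates."""
    allele_counts = np.asarray(allele_counts, dtype=float)
    effect_sizes = np.asarray(effect_sizes, dtype=float)
    # Per-SNP loading: allele count times effect size
    loadings = allele_counts * effect_sizes
    # Assumed aggregation: sum the loadings over all matched SNPs
    return loadings.sum(axis=1)

def standardize(scores):
    """Rescale scores to mean 0 and SD 1 (an assumed convenience step)."""
    scores = np.asarray(scores, dtype=float)
    return (scores - scores.mean()) / scores.std()

if __name__ == "__main__":
    counts = [[0, 1, 2],
              [2, 2, 0],
              [1, 0, 1]]
    betas = [0.01, 0.02, 0.005]   # toy effect sizes, not GWAS values
    raw = polygenic_score(counts, betas)
    print(raw, standardize(raw))
```

Because no p-value threshold is imposed, every matched SNP contributes; SNPs with small estimated effects simply add little to the sum.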

Figure S2. Power curve for the sibling-based analyses. The vertical line is the observed estimate (b=0.28) in a sibling-based analysis of residualized educational attainment.


More information

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( ) Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

More information

GOBII. Genomic & Open-source Breeding Informatics Initiative

GOBII. Genomic & Open-source Breeding Informatics Initiative GOBII Genomic & Open-source Breeding Informatics Initiative My Background BS Animal Science, University of Tennessee MS Animal Breeding, University of Georgia Random regression models for longitudinal

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

DNA Copy Number and Loss of Heterozygosity Analysis Algorithms

DNA Copy Number and Loss of Heterozygosity Analysis Algorithms DNA Copy Number and Loss of Heterozygosity Analysis Algorithms Detection of copy-number variants and chromosomal aberrations in GenomeStudio software. Introduction Illumina has developed several algorithms

More information

Admixture 1.23 Software Manual. David H. Alexander John Novembre Kenneth Lange

Admixture 1.23 Software Manual. David H. Alexander John Novembre Kenneth Lange Admixture 1.23 Software Manual David H. Alexander John Novembre Kenneth Lange August 22, 2013 Contents 1 Quick start 1 2 Reference 3 2.1 How do I choose the correct value for K?................... 3 2.1.1

More information

BASIC STATISTICAL METHODS FOR GENOMIC DATA ANALYSIS

BASIC STATISTICAL METHODS FOR GENOMIC DATA ANALYSIS BASIC STATISTICAL METHODS FOR GENOMIC DATA ANALYSIS SEEMA JAGGI Indian Agricultural Statistics Research Institute Library Avenue, New Delhi-110 012 seema@iasri.res.in Genomics A genome is an organism s

More information

Working Poor Profiles in Rochester, NY

Working Poor Profiles in Rochester, NY Working Poor Profiles in Rochester, NY Profile 1: Individuals at or below the poverty level Prepared for Leonard Brock, Ed.D., RMAPI Director Kara S. Finnigan, Ph.D. Madeleine Feldman July 27, 2015 Data

More information

Using the National Longitudinal Survey

Using the National Longitudinal Survey Who goes to college? Evidence from the NLSY97 s from the National Longitudinal Survey of Youth 997 show that sex, race, and ethnicity are unrelated to the student s decision to complete the first year

More information

SAP HANA Enabling Genome Analysis

SAP HANA Enabling Genome Analysis SAP HANA Enabling Genome Analysis Joanna L. Kelley, PhD Postdoctoral Scholar, Stanford University Enakshi Singh, MSc HANA Product Management, SAP Labs LLC Outline Use cases Genomics review Challenges in

More information

Schools Value-added Information System Technical Manual

Schools Value-added Information System Technical Manual Schools Value-added Information System Technical Manual Quality Assurance & School-based Support Division Education Bureau 2015 Contents Unit 1 Overview... 1 Unit 2 The Concept of VA... 2 Unit 3 Control

More information

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

More information

Who Marries Differently-Aged Spouses? Ability, Education, Occupation, Earnings, and Appearance. Hani Mansour, University of Colorado Denver and IZA

Who Marries Differently-Aged Spouses? Ability, Education, Occupation, Earnings, and Appearance. Hani Mansour, University of Colorado Denver and IZA Who Marries Differently-Aged Spouses? Ability, Education, Occupation, Earnings, and Appearance by Hani Mansour, University of Colorado Denver and IZA Terra McKinnish, University of Colorado Boulder Forthcoming,

More information

Outcome Data, Links to Electronic Medical Records. Dan Roden Vanderbilt University

Outcome Data, Links to Electronic Medical Records. Dan Roden Vanderbilt University Outcome Data, Links to Electronic Medical Records Dan Roden Vanderbilt University Coordinating Center Type II Diabetes Case Algorithm * Abnormal lab= Random glucose > 200mg/dl, Fasting glucose > 125 mg/dl,

More information

Educational Attainment of Veterans: 2000 to 2009

Educational Attainment of Veterans: 2000 to 2009 Educational Attainment of Veterans: to 9 January 11 NCVAS National Center for Veterans Analysis and Statistics Data Source and Methods Data for this analysis come from years of the Current Population Survey

More information

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

More information

Building risk prediction models - with a focus on Genome-Wide Association Studies. Charles Kooperberg

Building risk prediction models - with a focus on Genome-Wide Association Studies. Charles Kooperberg Building risk prediction models - with a focus on Genome-Wide Association Studies Risk prediction models Based on data: (D i, X i1,..., X ip ) i = 1,..., n we like to fit a model P(D = 1 X 1,..., X p )

More information

Iowa School District Profiles. Central City

Iowa School District Profiles. Central City Iowa School District Profiles Overview This profile describes enrollment trends, student performance, income levels, population, and other characteristics of the Central City public school district. The

More information

What It s Worth: Field of Training and Economic Status in 2009

What It s Worth: Field of Training and Economic Status in 2009 What It s Worth: Field of Training and Economic Status in 2009 Household Economic Studies Issued February 2012 P70-129 INTRODUCTION The relationship between educational attainment and economic outcomes

More information

APPENDIX V METHODOLOGY OF 2011 MONTANA HEALTH INSURANCE SURVEYS

APPENDIX V METHODOLOGY OF 2011 MONTANA HEALTH INSURANCE SURVEYS APPENDIX V METHODOLOGY OF 2011 MONTANA HEALTH INSURANCE SURVEYS The purpose of the 2011 Health Insurance Surveys is a: Study of the insured, underinsured and uninsured Montanans Study of Montana s current

More information

2003 National Survey of College Graduates Nonresponse Bias Analysis 1

2003 National Survey of College Graduates Nonresponse Bias Analysis 1 2003 National Survey of College Graduates Nonresponse Bias Analysis 1 Michael White U.S. Census Bureau, Washington, DC 20233 Abstract The National Survey of College Graduates (NSCG) is a longitudinal survey

More information

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Week 1 Week 2 14.0 Students organize and describe distributions of data by using a number of different

More information

CASSI: Genome-Wide Interaction Analysis Software

CASSI: Genome-Wide Interaction Analysis Software CASSI: Genome-Wide Interaction Analysis Software 1 Contents 1 Introduction 3 2 Installation 3 3 Using CASSI 3 3.1 Input Files................................... 4 3.2 Options....................................

More information

Performance Metrics for Graph Mining Tasks

Performance Metrics for Graph Mining Tasks Performance Metrics for Graph Mining Tasks 1 Outline Introduction to Performance Metrics Supervised Learning Performance Metrics Unsupervised Learning Performance Metrics Optimizing Metrics Statistical

More information

The Artificial Prediction Market

The Artificial Prediction Market The Artificial Prediction Market Adrian Barbu Department of Statistics Florida State University Joint work with Nathan Lay, Siemens Corporate Research 1 Overview Main Contributions A mathematical theory

More information

ONLINE APPENDIX FOR PUBLIC HEALTH INSURANCE, LABOR SUPPLY,

ONLINE APPENDIX FOR PUBLIC HEALTH INSURANCE, LABOR SUPPLY, ONLINE APPENDIX FOR PUBLIC HEALTH INSURANCE, LABOR SUPPLY, AND EMPLOYMENT LOCK Craig Garthwaite Tal Gross Matthew J. Notowidigdo December 2013 A1. Monte Carlo Simulations This section describes a set of

More information

Broome County Community Health Assessment 2013-2017 1 APPENDIX A

Broome County Community Health Assessment 2013-2017 1 APPENDIX A Community Health Assessment 2013-2017 1 APPENDIX A 2 Community Health Assessment 2013-2017 Table of Contents: Appendix A A Community Report Card will be developed based on identified strengths and opportunities

More information

Employment-Based Health Insurance: 2010

Employment-Based Health Insurance: 2010 Employment-Based Health Insurance: 2010 Household Economic Studies Hubert Janicki Issued February 2013 P70-134 INTRODUCTION More than half of the U.S. population (55.1 percent) had employment-based health

More information

Local outlier detection in data forensics: data mining approach to flag unusual schools

Local outlier detection in data forensics: data mining approach to flag unusual schools Local outlier detection in data forensics: data mining approach to flag unusual schools Mayuko Simon Data Recognition Corporation Paper presented at the 2012 Conference on Statistical Detection of Potential

More information

Experiment on Web based recruitment of Cell Phone Only respondents

Experiment on Web based recruitment of Cell Phone Only respondents Experiment on Web based recruitment of Cell Phone Only respondents 2008 AAPOR Annual Conference, New Orleans By: Chintan Turakhia, Abt SRBI Inc. Mark A. Schulman, Abt SRBI Inc. Seth Brohinsky, Abt SRBI

More information

Spatial Analysis with GeoDa Spatial Autocorrelation

Spatial Analysis with GeoDa Spatial Autocorrelation Spatial Analysis with GeoDa Spatial Autocorrelation 1. Background GeoDa is a trademark of Luc Anselin. GeoDa is a collection of software tools designed for exploratory spatial data analysis (ESDA) based

More information

CeGE-Discussion Paper

CeGE-Discussion Paper CeGE-Discussion Paper 38 Philipp Bauer Regina T. Riphahn Heterogenity in the Intergenerational Transmission of Educational Attainment: Evidence from Switzerland on Natives and Second Generation Immigrants

More information

Predicting The Risk Of Rheumatoid Arthritis

Predicting The Risk Of Rheumatoid Arthritis Predicting The Risk Of Rheumatoid Arthritis Modelling Genetic And Environmental Risk Factors Ian Scott Arthritis Research UK Clinical Research Fellow Declaration Of Interests: No Competing Interests Describe

More information

13. Linking Core Wave, Topical Module, and Longitudinal Research Files

13. Linking Core Wave, Topical Module, and Longitudinal Research Files 13. Linking Core Wave, Topical Module, and Longitudinal Research Files In many situations, a single Survey of Income and Program Participation (SIPP) data file will not contain the information needed for

More information