Imputation of missing data under missing not at random assumption & sensitivity analysis
|
|
- Kimberly Reeves
- 7 years ago
- Views:
Transcription
1 Imputation of missing data under missing not at random assumption & sensitivity analysis S. Jolani Department of Methodology and Statistics, Utrecht University, the Netherlands Advanced Multiple Imputation, Utrecht, May 2013
2 Outline
3 Why missing not at random (MNAR)? There might be a reason to believe that responders differ from non-responders, even after accounting for the observed information Some examples: - Income - some people may not reveal their salaries - Blood pressure - the blood pressure is measured less frequently for patients with lower blood pressures - Depression - some patients might dropout because they believe the treatment is not effective
4 Notation Y : incomplete variable R: response (R = 1 if Y is observed) X: fully observed covariate Y obs and Y mis : the observed and missing parts of Y
5 A general strategy Y and R must be modeled jointly (Rubin, 1976) under an MNAR assumption so P(Y, R)
6 Why the classical MI does not work? Imputation under MAR P(Y X, R = 0) = P(Y X, R = 1)
7 Why the classical MI does not work? Imputation under MAR P(Y X, R = 0) = P(Y X, R = 1) Imputation under MNAR P(Y X, R = 0) P(Y X, R = 1)
8 Models for Two general approaches (there are some more): 1 (Heckman, 1976) 2 - (Rubin, 1977)
9 Selection model P(Y, R; ξ, ω) = P(Y ; ξ)p(r Y ; ω), where the parameters ξ and ω are a priori independent. P(Y ; ξ) distribution for the full data P(R Y ; ω) response mechanism (selection function)
10 Selection model Imputation model under MNAR where P(Y mis X, Y obs, R) = P(Y mis X, Y obs, R) P(Y mis X, Y obs )P(R X, Y ) P(Ymis X, Y obs )P(R X, Y ) Y mis
11 Selection model Imputation model under MNAR P(Y mis X, Y obs, R) A simple but possibly inefficient approach (Rubin, 1987): 1 Draw a candidate Y i P(Y i X i ; ξ = ξ ) 2 Calculate p i = P(R i = 1 X i, Y i = Y i ; ω) 3 Draw R i Ber(1, p i ) 4 Impute Y i if R i = 0 otherwise return to (1)
12 model P(Y, R; ψ, θ) = P(R; ψ)p(y R; θ), where the parameters ψ and θ are a priori independent. P(Y X, R = 1; θ 1 ) distribution for the observed data P(Y X, R = 0; θ 0 ) distribution for the missing data P(R; ψ) marginal response probability
13 model The general procedure (Rubin, 1977): 1 Draw θ1 from its posterior distribution using P(Y X, R = 1; θ 1 ) 2 Specify the posterior P(θ 0 θ 1 ) a priori (e.g., θ 0 = θ 1 + k where k is a fixed constant) 3 Draw θ 0 P(θ 0 θ 1 ) 4 Impute Y mis from P(Y X, R = 0; θ 0 )
14 An example Suppose Y is an incomplete variable (continuous) Y obs N(µ 1, σ 2 1 ), Y mis N(µ 0, σ 2 0 ) where θ 1 = (µ 1, σ1 2) and θ 0 = (µ 0, σ0 2 ). Now, if we define µ 0 = µ 1 + k 1, σ 2 0 = k 2σ 2 1 where k 1 and k 2 are fixed and known values (sensitivity parameters).
15 An example Suppose Y is an incomplete variable (continuous) Y obs N(µ 1, σ 2 1 ), Y mis N(µ 0, σ 2 0 ) where θ 1 = (µ 1, σ1 2) and θ 0 = (µ 0, σ0 2 ). Now, if we define µ 0 = µ 1 + k 1, σ 2 0 = k 2σ 2 1 where k 1 and k 2 are fixed and known values (sensitivity parameters). Sensitivity analysis: repeat the analysis for different choices of k 1 and k 2
16 Application: cohort study N=1236, 85+ on Dec. 1, 1986 N=956 were visited ( ) Blood pressure (BP) is missing for 121 patients * Do anti-hypertensive drugs shorten life in the oldest old? * Scientific interest: Mortality risk as function of BP and age
17 Imputation techniques > Simple non-ignorable Survival Survival probability probability by response by response group group BP measured K M Survival probability BP missing Years since intake Source: van Buuren et al. (1999) Research Master MS
18 From the data we see Those with no BP measured die earlier Those that die early and that have no hypertension history have fewer BP measurements Thus, s of BP under MAR could be too high values. We need to lower the imputed values of BP, and study the influence on the outcome
19 A simple model to shift s Y : BP X: age, hypertension, haemoglobin, and etc Specify P(Y X, R) Combined formulation: Model 1 Y = Xβ + ɛ R = 1 2 Y = Xβ + δ + ɛ R = 0 Y = Xβ + (1 R)δ + ɛ δ cannot be estimated (sensitivity parameter)
20 Imputation techniques > Simple non-ignorable Numerical example Both Y Selection model Mixture model Source: van Buuren et al. (1999) Research Master MS
21 How to impute under MNAR in MICE? > delta <- c(0,-5,-10,-15,-20) > post <- mice(leiden85,maxit=0)$post > imp.all <- vector("list", length(delta)) > for (i in 1:length(delta)) { + d <- delta[i] + cmd <- paste("imp[[j]][,i] <- imp[[j]][,i] +",d) + post["bp"] <- cmd + imp <- mice(leidan85, post=post, seed=i*22, print=false) + imp.all[[i]] <- imp + }
22 140 0)15 0)15 0)15 0) )10 0)30 0)31 0) )08 0)15 0)16 0) )06 0)10 0)11 0) )04 0)05 0)05 0) )02 0)03 0)03 0) )00 0)02 0)02 0)00 Mean (mmhg) )6 138)6 : Sensitivity analysis Table V. Mean and standard deviation of the observed and imputed blood pressures. The statistics of imputed BP are pooled over m"5 multiple s N δ SBP DBP Mean SD Mean SD Observed BP )9 25)7 82)8 13)1 Imputed BP )1 26)2 81)5 14)0 121!5 142)3 24)6 78)4 13)7 121!10 135)9 24)7 78)2 12)8 121!15 128)6 25)0 75)3 12)9 121!20 122)3 25)2 74)0 12)1 Source: van Buuren et al. (1999) to model A of the introduction. It was expected that multiple would raise the estimates in comparison with the analysis based on the complete cases, but the results do confirm this. Only slight differences in mortality exist, even among non-response w very different δ s. It appears that, for this application, risk estimates are insensitive to the mis data and the various non-response used to deal with them.
23 Combined formulation: Y = Xβ + (1 R)δ + ɛ if ɛ N(0, σ 2 ), then Y obs N(Xβ, σ 2 ) (1) Y mis N(Xβ + δ, σ 2 ) (2) P(R = 1)P(Y X, R = 1) logit{p(r = 1 X, Y )} = log[ P(R = 0)P(Y X, R = 0) ] where ψ 1 = δ/σ 2 so that δ = ψ 1 σ 2. = ψ 0 + ψ 1 Y + ψ 2 X, (3)
24 Assume P(R = 1 X, Y ) is known (unrealistic!) Y R 1 - P(R = 1 X, Y) R
25 Gr Y R R 1 E(Y X,R, R 1 ) µ µ µ µ It can be shown that µ 10 = µ 01 µ 11 µ 10 µ 01 µ 00 The idea? Impute group 3 from group 2 Impute group 4 from groups 2 and 1
26 But in reality P(R = 1 X, Y ) is unknown Y R R X Figure : The schematic representation of the data
27 But in reality P(R = 1 X, Y ) is unknown Y R R X Fully Conditional Specification: Y P(Y X, R, R 1 ) R 1 P(R 1 X, Y ) Figure : The schematic representation of the data
28 1 Impute initially missing values (Y ) 2 Draw Ṙ from a Bernouli process (Ṙ Ber(1, π)) where π = P(R = 1 X, Y ) 3 Using groups 1 and 2, estimate β and δ from E(Y X, R = 1, R 1 = r 1 ) = Xβ + δ(r 1 1), r 1 = 0, 1 4 Draw β from its posterior distribution for a given prior for β 5 Predict the missing data for group 3 using X β ˆδ 6 Predict the missing data for group 4 using X β 2ˆδ 7 Impute the missing data by adding an appropriate amount of noise to the predicted values 8 Return to Step 2
29 How to implement the drawn method in MICE? > mice(data, meth = "ri") The RI function: > mice.impute.ri(y, ry, x, ri.maxit = 10,...) Note: 1 only for continuous variables (the current version) 2 the same covariates for both (the current version)
30 Summary Participants: 956 Observed BP: 835 Missing BP: 121 Imputation model BP sex, age, hypertension, haemoglobin, etc. Missingness mechanism logit{p(r = 1 Y, X)} BP, type of residence, ADL, previous hypertension, etc. - Number of iterations: 10 - Number of multiple s: 50
31 Table : Mean and standard error (SE) for the systolic blood pressure using CC, MI and RI Total Imputed Method Mean SE Mean SE CC MI RI
32 An interesting result: Research Master MS Using the RI method, we are able to estimate ˆδ = = This value is very similar to the amount of the adjustment in van Buuren et al. (1999) based on a numerical example.
33 Effect of response mechanism on BP density density Systolic BP (mmhg) BP Leiden85+ (the drawn method) Numerical example (van Buuren et al. 1999)
34 A summary of the under MNAR 1 All methods for the incomplete data under MNAR make unverified assumptions 2 Selection model: the distribution of the full data 3 : the distribution of the missing data 4 : the distribution of the selection function
35 General advice on MNAR 1 Why is the ignorability assumption is suspected? (why MNAR assumption) 2 Include as much data as possible in the model 3 Limit the possible non-ignorable alternatives
Handling missing data in Stata a whirlwind tour
Handling missing data in Stata a whirlwind tour 2012 Italian Stata Users Group Meeting Jonathan Bartlett www.missingdata.org.uk 20th September 2012 1/55 Outline The problem of missing data and a principled
More informationA Basic Introduction to Missing Data
John Fox Sociology 740 Winter 2014 Outline Why Missing Data Arise Why Missing Data Arise Global or unit non-response. In a survey, certain respondents may be unreachable or may refuse to participate. Item
More informationBayesian Approaches to Handling Missing Data
Bayesian Approaches to Handling Missing Data Nicky Best and Alexina Mason BIAS Short Course, Jan 30, 2012 Lecture 1. Introduction to Missing Data Bayesian Missing Data Course (Lecture 1) Introduction to
More informationIntroduction to mixed model and missing data issues in longitudinal studies
Introduction to mixed model and missing data issues in longitudinal studies Hélène Jacqmin-Gadda INSERM, U897, Bordeaux, France Inserm workshop, St Raphael Outline of the talk I Introduction Mixed models
More informationReject Inference in Credit Scoring. Jie-Men Mok
Reject Inference in Credit Scoring Jie-Men Mok BMI paper January 2009 ii Preface In the Master programme of Business Mathematics and Informatics (BMI), it is required to perform research on a business
More informationProblem of Missing Data
VASA Mission of VA Statisticians Association (VASA) Promote & disseminate statistical methodological research relevant to VA studies; Facilitate communication & collaboration among VA-affiliated statisticians;
More informationPATTERN MIXTURE MODELS FOR MISSING DATA. Mike Kenward. London School of Hygiene and Tropical Medicine. Talk at the University of Turku,
PATTERN MIXTURE MODELS FOR MISSING DATA Mike Kenward London School of Hygiene and Tropical Medicine Talk at the University of Turku, April 10th 2012 1 / 90 CONTENTS 1 Examples 2 Modelling Incomplete Data
More information1 Sufficient statistics
1 Sufficient statistics A statistic is a function T = rx 1, X 2,, X n of the random sample X 1, X 2,, X n. Examples are X n = 1 n s 2 = = X i, 1 n 1 the sample mean X i X n 2, the sample variance T 1 =
More informationSensitivity Analysis in Multiple Imputation for Missing Data
Paper SAS270-2014 Sensitivity Analysis in Multiple Imputation for Missing Data Yang Yuan, SAS Institute Inc. ABSTRACT Multiple imputation, a popular strategy for dealing with missing values, usually assumes
More informationBayesian Statistics in One Hour. Patrick Lam
Bayesian Statistics in One Hour Patrick Lam Outline Introduction Bayesian Models Applications Missing Data Hierarchical Models Outline Introduction Bayesian Models Applications Missing Data Hierarchical
More informationReview of the Methods for Handling Missing Data in. Longitudinal Data Analysis
Int. Journal of Math. Analysis, Vol. 5, 2011, no. 1, 1-13 Review of the Methods for Handling Missing Data in Longitudinal Data Analysis Michikazu Nakai and Weiming Ke Department of Mathematics and Statistics
More informationMissing data and net survival analysis Bernard Rachet
Workshop on Flexible Models for Longitudinal and Survival Data with Applications in Biostatistics Warwick, 27-29 July 2015 Missing data and net survival analysis Bernard Rachet General context Population-based,
More informationHandling missing data in large data sets. Agostino Di Ciaccio Dept. of Statistics University of Rome La Sapienza
Handling missing data in large data sets Agostino Di Ciaccio Dept. of Statistics University of Rome La Sapienza The problem Often in official statistics we have large data sets with many variables and
More informationParametric fractional imputation for missing data analysis
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 Biometrika (????,??,?, pp. 1 14 C???? Biometrika Trust Printed in
More informationNote on the EM Algorithm in Linear Regression Model
International Mathematical Forum 4 2009 no. 38 1883-1889 Note on the M Algorithm in Linear Regression Model Ji-Xia Wang and Yu Miao College of Mathematics and Information Science Henan Normal University
More informationItem Imputation Without Specifying Scale Structure
Original Article Item Imputation Without Specifying Scale Structure Stef van Buuren TNO Quality of Life, Leiden, The Netherlands University of Utrecht, The Netherlands Abstract. Imputation of incomplete
More information1 Prior Probability and Posterior Probability
Math 541: Statistical Theory II Bayesian Approach to Parameter Estimation Lecturer: Songfeng Zheng 1 Prior Probability and Posterior Probability Consider now a problem of statistical inference in which
More informationMissing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13
Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13 Overview Missingness and impact on statistical analysis Missing data assumptions/mechanisms Conventional
More informationarxiv:1301.2490v1 [stat.ap] 11 Jan 2013
The Annals of Applied Statistics 2012, Vol. 6, No. 4, 1814 1837 DOI: 10.1214/12-AOAS555 c Institute of Mathematical Statistics, 2012 arxiv:1301.2490v1 [stat.ap] 11 Jan 2013 ADDRESSING MISSING DATA MECHANISM
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby
More informationAnalyzing Structural Equation Models With Missing Data
Analyzing Structural Equation Models With Missing Data Craig Enders* Arizona State University cenders@asu.edu based on Enders, C. K. (006). Analyzing structural equation models with missing data. In G.
More informationMissing Data in Survival Analysis and Results from the MESS Trial
Missing Data in Survival Analysis and Results from the MESS Trial J. K. Rogers J. L. Hutton K. Hemming Department of Statistics University of Warwick Research Students Conference, 2008 Outline Background
More informationHow to choose an analysis to handle missing data in longitudinal observational studies
How to choose an analysis to handle missing data in longitudinal observational studies ICH, 25 th February 2015 Ian White MRC Biostatistics Unit, Cambridge, UK Plan Why are missing data a problem? Methods:
More informationMultiple Imputation for Missing Data: A Cautionary Tale
Multiple Imputation for Missing Data: A Cautionary Tale Paul D. Allison University of Pennsylvania Address correspondence to Paul D. Allison, Sociology Department, University of Pennsylvania, 3718 Locust
More informationDealing with Missing Data
Dealing with Missing Data Roch Giorgi email: roch.giorgi@univ-amu.fr UMR 912 SESSTIM, Aix Marseille Université / INSERM / IRD, Marseille, France BioSTIC, APHM, Hôpital Timone, Marseille, France January
More informationStatistical modelling with missing data using multiple imputation. Session 4: Sensitivity Analysis after Multiple Imputation
Statistical modelling with missing data using multiple imputation Session 4: Sensitivity Analysis after Multiple Imputation James Carpenter London School of Hygiene & Tropical Medicine Email: james.carpenter@lshtm.ac.uk
More informationBasics of Statistical Machine Learning
CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar
More information2 Precision-based sample size calculations
Statistics: An introduction to sample size calculations Rosie Cornish. 2006. 1 Introduction One crucial aspect of study design is deciding how big your sample should be. If you increase your sample size
More informationFinal Mathematics 5010, Section 1, Fall 2004 Instructor: D.A. Levin
Final Mathematics 51, Section 1, Fall 24 Instructor: D.A. Levin Name YOU MUST SHOW YOUR WORK TO RECEIVE CREDIT. A CORRECT ANSWER WITHOUT SHOWING YOUR REASONING WILL NOT RECEIVE CREDIT. Problem Points Possible
More information2. Making example missing-value datasets: MCAR, MAR, and MNAR
Lecture 20 1. Types of missing values 2. Making example missing-value datasets: MCAR, MAR, and MNAR 3. Common methods for missing data 4. Compare results on example MCAR, MAR, MNAR data 1 Missing Data
More informationα α λ α = = λ λ α ψ = = α α α λ λ ψ α = + β = > θ θ β > β β θ θ θ β θ β γ θ β = γ θ > β > γ θ β γ = θ β = θ β = θ β = β θ = β β θ = = = β β θ = + α α α α α = = λ λ λ λ λ λ λ = λ λ α α α α λ ψ + α =
More informationOn Treatment of the Multivariate Missing Data
On Treatment of the Multivariate Missing Data Peter J. Foster, Ahmed M. Mami & Ali M. Bala First version: 3 September 009 Research Report No. 3, 009, Probability and Statistics Group School of Mathematics,
More informationElectronic Theses and Dissertations UC Riverside
Electronic Theses and Dissertations UC Riverside Peer Reviewed Title: Bayesian and Non-parametric Approaches to Missing Data Analysis Author: Yu, Yao Acceptance Date: 01 Series: UC Riverside Electronic
More informationMath 151. Rumbos Spring 2014 1. Solutions to Assignment #22
Math 151. Rumbos Spring 2014 1 Solutions to Assignment #22 1. An experiment consists of rolling a die 81 times and computing the average of the numbers on the top face of the die. Estimate the probability
More informationMissing Data. A Typology Of Missing Data. Missing At Random Or Not Missing At Random
[Leeuw, Edith D. de, and Joop Hox. (2008). Missing Data. Encyclopedia of Survey Research Methods. Retrieved from http://sage-ereference.com/survey/article_n298.html] Missing Data An important indicator
More informationA General Approach to Variance Estimation under Imputation for Missing Survey Data
A General Approach to Variance Estimation under Imputation for Missing Survey Data J.N.K. Rao Carleton University Ottawa, Canada 1 2 1 Joint work with J.K. Kim at Iowa State University. 2 Workshop on Survey
More informationChallenges in Longitudinal Data Analysis: Baseline Adjustment, Missing Data, and Drop-out
Challenges in Longitudinal Data Analysis: Baseline Adjustment, Missing Data, and Drop-out Sandra Taylor, Ph.D. IDDRC BBRD Core 23 April 2014 Objectives Baseline Adjustment Introduce approaches Guidance
More informationA Latent Variable Approach to Validate Credit Rating Systems using R
A Latent Variable Approach to Validate Credit Rating Systems using R Chicago, April 24, 2009 Bettina Grün a, Paul Hofmarcher a, Kurt Hornik a, Christoph Leitner a, Stefan Pichler a a WU Wien Grün/Hofmarcher/Hornik/Leitner/Pichler
More informationRe-analysis using Inverse Probability Weighting and Multiple Imputation of Data from the Southampton Women s Survey
Re-analysis using Inverse Probability Weighting and Multiple Imputation of Data from the Southampton Women s Survey MRC Biostatistics Unit Institute of Public Health Forvie Site Robinson Way Cambridge
More informationConfidence Intervals for the Difference Between Two Means
Chapter 47 Confidence Intervals for the Difference Between Two Means Introduction This procedure calculates the sample size necessary to achieve a specified distance from the difference in sample means
More informationChapter 7 - Practice Problems 1
Chapter 7 - Practice Problems 1 SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Provide an appropriate response. 1) Define a point estimate. What is the
More informationR 2 -type Curves for Dynamic Predictions from Joint Longitudinal-Survival Models
Faculty of Health Sciences R 2 -type Curves for Dynamic Predictions from Joint Longitudinal-Survival Models Inference & application to prediction of kidney graft failure Paul Blanche joint work with M-C.
More informationMarkov Chain Monte Carlo Simulation Made Simple
Markov Chain Monte Carlo Simulation Made Simple Alastair Smith Department of Politics New York University April2,2003 1 Markov Chain Monte Carlo (MCMC) simualtion is a powerful technique to perform numerical
More informationMISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group
MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could
More informationNormal distribution. ) 2 /2σ. 2π σ
Normal distribution The normal distribution is the most widely known and used of all distributions. Because the normal distribution approximates many natural phenomena so well, it has developed into a
More informationLogistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression
Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max
More informationHYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...
HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men
More informationMissing Data & How to Deal: An overview of missing data. Melissa Humphries Population Research Center
Missing Data & How to Deal: An overview of missing data Melissa Humphries Population Research Center Goals Discuss ways to evaluate and understand missing data Discuss common missing data methods Know
More informationHYPOTHESIS TESTING: POWER OF THE TEST
HYPOTHESIS TESTING: POWER OF THE TEST The first 6 steps of the 9-step test of hypothesis are called "the test". These steps are not dependent on the observed data values. When planning a research project,
More informationADVANCE: a factorial randomised trial of blood pressure lowering and intensive glucose control in 11,140 patients with type 2 diabetes
ADVANCE: a factorial randomised trial of blood pressure lowering and intensive glucose control in 11,140 patients with type 2 diabetes Effects of a fixed combination of the ACE inhibitor, perindopril,
More information4. Continuous Random Variables, the Pareto and Normal Distributions
4. Continuous Random Variables, the Pareto and Normal Distributions A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random
More informationJoint Exam 1/P Sample Exam 1
Joint Exam 1/P Sample Exam 1 Take this practice exam under strict exam conditions: Set a timer for 3 hours; Do not stop the timer for restroom breaks; Do not look at your notes. If you believe a question
More information1.5 Oneway Analysis of Variance
Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments
More informationCredit Risk Models: An Overview
Credit Risk Models: An Overview Paul Embrechts, Rüdiger Frey, Alexander McNeil ETH Zürich c 2003 (Embrechts, Frey, McNeil) A. Multivariate Models for Portfolio Credit Risk 1. Modelling Dependent Defaults:
More informationproblem arises when only a non-random sample is available differs from censored regression model in that x i is also unobserved
4 Data Issues 4.1 Truncated Regression population model y i = x i β + ε i, ε i N(0, σ 2 ) given a random sample, {y i, x i } N i=1, then OLS is consistent and efficient problem arises when only a non-random
More informationApplied Missing Data Analysis in the Health Sciences. Statistics in Practice
Brochure More information from http://www.researchandmarkets.com/reports/2741464/ Applied Missing Data Analysis in the Health Sciences. Statistics in Practice Description: A modern and practical guide
More informationMissing Data in Longitudinal Studies: To Impute or not to Impute? Robert Platt, PhD McGill University
Missing Data in Longitudinal Studies: To Impute or not to Impute? Robert Platt, PhD McGill University 1 Outline Missing data definitions Longitudinal data specific issues Methods Simple methods Multiple
More informationHandling attrition and non-response in longitudinal data
Longitudinal and Life Course Studies 2009 Volume 1 Issue 1 Pp 63-72 Handling attrition and non-response in longitudinal data Harvey Goldstein University of Bristol Correspondence. Professor H. Goldstein
More informationEXPANDING THE EVIDENCE BASE IN OUTCOMES RESEARCH: USING LINKED ELECTRONIC MEDICAL RECORDS (EMR) AND CLAIMS DATA
EXPANDING THE EVIDENCE BASE IN OUTCOMES RESEARCH: USING LINKED ELECTRONIC MEDICAL RECORDS (EMR) AND CLAIMS DATA A CASE STUDY EXAMINING RISK FACTORS AND COSTS OF UNCONTROLLED HYPERTENSION ISPOR 2013 WORKSHOP
More informationDepartment of Mathematics, Indian Institute of Technology, Kharagpur Assignment 2-3, Probability and Statistics, March 2015. Due:-March 25, 2015.
Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment -3, Probability and Statistics, March 05. Due:-March 5, 05.. Show that the function 0 for x < x+ F (x) = 4 for x < for x
More informationDr James Roger. GlaxoSmithKline & London School of Hygiene and Tropical Medicine.
American Statistical Association Biopharm Section Monthly Webinar Series: Sensitivity analyses that address missing data issues in Longitudinal studies for regulatory submission. Dr James Roger. GlaxoSmithKline
More informationHow To Compare Birds To Other Birds
STT 430/630/ES 760 Lecture Notes: Chapter 7: Two-Sample Inference 1 February 27, 2009 Chapter 7: Two Sample Inference Chapter 6 introduced hypothesis testing in the one-sample setting: one sample is obtained
More informationOverview. Longitudinal Data Variation and Correlation Different Approaches. Linear Mixed Models Generalized Linear Mixed Models
Overview 1 Introduction Longitudinal Data Variation and Correlation Different Approaches 2 Mixed Models Linear Mixed Models Generalized Linear Mixed Models 3 Marginal Models Linear Models Generalized Linear
More informationUNTIED WORST-RANK SCORE ANALYSIS
Paper PO16 UNTIED WORST-RANK SCORE ANALYSIS William F. McCarthy, Maryland Medical Research Institute, Baltime, MD Nan Guo, Maryland Medical Research Institute, Baltime, MD ABSTRACT When the non-fatal outcome
More informationLife cycle behavior under uncertainty
Life cycle behavior under uncertainty Essays on savings, mortgages and health PhD thesis to obtain the degree of PhD at the University of Groningen on the authority of the Rector Magnificus Prof. E. Sterken
More informationNon-Inferiority Tests for Two Means using Differences
Chapter 450 on-inferiority Tests for Two Means using Differences Introduction This procedure computes power and sample size for non-inferiority tests in two-sample designs in which the outcome is a continuous
More informationAPPLIED MISSING DATA ANALYSIS
APPLIED MISSING DATA ANALYSIS Craig K. Enders Series Editor's Note by Todd D. little THE GUILFORD PRESS New York London Contents 1 An Introduction to Missing Data 1 1.1 Introduction 1 1.2 Chapter Overview
More informationManual for SOA Exam MLC.
Chapter 5. Life annuities. Extract from: Arcones Manual for the SOA Exam MLC. Spring 2010 Edition. available at http://www.actexmadriver.com/ 1/114 Whole life annuity A whole life annuity is a series of
More informationWeb-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni
1 Web-based Supplementary Materials for Bayesian Effect Estimation Accounting for Adjustment Uncertainty by Chi Wang, Giovanni Parmigiani, and Francesca Dominici In Web Appendix A, we provide detailed
More informationErrata and updates for ASM Exam C/Exam 4 Manual (Sixteenth Edition) sorted by page
Errata for ASM Exam C/4 Study Manual (Sixteenth Edition) Sorted by Page 1 Errata and updates for ASM Exam C/Exam 4 Manual (Sixteenth Edition) sorted by page Practice exam 1:9, 1:22, 1:29, 9:5, and 10:8
More informationStatistical Machine Learning from Data
Samy Bengio Statistical Machine Learning from Data 1 Statistical Machine Learning from Data Gaussian Mixture Models Samy Bengio IDIAP Research Institute, Martigny, Switzerland, and Ecole Polytechnique
More informationDongfeng Li. Autumn 2010
Autumn 2010 Chapter Contents Some statistics background; ; Comparing means and proportions; variance. Students should master the basic concepts, descriptive statistics measures and graphs, basic hypothesis
More informationLatent Class Regression Part II
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationNormal Distribution. Definition A continuous random variable has a normal distribution if its probability density. f ( y ) = 1.
Normal Distribution Definition A continuous random variable has a normal distribution if its probability density e -(y -µ Y ) 2 2 / 2 σ function can be written as for < y < as Y f ( y ) = 1 σ Y 2 π Notation:
More informationWHERE DOES THE 10% CONDITION COME FROM?
1 WHERE DOES THE 10% CONDITION COME FROM? The text has mentioned The 10% Condition (at least) twice so far: p. 407 Bernoulli trials must be independent. If that assumption is violated, it is still okay
More informationChapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:
Chapter 7 Notes - Inference for Single Samples You know already for a large sample, you can invoke the CLT so: X N(µ, ). Also for a large sample, you can replace an unknown σ by s. You know how to do a
More informationA Mixed Model Approach for Intent-to-Treat Analysis in Longitudinal Clinical Trials with Missing Values
Methods Report A Mixed Model Approach for Intent-to-Treat Analysis in Longitudinal Clinical Trials with Missing Values Hrishikesh Chakraborty and Hong Gu March 9 RTI Press About the Author Hrishikesh Chakraborty,
More informationExact Confidence Intervals
Math 541: Statistical Theory II Instructor: Songfeng Zheng Exact Confidence Intervals Confidence intervals provide an alternative to using an estimator ˆθ when we wish to estimate an unknown parameter
More informationThe Probit Link Function in Generalized Linear Models for Data Mining Applications
Journal of Modern Applied Statistical Methods Copyright 2013 JMASM, Inc. May 2013, Vol. 12, No. 1, 164-169 1538 9472/13/$95.00 The Probit Link Function in Generalized Linear Models for Data Mining Applications
More informationLeast Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David
More informationPrinciple of Data Reduction
Chapter 6 Principle of Data Reduction 6.1 Introduction An experimenter uses the information in a sample X 1,..., X n to make inferences about an unknown parameter θ. If the sample size n is large, then
More informationBayesian Model Averaging Continual Reassessment Method BMA-CRM. Guosheng Yin and Ying Yuan. August 26, 2009
Bayesian Model Averaging Continual Reassessment Method BMA-CRM Guosheng Yin and Ying Yuan August 26, 2009 This document provides the statistical background for the Bayesian model averaging continual reassessment
More informationUniversity of Maryland Fraternity & Sorority Life Spring 2015 Academic Report
University of Maryland Fraternity & Sorority Life Academic Report Academic and Population Statistics Population: # of Students: # of New Members: Avg. Size: Avg. GPA: % of the Undergraduate Population
More informationOptimizing Prediction with Hierarchical Models: Bayesian Clustering
1 Technical Report 06/93, (August 30, 1993). Presidencia de la Generalidad. Caballeros 9, 46001 - Valencia, Spain. Tel. (34)(6) 386.6138, Fax (34)(6) 386.3626, e-mail: bernardo@mac.uv.es Optimizing Prediction
More informationOutline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares
Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation
More informationEstimation and attribution of changes in extreme weather and climate events
IPCC workshop on extreme weather and climate events, 11-13 June 2002, Beijing. Estimation and attribution of changes in extreme weather and climate events Dr. David B. Stephenson Department of Meteorology
More informationImplementation of Pattern-Mixture Models Using Standard SAS/STAT Procedures
PharmaSUG2011 - Paper SP04 Implementation of Pattern-Mixture Models Using Standard SAS/STAT Procedures Bohdana Ratitch, Quintiles, Montreal, Quebec, Canada Michael O Kelly, Quintiles, Dublin, Ireland ABSTRACT
More informationGaussian Conjugate Prior Cheat Sheet
Gaussian Conjugate Prior Cheat Sheet Tom SF Haines 1 Purpose This document contains notes on how to handle the multivariate Gaussian 1 in a Bayesian setting. It focuses on the conjugate prior, its Bayesian
More informationValidation of Software for Bayesian Models using Posterior Quantiles. Samantha R. Cook Andrew Gelman Donald B. Rubin DRAFT
Validation of Software for Bayesian Models using Posterior Quantiles Samantha R. Cook Andrew Gelman Donald B. Rubin DRAFT Abstract We present a simulation-based method designed to establish that software
More informationDealing with Missing Data
Res. Lett. Inf. Math. Sci. (2002) 3, 153-160 Available online at http://www.massey.ac.nz/~wwiims/research/letters/ Dealing with Missing Data Judi Scheffer I.I.M.S. Quad A, Massey University, P.O. Box 102904
More informationDecompose Error Rate into components, some of which can be measured on unlabeled data
Bias-Variance Theory Decompose Error Rate into components, some of which can be measured on unlabeled data Bias-Variance Decomposition for Regression Bias-Variance Decomposition for Classification Bias-Variance
More informationCS 688 Pattern Recognition Lecture 4. Linear Models for Classification
CS 688 Pattern Recognition Lecture 4 Linear Models for Classification Probabilistic generative models Probabilistic discriminative models 1 Generative Approach ( x ) p C k p( C k ) Ck p ( ) ( x Ck ) p(
More informationRecent Developments of Statistical Application in. Finance. Ruey S. Tsay. Graduate School of Business. The University of Chicago
Recent Developments of Statistical Application in Finance Ruey S. Tsay Graduate School of Business The University of Chicago Guanghua Conference, June 2004 Summary Focus on two parts: Applications in Finance:
More informationMISSING DATA IMPUTATION IN CARDIAC DATA SET (SURVIVAL PROGNOSIS)
MISSING DATA IMPUTATION IN CARDIAC DATA SET (SURVIVAL PROGNOSIS) R.KAVITHA KUMAR Department of Computer Science and Engineering Pondicherry Engineering College, Pudhucherry, India DR. R.M.CHADRASEKAR Professor,
More informationPart 2: One-parameter models
Part 2: One-parameter models Bernoilli/binomial models Return to iid Y 1,...,Y n Bin(1, θ). The sampling model/likelihood is p(y 1,...,y n θ) =θ P y i (1 θ) n P y i When combined with a prior p(θ), Bayes
More informationData fusion with international large scale assessments: a case study using the OECD PISA and TALIS surveys
Kaplan and McCarty Large-scale Assessments in Education 2013, 1:6 RESEARCH Open Access Data fusion with international large scale assessments: a case study using the OECD PISA and TALIS surveys David Kaplan
More informationZHIYONG ZHANG AND LIJUAN WANG
PSYCHOMETRIKA VOL. 78, NO. 1, 154 184 JANUARY 2013 DOI: 10.1007/S11336-012-9301-5 METHODS FOR MEDIATION ANALYSIS WITH MISSING DATA ZHIYONG ZHANG AND LIJUAN WANG UNIVERSITY OF NOTRE DAME Despite wide applications
More informationThe Normal distribution
The Normal distribution The normal probability distribution is the most common model for relative frequencies of a quantitative variable. Bell-shaped and described by the function f(y) = 1 2σ π e{ 1 2σ
More informationLongitudinal Meta-analysis
Quality & Quantity 38: 381 389, 2004. 2004 Kluwer Academic Publishers. Printed in the Netherlands. 381 Longitudinal Meta-analysis CORA J. M. MAAS, JOOP J. HOX and GERTY J. L. M. LENSVELT-MULDERS Department
More information