Imputation of missing data under missing not at random assumption & sensitivity analysis


S. Jolani
Department of Methodology and Statistics, Utrecht University, the Netherlands
Advanced Multiple Imputation, Utrecht, May 2013


Why missing not at random (MNAR)?
There might be a reason to believe that responders differ from non-responders, even after accounting for the observed information.
Some examples:
 - Income: some people may not reveal their salaries
 - Blood pressure: blood pressure is measured less frequently for patients with lower blood pressures
 - Depression: some patients might drop out because they believe the treatment is not effective

Notation
 - $Y$: incomplete variable
 - $R$: response indicator ($R = 1$ if $Y$ is observed)
 - $X$: fully observed covariate
 - $Y_{obs}$ and $Y_{mis}$: the observed and missing parts of $Y$

A general strategy
Under an MNAR assumption, $Y$ and $R$ must be modeled jointly (Rubin, 1976), so a model for the joint distribution $P(Y, R)$ is required.

Why does classical MI not work?
 - Imputation under MAR: $P(Y \mid X, R = 0) = P(Y \mid X, R = 1)$
 - Imputation under MNAR: $P(Y \mid X, R = 0) \neq P(Y \mid X, R = 1)$

Models for MNAR data
Two general approaches (there are some more):
1. Selection model (Heckman, 1976)
2. Pattern-mixture model (Rubin, 1977)

Selection model
$$P(Y, R; \xi, \omega) = P(Y; \xi)\, P(R \mid Y; \omega),$$
where the parameters $\xi$ and $\omega$ are a priori independent.
 - $P(Y; \xi)$: distribution for the full data
 - $P(R \mid Y; \omega)$: response mechanism (selection function)

Selection model: imputation model under MNAR
$$P(Y_{mis} \mid X, Y_{obs}, R) = \frac{P(Y_{mis} \mid X, Y_{obs})\, P(R \mid X, Y)}{\int P(Y_{mis} \mid X, Y_{obs})\, P(R \mid X, Y)\, dY_{mis}}$$

Selection model: drawing from the imputation model $P(Y_{mis} \mid X, Y_{obs}, R)$ under MNAR
A simple but possibly inefficient approach (Rubin, 1987):
1. Draw a candidate $Y_i^* \sim P(Y_i \mid X_i; \xi = \xi^*)$
2. Calculate $p_i = P(R_i = 1 \mid X_i, Y_i = Y_i^*; \omega)$
3. Draw $R_i^* \sim \mathrm{Ber}(1, p_i)$
4. Impute $Y_i^*$ if $R_i^* = 0$; otherwise return to (1)
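The four steps above amount to a rejection sampler, which can be sketched in base R. This is a minimal illustration, not code from the talk: the normal model for $Y$ given $X$, the logistic response model, the parameter values, and the function name impute_selection are all assumptions made for the example.

```r
# Sketch of the rejection sampler for imputation under a selection model.
# Assumptions (illustrative only):
#   Y | X ~ N(b0 + b1 * X, sigma^2) with parameters treated as already drawn,
#   P(R = 1 | X, Y) = plogis(omega[1] + omega[2] * X + omega[3] * Y).
impute_selection <- function(x, b0, b1, sigma, omega, max_tries = 10000) {
  for (k in seq_len(max_tries)) {
    y_star <- rnorm(1, mean = b0 + b1 * x, sd = sigma)        # step 1: candidate
    p <- plogis(omega[1] + omega[2] * x + omega[3] * y_star)  # step 2: P(R = 1)
    r_star <- rbinom(1, size = 1, prob = p)                   # step 3: draw R
    if (r_star == 0) return(y_star)                           # step 4: accept
  }
  stop("no candidate accepted; the response probability may be too close to 1")
}

set.seed(1)
omega <- c(0, 0, 1)   # response becomes more likely as Y increases
x_mis <- c(-1, 0, 1)  # covariate values of three non-respondents
y_imp <- vapply(x_mis, impute_selection, numeric(1),
                b0 = 0, b1 = 1, sigma = 1, omega = omega)

# Many draws at x = 0: accepted values come from P(Y | X, R = 0)
y_many <- replicate(2000, impute_selection(0, 0, 1, 1, omega))
mean(y_many)
```

Because candidates are accepted only when the simulated response indicator is 0, the accepted values favour small $Y$ under this response model. The sampler becomes slow when non-response is rare, which is the inefficiency the slide mentions.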

Pattern-mixture model
$$P(Y, R; \psi, \theta) = P(R; \psi)\, P(Y \mid R; \theta),$$
where the parameters $\psi$ and $\theta$ are a priori independent.
 - $P(Y \mid X, R = 1; \theta_1)$: distribution for the observed data
 - $P(Y \mid X, R = 0; \theta_0)$: distribution for the missing data
 - $P(R; \psi)$: marginal response probability

Pattern-mixture model: the general procedure (Rubin, 1977)
1. Draw $\theta_1$ from its posterior distribution using $P(Y \mid X, R = 1; \theta_1)$
2. Specify $P(\theta_0 \mid \theta_1)$ a priori (e.g., $\theta_0 = \theta_1 + k$ where $k$ is a fixed constant)
3. Draw $\theta_0 \sim P(\theta_0 \mid \theta_1)$
4. Impute $Y_{mis}$ from $P(Y \mid X, R = 0; \theta_0)$

An example
Suppose $Y$ is an incomplete (continuous) variable with
$$Y_{obs} \sim N(\mu_1, \sigma_1^2), \qquad Y_{mis} \sim N(\mu_0, \sigma_0^2),$$
where $\theta_1 = (\mu_1, \sigma_1^2)$ and $\theta_0 = (\mu_0, \sigma_0^2)$. Now define
$$\mu_0 = \mu_1 + k_1, \qquad \sigma_0^2 = k_2 \sigma_1^2,$$
where $k_1$ and $k_2$ are fixed and known values (sensitivity parameters).
Sensitivity analysis: repeat the analysis for different choices of $k_1$ and $k_2$.
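The normal example can be sketched in base R as follows. The data are simulated, not the data from the talk. The noninformative prior used to draw $(\mu_1, \sigma_1^2)$, the grid of $k_1$ and $k_2$ values, and the function name impute_pattern_mixture are assumptions made for illustration.

```r
# Sketch of pattern-mixture imputation for a single normal variable.
# Assumption: standard noninformative prior, so that
#   sigma_1^2 | y_obs ~ (n1 - 1) * s^2 / chisq(n1 - 1)
#   mu_1 | sigma_1^2, y_obs ~ N(mean(y_obs), sigma_1^2 / n1)
impute_pattern_mixture <- function(y, k1 = 0, k2 = 1) {
  obs <- !is.na(y)
  n1 <- sum(obs)
  # Step 1: draw theta_1 = (mu_1, sigma_1^2) from its posterior
  sigma2_1 <- (n1 - 1) * var(y[obs]) / rchisq(1, df = n1 - 1)
  mu_1 <- rnorm(1, mean(y[obs]), sqrt(sigma2_1 / n1))
  # Steps 2-3: theta_0 is a fixed function of theta_1
  mu_0 <- mu_1 + k1
  sigma2_0 <- k2 * sigma2_1
  # Step 4: impute the missing values from N(mu_0, sigma_0^2)
  y[!obs] <- rnorm(sum(!obs), mu_0, sqrt(sigma2_0))
  y
}

set.seed(2013)
y <- rnorm(200, mean = 150, sd = 25)  # simulated values
y[sample(200, 40)] <- NA              # 40 values set to missing

# Sensitivity analysis: repeat for several choices of k1 and k2
grid <- expand.grid(k1 = c(0, -10, -20), k2 = c(1, 1.5))
grid$mean_completed <- mapply(function(k1, k2) {
  mean(replicate(20, mean(impute_pattern_mixture(y, k1, k2))))
}, grid$k1, grid$k2)
print(grid)
```

With 20% of the values missing, shifting $k_1$ by 10 units moves the completed-data mean by about 2 units, so the grid shows directly how much the estimate depends on the untestable choice of $k_1$.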

Application: cohort study
 - N = 1236, aged 85+ on Dec. 1, 1986
 - N = 956 were visited (1987-1989)
 - Blood pressure (BP) is missing for 121 patients
 - Do anti-hypertensive drugs shorten life in the oldest old?
 - Scientific interest: mortality risk as a function of BP and age

Figure: Kaplan-Meier survival probability by response group (BP measured vs. BP missing), plotted against years since intake. Source: van Buuren et al. (1999)

From the data we see
 - Those with no BP measured die earlier
 - Those that die early and have no hypertension history have fewer BP measurements
Thus, imputations of BP under MAR could be too high. We need to lower the imputed values of BP and study the influence on the outcome.

A simple model to shift imputations
 - $Y$: BP
 - $X$: age, hypertension, haemoglobin, etc.
Specify $P(Y \mid X, R)$:
1. $Y = X\beta + \epsilon$ for $R = 1$
2. $Y = X\beta + \delta + \epsilon$ for $R = 0$
Combined formulation: $Y = X\beta + (1 - R)\delta + \epsilon$
$\delta$ cannot be estimated from the data (sensitivity parameter)

Figure: Numerical example comparing the selection model and the mixture model. Source: van Buuren et al. (1999)

How to impute under MNAR in MICE?
> library(mice)
> delta <- c(0, -5, -10, -15, -20)
> post <- mice(leiden85, maxit = 0)$post
> imp.all <- vector("list", length(delta))
> for (i in seq_along(delta)) {
+   d <- delta[i]
+   # post-processing command: shift the imputed BP values by d mmHg
+   cmd <- paste("imp[[j]][, i] <- imp[[j]][, i] +", d)
+   post["bp"] <- cmd
+   imp <- mice(leiden85, post = post, seed = i * 22, print = FALSE)
+   imp.all[[i]] <- imp
+ }

Numerical example (van Buuren et al., 1999), rows for BP 140 to 200 mmHg

BP (mmHg)
140           0.15   0.15   0.15   0.19
150           0.10   0.30   0.31   0.25
160           0.08   0.15   0.16   0.10
170           0.06   0.10   0.11   0.05
180           0.04   0.05   0.05   0.02
190           0.02   0.03   0.03   0.00
200           0.00   0.02   0.02   0.00
Mean (mmHg)   150    151.6  138.6

Sensitivity analysis
Table V. Mean and standard deviation of the observed and imputed blood pressures. The statistics of imputed BP are pooled over m = 5 multiple imputations.

                N     δ     SBP Mean  SBP SD  DBP Mean  DBP SD
Observed BP     835         152.9     25.7    82.8      13.1
Imputed BP      121   0     151.1     26.2    81.5      14.0
                121   -5    142.3     24.6    78.4      13.7
                121   -10   135.9     24.7    78.2      12.8
                121   -15   128.6     25.0    75.3      12.9
                121   -20   122.3     25.2    74.0      12.1

Source: van Buuren et al. (1999)

... to model A of the introduction. It was expected that multiple imputation would raise the estimates in comparison with the analysis based on the complete cases, but the results do not confirm this. Only slight differences in mortality exist, even among non-response models with very different δ's. It appears that, for this application, risk estimates are insensitive to the missing data and the various non-response models used to deal with them.

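Combined formulation: $Y = X\beta + (1 - R)\delta + \epsilon$
If $\epsilon \sim N(0, \sigma^2)$, then
$$Y_{obs} \sim N(X\beta, \sigma^2) \qquad (1)$$
$$Y_{mis} \sim N(X\beta + \delta, \sigma^2) \qquad (2)$$
and
$$\mathrm{logit}\{P(R = 1 \mid X, Y)\} = \log\left[\frac{P(R = 1)\, P(Y \mid X, R = 1)}{P(R = 0)\, P(Y \mid X, R = 0)}\right] = \psi_0 + \psi_1 Y + \psi_2 X, \qquad (3)$$
where $\psi_1 = -\delta/\sigma^2$, so that $\delta = -\psi_1 \sigma^2$.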
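The step from (1) and (2) to (3) can be written out as below. The derivation assumes the two normal densities share the variance $\sigma^2$ and that $P(R = 1)$ does not depend on $X$, as in the slide. Note the sign: the coefficient on $Y$ works out to $-\delta/\sigma^2$.

```latex
\begin{aligned}
\log\frac{P(Y \mid X, R = 1)}{P(Y \mid X, R = 0)}
  &= -\frac{(Y - X\beta)^2}{2\sigma^2} + \frac{(Y - X\beta - \delta)^2}{2\sigma^2} \\
  &= \frac{-2\delta\,(Y - X\beta) + \delta^2}{2\sigma^2} \\
  &= -\frac{\delta}{\sigma^2}\,Y + \frac{\delta}{\sigma^2}\,X\beta + \frac{\delta^2}{2\sigma^2},
\end{aligned}
\qquad\text{so}\qquad
\psi_1 = -\frac{\delta}{\sigma^2},\quad
\psi_2 = \frac{\delta\beta}{\sigma^2},\quad
\psi_0 = \log\frac{P(R = 1)}{P(R = 0)} + \frac{\delta^2}{2\sigma^2}.
```

A negative $\delta$ (missing values lower than observed ones) therefore gives a positive $\psi_1$: the probability of response increases with $Y$.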

Assume $P(R = 1 \mid X, Y)$ is known (unrealistic!)

Y     R   1 - P(R = 1 | X, Y)   R_1
200   1   0.00                  1
195   1   0.02                  1
183   1   0.06                  1
180   1   0.09                  1
176   1   0.10                  0
160   1   0.15                  0
140   1   0.20                  0
.     0   0.25                  1
.     0   0.30                  1
.     0   0.38                  0
.     0   0.42                  0
.     0   0.45                  0
.     0   0.50                  0

Gr   Y     R   R_1   E(Y | X, R, R_1)
1    200   1   1     $\mu_{11}$
     195   1   1
     183   1   1
     180   1   1
2    176   1   0     $\mu_{10}$
     160   1   0
     140   1   0
3    .     0   1     $\mu_{01}$
     .     0   1
4    .     0   0     $\mu_{00}$
     .     0   0
     .     0   0
     .     0   0

It can be shown that $\mu_{10} = \mu_{01}$ and $\mu_{11} - \mu_{10} = \mu_{01} - \mu_{00}$.
The idea?
 - Impute group 3 from group 2
 - Impute group 4 from groups 2 and 1

But in reality $P(R = 1 \mid X, Y)$ is unknown
Fully Conditional Specification:
 - $Y \sim P(Y \mid X, R, R_1)$
 - $R_1 \sim P(R_1 \mid X, Y)$
Figure: The schematic representation of the data

1. Impute the missing values of $Y$ initially
2. Draw $\dot{R}$ (the random indicator $R_1$) from a Bernoulli process, $\dot{R} \sim \mathrm{Ber}(1, \pi)$, where $\pi = P(R = 1 \mid X, Y)$
3. Using groups 1 and 2, estimate $\beta$ and $\delta$ from $E(Y \mid X, R = 1, R_1 = r_1) = X\beta + \delta(r_1 - 1)$, $r_1 = 0, 1$
4. Draw $\beta$ from its posterior distribution for a given prior for $\beta$
5. Predict the missing data for group 3 using $X\beta - \hat{\delta}$
6. Predict the missing data for group 4 using $X\beta - 2\hat{\delta}$
7. Impute the missing data by adding an appropriate amount of noise to the predicted values
8. Return to Step 2
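The iteration above can be sketched in base R for one incomplete variable and one covariate. This is a simplified illustration and not the source of mice.impute.ri(). The function name ri_sketch, the MAR regression used for the initial imputation, the logistic model used to estimate $\pi$, and the normal approximation to the posterior of $\beta$ are all assumptions. The mean model follows the parametrization $X\beta + \delta(r_1 - 1)$, so group 2 has mean $X\beta - \delta$.

```r
# Simplified sketch of the random-indicator iteration (illustrative only).
ri_sketch <- function(y, x, maxit = 10) {
  r <- as.integer(!is.na(y))   # response indicator R
  obs <- r == 1
  mis <- !obs
  # Step 1: initial imputation from a regression on observed cases (assumption)
  fit0 <- lm(y ~ x, subset = obs)
  b0 <- unname(coef(fit0))
  y[mis] <- b0[1] + b0[2] * x[mis] + rnorm(sum(mis), 0, summary(fit0)$sigma)
  for (it in seq_len(maxit)) {
    # Step 2: estimate pi = P(R = 1 | X, Y) by logistic regression, draw Rdot
    pi_hat <- fitted(suppressWarnings(glm(r ~ x + y, family = binomial)))
    rdot <- rbinom(length(y), 1, pi_hat)
    # Step 3: estimate beta and delta from groups 1 and 2 (cases with R = 1)
    z <- rdot - 1
    fit <- lm(y ~ x + z, subset = obs)
    est <- coef(fit)
    delta <- if (is.na(est["z"])) 0 else unname(est["z"])  # z may be constant
    # Step 4: draw beta (normal approximation to its posterior; assumption)
    keep <- c("(Intercept)", "x")
    V <- vcov(fit)[keep, keep]
    beta <- unname(est[keep]) + drop(t(chol(V)) %*% rnorm(2))
    # Steps 5-6: group 3 (Rdot = 1) is shifted by delta, group 4 by 2 * delta
    pred <- beta[1] + beta[2] * x
    shift <- ifelse(rdot == 1, 1, 2) * delta
    # Step 7: add residual noise to the predictions
    y[mis] <- (pred - shift)[mis] + rnorm(sum(mis), 0, summary(fit)$sigma)
  }                            # Step 8: return to step 2
  y
}

set.seed(85)
n <- 300
x <- rnorm(n)
y_full <- 150 + 10 * x + rnorm(n, 0, 15)        # simulated values
r <- rbinom(n, 1, plogis((y_full - 135) / 10))  # low values go missing more often
y_obs <- ifelse(r == 1, y_full, NA)
y_imp <- ri_sketch(y_obs, x)
c(observed = mean(y_obs, na.rm = TRUE), imputed = mean(y_imp[r == 0]))
```

A single call yields one completed data set. For multiple imputation the function would be called repeatedly, and a full implementation would also propagate the uncertainty in the residual variance.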

How to implement the drawn method in MICE?
> mice(data, meth = "ri")
The RI function:
> mice.impute.ri(y, ry, x, ri.maxit = 10, ...)
Note:
1. only for continuous variables (the current version)
2. the same covariates for both models (the current version)

Summary
 - Participants: 956
 - Observed BP: 835
 - Missing BP: 121
 - Imputation model: BP ~ sex, age, hypertension, haemoglobin, etc.
 - Missingness mechanism: logit{P(R = 1 | Y, X)} ~ BP, type of residence, ADL, previous hypertension, etc.
 - Number of iterations: 10
 - Number of multiple imputations: 50

Table: Mean and standard error (SE) for the systolic blood pressure using CC, MI and RI

Method   Total Mean   Total SE   Imputed Mean   Imputed SE
CC       152.893      0.892      -              -
MI       152.473      0.924      149.47         2.409
RI       151.075      1.109      139.06         2.438

An interesting result:
Using the RI method, we are able to estimate $\hat{\delta} = 139.1 - 152.9 = -13.8$. This value is very similar to the amount of the adjustment in van Buuren et al. (1999) based on a numerical example.

Figure: Effect of response mechanism on BP. Density of systolic BP (mmHg) in two panels: Leiden 85+ (the drawn method) and the numerical example (van Buuren et al., 1999)

A summary of imputation under MNAR
1. All methods for incomplete data under MNAR make unverified assumptions
2. Selection model: the distribution of the full data
3. Pattern-mixture model: the distribution of the missing data
4. RI method: the distribution of the selection function

General advice on MNAR
1. Why is the ignorability assumption suspect? (Why assume MNAR?)
2. Include as much data as possible in the model
3. Limit the possible non-ignorable alternatives