Using Medical Research Data to Motivate Methodology Development among Undergraduates in SIBS Pittsburgh




Megan Marron and Abdus Wahed
Graduate School of Public Health

Outline
- My Experience
- Motivation for Furthering Statistical Knowledge
- Advanced Statistical Topic: Missing Data
  - Why is Missing Data a Problem?
  - Missing Data Mechanisms
  - Methods of Handling Missing Data
- Our Process
- Summary and Conclusion

My Experience
- 3rd-year PhD student in Biostatistics
- BS in Applied Statistics at Rochester Institute of Technology, 2011
- SIBS 2010 cohort
  - Center for Oral Health Research in Appalachia (COHRA) project
  - Used logistic regression to examine demographic variables that were associated with whether or not a subject had dental caries
  - Gave insight and advice on graduate school
- SIBS Teaching Assistant

My Experience
- Graduate Student Researcher for the NIMH-sponsored Center of Excellence in the Prevention and Treatment of Late Life Mood Disorders
- Clinical trials and observational studies in older adults
- What I do:
  - Attend scientific oversight meetings with PIs and collaborators
  - Consult with clinicians about their hypotheses
  - Develop analytic plans to answer their hypotheses
  - Analyze data from a variety of independently funded research projects
  - Assist clinicians in presenting their results
  - Prepare statistical methods and results for manuscripts

Motivation for Furthering Statistical Knowledge
- Statisticians are in high demand
- Researchers need to be aware of potential statistical issues
  - Ensure valid findings
- Limited background in statistics does not inhibit learning advanced topics
  - SIBS Pittsburgh has successfully demonstrated this over the past 4 years

Advanced Statistical Topic: Missing Data
- Not taught in introductory courses
  - Students are given perfect datasets or perform complete-case analysis
- Researchers should be familiar with:
  - Different types of missing data
  - Ways of handling missing data
  - How missing data can affect results
- Why they should be familiar with these concepts:
  - Save time and money
  - Accurate results that are unbiased with small standard errors
  - Nobody knows your study better than you
  - Fundamental to research

Why is Missing Data a Problem?
- Biased estimates
- Larger standard errors
- Loss of information

Missing Data Mechanisms
- An assumption about the nature of the missing values
  - Missing Completely at Random (MCAR)
  - Missing at Random (MAR)
  - Missing Not at Random (MNAR)

MCAR
- Probability of missing is independent of both observed and unobserved values
- Example: Weight Loss Study
  - Missing a record of weight due to the scale breaking that day
  - What we know: Subject had nothing to do with the scale breaking
  - Assumption: MCAR
  - Missingness has nothing to do with observed or unobserved measurements

MAR
- Probability of missing can be explained by observed data
- Example: Weight Loss Study
  - Participant drops out after a month
  - What we know: Their weight has been steadily increasing
  - Assumption: MAR
  - Missingness has to do with observed measurements

MNAR
- Probability of missing depends on the unobserved values
- Example: Weight Loss Study
  - Participant drops out after a month
  - What we know: Past weight measurements give no clue as to why they would drop out
  - What we do not know: Subject didn't come in because they weighed themselves at home and realized they had gained weight (unobserved)
  - Assumption: MNAR
  - Missingness has to do with unobserved measurements
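The three mechanisms can also be written as statements about the missingness probability. The notation below is an assumption added for illustration and does not come from the slides: R is the indicator of whether a value is missing, and the data are split into an observed part Y_obs and an unobserved part Y_mis. The block assumes the amsmath package.

```latex
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) = P(R) \\
\text{MAR:}\quad  & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) = P(R \mid Y_{\text{obs}}) \\
\text{MNAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) \neq P(R \mid Y_{\text{obs}})
\end{align*}
```

In the weight loss example, the broken scale corresponds to the first line, dropout explained by recorded weights to the second, and dropout driven by an unrecorded weight gain to the third.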

Methods of Handling Missing Data
- Complete Case Analysis
  - Delete all records that have missing values
  - Assumes MCAR
  - Loss of precision
- Inverse Probability Weighting
- Last Observation Carried Forward
- Multiple Regression Imputation
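Three of the methods listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the code used in SIBS; the function names, the use of `None` to mark a missing value, and the example numbers are all assumptions. The inverse probability weighting function takes the observation probabilities as given, whereas in practice they would be estimated from a model.

```python
from statistics import mean


def complete_case_mean(values):
    """Mean of the observed values only; None marks a missing value."""
    observed = [v for v in values if v is not None]
    return mean(observed)


def locf(series):
    """Last observation carried forward within one subject's visit sequence.

    Leading missing values stay None because there is nothing to carry.
    """
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def ipw_mean(values, prob_observed):
    """Weighted mean of observed values, each weighted by 1 / P(observed)."""
    pairs = [(v, 1.0 / p) for v, p in zip(values, prob_observed) if v is not None]
    total_weight = sum(w for _, w in pairs)
    return sum(v * w for v, w in pairs) / total_weight


# Hypothetical weights (kg) for one participant who drops out after visit 2.
weights = [82.0, 81.5, None, None]
print(locf(weights))
print(complete_case_mean(weights))
```

Complete-case analysis simply discards incomplete records, which is why it loses precision, while inverse probability weighting gives more weight to observed subjects who resemble those likely to be missing.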

Our Process:
1. Introduce advanced statistical concepts in a small-group setting
   - Actively involve trainees in collaborative research projects
2. Data analysis
   - Apply statistical techniques to Virahep-C data
3. Simulation
   - Show trainees what happens when changing certain conditions
4. Presentation
   - One of the best ways to learn something is to have to teach it to others

Our Process: 1. Introduce advanced statistical concepts in a small-group setting
- Missing data:
  - Different types
  - Why is it a problem?
  - Methods of handling each type
  - Potential impact on study results
  - Importance of justifying the type
  - Examples to differentiate between types

Our Process: 2. Data analysis: Virahep-C Study
- NIH/NIDDK-funded Study of Viral Resistance to Antiviral Therapy of Chronic Hepatitis C (Virahep-C)
- Background: African Americans (AA) with chronic Hepatitis C are less likely to respond to interferon-based antiviral treatment than Caucasian Americans (CA)
- Multicenter treatment trial with 196 AA and 205 CA
- Treatment: peginterferon and ribavirin

Our Process: 2. Data analysis: Virahep-C Study in SIBS
- Outcome: Change in log viral levels between week 12 and baseline
  - Contains missing values
- Objectives:
  1. Estimate the mean change in viral levels between week 12 and baseline and the mean difference between races
  2. Assess associations of baseline demographic and clinical variables with the change in viral level
- Address objectives using each technique for handling missing data
- Compare results obtained from each technique

Our Process: 3. Simulation: How to Create Missing Data
- MCAR:
  - Generate a binomial (0/1) random variable for each subject
  - If a subject got a 0, then the value for their outcome was deleted
- MAR:
  - Generate probabilities using a logistic model based on observed values (age, sex, and treatment)
  - Generate a Bernoulli random variable for each subject using their generated probability
  - If a subject got a 0, then the value for their outcome was deleted
- MNAR:
  - If a subject's outcome was greater than 65, then it was deleted
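The three deletion rules above can be sketched as follows. This is an illustrative reconstruction in standard-library Python, not the sample code given to trainees. The function names, the MCAR retention probability, the logistic coefficients, the 0/1 coding of sex and treatment, and the simulated data are all hypothetical. Only the overall recipe and the cutoff of 65 come from the slide.

```python
import math
import random


def make_mcar(outcomes, p_observed, rng):
    """Keep each outcome with the same probability, regardless of any data."""
    return [y if rng.random() < p_observed else None for y in outcomes]


def make_mar(outcomes, covariates, coefs, rng):
    """Keep each outcome with a probability from a logistic model on
    observed covariates (age, sex, treatment); coefs are hypothetical."""
    b0, b_age, b_sex, b_trt = coefs
    result = []
    for y, (age, sex, treatment) in zip(outcomes, covariates):
        linear = b0 + b_age * age + b_sex * sex + b_trt * treatment
        p_observed = 1.0 / (1.0 + math.exp(-linear))
        # A Bernoulli draw of 0 (not kept) deletes the outcome.
        result.append(y if rng.random() < p_observed else None)
    return result


def make_mnar(outcomes, cutoff=65):
    """Delete every outcome above the cutoff, so missingness depends on
    the value that goes unobserved."""
    return [None if y > cutoff else y for y in outcomes]


rng = random.Random(2024)
n = 1000
# Hypothetical subjects: age in years, sex and treatment coded 0/1.
covariates = [(rng.uniform(20, 70), rng.randint(0, 1), rng.randint(0, 1))
              for _ in range(n)]
outcomes = [rng.gauss(60, 10) for _ in range(n)]

for name, data in [
    ("MCAR", make_mcar(outcomes, 0.7, rng)),
    ("MAR", make_mar(outcomes, covariates, (3.0, -0.05, 0.5, 0.5), rng)),
    ("MNAR", make_mnar(outcomes)),
]:
    print(name, "missing:", sum(v is None for v in data))
```

Because the full outcome vector is kept alongside each version with deletions, the estimates from any method of handling missing data can be compared against the known truth.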

Our Process: 3. Simulation
- Modify sample code to examine how different methods of analysis can result in different conclusions
- Calculate relative bias and standard error to see when each type of missing data is a problem
- Benefit of a simulation:
  - True values are known
  - Type of missing data is known
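A small sketch of how relative bias and standard error might be computed in such a simulation is given below. It assumes relative bias is defined as (average estimate - true value) / true value and that the standard error is the standard deviation of the estimates across replications. The slides do not state the exact definitions used. The function names, sample sizes, the 0.7 retention probability, and the normal outcome with mean 60 are hypothetical choices, and the only analysis method shown is the complete-case mean.

```python
import random
from statistics import mean, stdev


def relative_bias(estimates, true_value):
    """(average estimate - truth) / truth across simulation replications."""
    return (mean(estimates) - true_value) / true_value


def empirical_se(estimates):
    """Standard deviation of the estimates across replications."""
    return stdev(estimates)


def simulate(mechanism, n_reps=200, n=200, true_mean=60.0, sd=10.0, seed=1):
    """Complete-case mean under a chosen missingness mechanism."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_reps):
        outcomes = [rng.gauss(true_mean, sd) for _ in range(n)]
        if mechanism == "MCAR":
            observed = [y for y in outcomes if rng.random() < 0.7]
        elif mechanism == "MNAR":
            observed = [y for y in outcomes if y <= 65]
        else:
            raise ValueError("mechanism must be 'MCAR' or 'MNAR'")
        estimates.append(mean(observed))
    return relative_bias(estimates, true_mean), empirical_se(estimates)


for mechanism in ("MCAR", "MNAR"):
    bias, se = simulate(mechanism)
    print(f"{mechanism}: relative bias = {bias:.4f}, SE = {se:.4f}")
```

Under these assumptions the complete-case mean is close to unbiased when values are deleted completely at random, and biased downward when large outcomes are the ones deleted.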

Our Process: 4. Presentation
- Teach other SIBS trainees and faculty:
  - Why missing data is a problem
  - Different types of missing data
  - Methods of handling missing data
  - How results differed under each method of analysis applied to the Virahep-C study
  - How results differed under each method of analysis using a simulation to create each type of missing data

Summary and Conclusion
- Our Process:
  1. Introduce advanced statistical concepts in a small-group setting
  2. Data analysis
  3. Simulation
  4. Presentation
- Using our project-based training program:
  - Advanced statistical topics can be taught to those with limited statistical preparation
  - Trainees were able to effectively explain techniques with useful examples that were easy to understand
  - They are better prepared for dealing with common problems in medical research
  - They gain an appreciation for statistical methods

Thank you for listening!
Contact Information: Megan Marron, mmm133@pitt.edu
Acknowledgment: at the University of Pittsburgh; National Heart, Lung, and Blood Institute