Analysis Issues II. Mary Foulkes, PhD Johns Hopkins University


This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this site. Copyright 2008, The Johns Hopkins University and Mary Foulkes. All rights reserved. Use of these materials permitted only in accordance with license rights granted. Materials provided "AS IS"; no representations or warranties provided. User assumes all responsibility for use, and all liability related thereto, and must independently review all materials for accuracy and efficacy. May contain materials owned by others. User is responsible for obtaining permissions for use from third parties as needed.


Section A: Missing Data

Causes of Missing Data
- Dropout
- Noncompliance with measurement procedure
  - Missed visit
  - Refused procedure
- Error
  - Test not done
  - Test done improperly
  - Results lost

Ideal Study
- No dropouts
- No noncompliance
- No errors
- No loss to follow-up

Big Worry
- Missing data may be associated with outcome
- The form of this association is unknown
- Failing to account for the (true) association may bias results

Bias
- Null hypothesis: no difference
- If the null hypothesis is true, but missingness systematically eliminates patients with poorer prognosis on one treatment arm, results will be biased towards the alternative hypothesis
- If results are biased in this way, the type I error (false positive rate) is inflated
- The presence of bias means significance tests are unreliable
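This inflation is easy to demonstrate with a short simulation. The sketch below (pure Python; the function names and parameter values are illustrative, not from the lecture) generates two arms under a true null hypothesis, deletes the worst outcomes from one arm, and counts how often a nominal 5% test rejects:

```python
import math
import random
import statistics

def z_test_pvalue(x, y):
    """Two-sided large-sample z-test for a difference in means."""
    diff = statistics.mean(x) - statistics.mean(y)
    se = math.sqrt(statistics.variance(x) / len(x) +
                   statistics.variance(y) / len(y))
    z = abs(diff) / se
    # standard normal CDF via the error function
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

def simulated_false_positive_rate(n=100, sims=1000, drop_frac=0.2, seed=1):
    """Simulate null trials (no true difference) in which the poorest
    outcomes on one arm go missing, and count nominal 5% rejections."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(sims):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]   # null is true
        b_obs = sorted(b)[int(drop_frac * n):]    # worst 20% "lost"
        if z_test_pvalue(a, b_obs) < 0.05:
            rejections += 1
    return rejections / sims

print(simulated_false_positive_rate())  # far above the nominal 0.05
```

Because missingness is tied to outcome, the empirical rejection rate lands well above 5% even though no true difference exists.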

Missing Data and the Intention-to-Treat (ITT) Principle
- ITT: analyze all randomized patients in the groups to which they were randomized
- What to do when data are unavailable?
- Implication of the ITT principle for design: collect all required data on all patients, regardless of compliance with treatment
- Thus, you avoid missing data

Dilemma
- Excluding patients with missing values can bias results and increase type I error (false positives)
- Collecting and analyzing outcome data on non-compliant patients may dilute results and increase type II error (false negatives)
- But we can compensate for this dilution by increasing the sample size
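The dilution side of the dilemma is simple arithmetic. A hypothetical sketch (the effect size and non-compliance rates below are made up for illustration):

```python
# Under ITT, non-compliant patients who receive none of the benefit
# pull the observed treatment effect toward zero.
def diluted_effect(true_effect, noncompliance_rate):
    """Observed ITT effect if non-compliers get none of the benefit."""
    return (1 - noncompliance_rate) * true_effect

for rate in (0.0, 0.2, 0.4):
    print(rate, diluted_effect(10.0, rate))
# the smaller the observed effect, the larger the sample needed
# to keep the same power
```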

Different Questions
- Is this treatment safe and effective when used as directed in a well-defined population?
- Is this drug safe and effective when used for its intended purpose by practicing physicians?

Relevance of Methodology
- Relevance is roughly a quadratic function of the amount missing
- Methodology matters little when...
  - No data are missing: analysis is straightforward
  - Lots of data are missing: no analysis is credible
- Methodology is more important when modest/moderate amounts of data are missing
  - Not enough to demolish credibility
  - Enough to potentially shift conclusions

Handling Missing Data: Design and Conduct
- Select primary endpoints that are more robust to missing data
  - Change from first to final visit
  - Slope
- Expend effort on collecting outcome data on all patients entered
  - Visit reminders
  - General patient encouragement
  - Collection of data regardless of compliance with treatment

Handling Missing Data: Analysis
- Exclude subjects with missing values
- Last Observation Carried Forward (LOCF)
- Group means
- Multiple imputation
- Best/worst case scenarios

Exclusion of Subjects
- Excluding subjects with missing values is the simplest approach
- Requires the assumption that excluded subjects are a random subset of all randomized subjects
- Analysis is straightforward
- The assumption noted above is difficult to justify in most instances

Exclusion: Source of Bias
- Randomization ensures that there are no systematic differences between treatment groups
- Excluding subjects from analysis on the basis of post-randomization events (e.g., non-compliance) may introduce systematic differences between treatment groups, thereby biasing the comparison

Imputation
- Determine a value that is the best guess of the true value of the missing data point
- Several approaches have been proposed and/or are in use
- The simplest approaches are statistically the most problematic
- Any approach involving made-up data is problematic to some degree

Imputation
- Last Observation Carried Forward (LOCF): use the last measurement available in patients with missing data after a certain point
- Group means: assign the average value of the outcome variable among those in that treatment group with complete data
- Multiple imputation*: predict the missing outcome on the basis of outcomes for other patients with similar characteristics

*Source: Rubin, D.B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: Wiley.
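The two simpler approaches above can be sketched in a few lines. The sketch below uses Python with `None` marking missing values; the helper names and toy data are illustrative, not a standard library API:

```python
def locf(visits):
    """Carry the last observed value forward into later missing visits."""
    filled, last = [], None
    for v in visits:
        if v is not None:
            last = v
        filled.append(last)   # stays None if nothing observed yet
    return filled

def group_mean_impute(final_values):
    """Replace missing final values with the mean of the observed ones
    (within one treatment group)."""
    observed = [v for v in final_values if v is not None]
    m = sum(observed) / len(observed)
    return [v if v is not None else m for v in final_values]

print(locf([140, 132, None, None]))       # [140, 132, 132, 132]
print(group_mean_impute([10, None, 14]))  # [10, 12.0, 14]
```

Multiple imputation is deliberately not sketched here: it requires a covariate model and repeated draws, which is exactly why it carries the stronger assumptions discussed next.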

Assumptions Required
- LOCF: the last available measure is a good estimate of the final (missing) measure
- Mean substitution: assigning the group average to missing values will result in a good estimate of the treatment effect
- Multiple imputation: one can create a model using data on covariates to accurately estimate the missing value

Problems with the Assumptions
- LOCF: data may be missing because of some aspect of treatment
  - Patient not responding
  - Treatment not tolerable, so that early data may not reflect the true treatment effect
- Mean substitution is like excluding patients with missing data, except that power is (artificially) maintained
- Multiple imputation accounts for variability in estimating missing values, but the assumption that we can predict the outcome accurately from measured covariates remains problematic

Bottom Line
- Assessments based on imputation models are reliable only insofar as one has confidence in the assumptions

Worst/Best Case Scenarios (Sensitivity Analysis)
- Neither exclusion nor imputation
- Provides bounds for the true results had all planned data points been observed
- Provides a sense of how far off any of the other analyses could be
- Does not provide a single answer, but aids in the interpretation of other answers
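For a binary endpoint, the bounds are easy to compute: assume every missing outcome was a failure (worst case) or a success (best case). A minimal sketch with hypothetical counts (the numbers are made up for illustration):

```python
def response_rate_bounds(successes, observed, missing):
    """Bounds on the response rate for a binary endpoint:
    worst case counts every missing outcome as a failure,
    best case counts every missing outcome as a success."""
    total = observed + missing
    worst = successes / total
    best = (successes + missing) / total
    return worst, best

# 30 responses among 40 observed patients, 10 patients missing
print(response_rate_bounds(successes=30, observed=40, missing=10))  # (0.6, 0.8)
```

If the treatment comparison holds up under both extremes, missing data cannot have changed the conclusion; if the bounds straddle the decision threshold, the other analyses should be read with caution.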

An Ounce of Prevention...
- Discourage dropout
- Encourage compliance
- Institute QC checks to minimize errors

Summary
- The less missing data there is, the less concern there is about...
  - Bias
  - Reliability of conclusions
  - Appropriate methods of analysis
- Methods to replace or otherwise account for missing data are all flawed in important ways
- Sensitivity analyses are essential to evaluating the reliability of conclusions

In the Next Section We'll Look at...
- Multiplicity
  - Multiple treatment arms
  - Multiple controls
  - Multiple subgroups

Section B: Multiplicity

Multiplicity in Trials
- Refers to the multiple judgments and inferences made in a trial
  - Hypothesis tests
  - Confidence intervals
  - Graphical analyses
- Leads to concern about false positives, i.e., inflation of the type I error

Example
- The chance (probability) of drawing the ace of clubs by randomly selecting a card from a complete, shuffled deck is 1/52
- The chance of drawing the ace of clubs at least once by randomly selecting a card from a complete, shuffled deck 100 times is...?
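Working the example through (each draw is from a complete, reshuffled deck, so the draws are independent):

```python
# P(at least one hit in 100 draws) = 1 - P(no hit in any draw)
p_single = 1 / 52
p_at_least_once = 1 - (1 - p_single) ** 100
print(round(p_at_least_once, 3))  # 0.857

# The same arithmetic, assuming 100 independent tests each run
# at alpha = 0.05, shows how multiple testing inflates the chance
# of at least one false positive:
print(round(1 - 0.95 ** 100, 3))  # 0.994
```

An event that is rare on any single draw becomes nearly certain across enough draws; the same applies to "significant" results across enough tests.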

Sources of Multiplicity
- Multiple treatment arms
  - Doses, regimens
  - Treatments
- Multiple controls
  - Internal
  - External (historical)
- Multiple evaluation groups
  - Evaluable subgroup
  - Per-protocol subgroup
  - Demographic and disease-based subgroups

Sources of Multiplicity (continued)
- Multiple endpoints
  - Efficacy variables
  - Evaluation time points
- Multiple analyses
  - Statistical tests
  - Dichotomization and cut-points
  - Approaches to covariates or missing data
- Multiple interim analyses of accumulating data
- Multiple studies, sets of studies

Young's Rules
- With enough testing, false positives will occur
- Internal evidence will not contradict a false positive result (i.e., don't expect to figure out which results are the false positives)
- Good investigators will come up with possible explanations
- "It only happens to the other guy"

Approaches to the Problem
- Do only one test
- Adjust the p-values
- Perform the desired tests at the nominal level and warn the reader that no account has been taken of multiple testing
- Ignore the problem

Bonferroni Adjustment
- The most common approach
- Can severely reduce power when many comparisons are made
- Conservative when comparisons are not independent
- Methods exist to account for inter-correlation of results in adjusting significance thresholds
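A minimal sketch of the adjustment (the function name and p-values are illustrative): with m comparisons, each hypothesis is tested at alpha/m so that the family-wise error rate stays at or below alpha.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject H_i only when p_i < alpha / m, controlling the
    family-wise error rate at alpha over m comparisons."""
    m = len(p_values)
    return [p < alpha / m for p in p_values]

print(bonferroni_reject([0.001, 0.02, 0.04]))  # [True, False, False]
```

Note that 0.02 and 0.04 would both be "significant" at the nominal 0.05 level but fail the adjusted threshold 0.05/3, which is the power loss the slide warns about.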

Multivariate Testing
- Mutually exclusive groups
- Multivariate ("global") tests assess whether all groups are similar
- If the conclusion of the global test is "no," then do pairwise tests
- If the conclusion of the global test is "yes," stop
- Problem: if all groups are similar but one, the power of a global test may be low
- Note: if multiple arms represent different doses, test for dose-response rather than heterogeneity
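The slides do not fix a particular global test; to make the first-stage logic concrete, the sketch below substitutes a permutation test of heterogeneity (my choice for illustration, not the lecture's method), using the spread of group means as the statistic:

```python
import random
import statistics

def mean_spread(groups):
    """Heterogeneity statistic: spread (max - min) of the group means."""
    means = [statistics.mean(g) for g in groups]
    return max(means) - min(means)

def global_permutation_test(groups, n_perm=2000, seed=0):
    """Global 'are all groups similar?' test by permutation: shuffle all
    observations across groups and see how often the shuffled spread of
    means is at least as large as the observed spread."""
    rng = random.Random(seed)
    observed = mean_spread(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        regrouped, start = [], 0
        for s in sizes:
            regrouped.append(pooled[start:start + s])
            start += s
        if mean_spread(regrouped) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)   # add-one permutation p-value

a = [5.1, 4.8, 5.3, 5.0, 4.9, 5.2]
b = [5.0, 5.2, 4.9, 5.1, 5.3, 4.8]
c = [6.0, 6.2, 5.9, 6.1, 6.3, 5.8]   # one arm clearly differs
print(global_permutation_test([a, b, c]))  # small p: reject "all similar"
```

A small global p-value licenses the follow-up pairwise comparisons; a large one stops the procedure, which is how the two-stage scheme limits the number of tests performed.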

Limiting Comparisons
- Select a single primary hypothesis; treat the others as exploratory
- Pre-specify the comparisons of interest
- Create composite variables
- Refrain from data-dredging

Multiplicity when Lumping or Splitting
- Splitting a trial with focus on the positive subgroup generally leads to a misleading result
  - ISIS-2 trial [Lancet. (1988). 349-360]: the subgroup result for patients born under the astrological signs Gemini or Libra came out slightly unfavorable for aspirin
  - However, the subgroup result for all other astrological signs favored aspirin with p-value < 0.00001
- Lumping trials (say, given two trials) only when at least one of them does not give a statistically significant result inflates the type I error rate

Source: Lu and Huque. (2001). Proper planning for multiplicity, replicability and clinical validity at the design stage is necessary.

Some Final Comments
- Address multiplicity issues at the design stage
- Different questions require different approaches
- Clinical subset decision rules involving multiple endpoints inflate the type I error rate and require adjustment
- With multiple-step procedures, interpretation of conditional results and computation of confidence intervals can be problematic
- Testing large families of hypotheses using methods such as the FDR (False Discovery Rate) should be considered an exploratory strategy

In the Next Lecture We'll Look at...
- Non-inferiority
  - Active control trials
  - Specifying delta
  - Assay sensitivity
  - Potential problems