Statistical Methods for Biomarker Discovery

Similar documents
ACUTE RESPIRATORY DISTRESS SYNDROME (ARDS) S. Agarwal, MD, S. Kache MD

THE RISK DISTRIBUTION CURVE AND ITS DERIVATIVES. Ralph Stern Cardiovascular Medicine University of Michigan Ann Arbor, Michigan.

Sepsis: Identification and Treatment

Inflammatory or cardiogenic lung edema? It does matter!

PSA Testing 101. Stanley H. Weiss, MD. Professor, UMDNJ-New Jersey Medical School. Director & PI, Essex County Cancer Coalition. weiss@umdnj.

The Initial and 24 h (After the Patient Rehabilitation) Deficit of Arterial Blood Gases as Predictors of Patients Outcome

"Statistical methods are objective methods by which group trends are abstracted from observations on many separate individuals." 1

1. What is the prostate-specific antigen (PSA) test?

TACO vs. TRALI: Recognition, Differentiation, and Investigation of Pulmonary Transfusion Reactions

Cancer Screening. Robert L. Robinson, MD, MS. Ambulatory Conference SIU School of Medicine Department of Internal Medicine.

NOVEL PLATFORMS FOR CANCER DIAGNOSIS

RISK STRATIFICATION for Acute Coronary Syndrome in the Emergency Department

Cohort Studies. Sukon Kanchanaraksa, PhD Johns Hopkins University

6/5/2014. Objectives. Acute Coronary Syndromes. Epidemiology. Epidemiology. Epidemiology and Health Care Impact Pathophysiology

Tips for surviving the analysis of survival data. Philip Twumasi-Ankrah, PhD

Measures of Prognosis. Sukon Kanchanaraksa, PhD Johns Hopkins University

Your Guide to Express Critical Illness Insurance Definitions

Mortality Assessment Technology: A New Tool for Life Insurance Underwriting

The 4Kscore blood test for risk of aggressive prostate cancer

Elevated Serum Lactate Dehydrogenase Values in Children with Multiorgan Involvements and Severe Febrile Illness

Freiburg Study. The other 24 subjects had healthy markers closer to what would be considered ideal.

The American Cancer Society Cancer Prevention Study I: 12-Year Followup

Omega-3 fatty acids improve the diagnosis-related clinical outcome. Critical Care Medicine April 2006;34(4):972-9

Appendix: Description of the DIETRON model

DISCLOSURES RISK ASSESSMENT. Stroke and Heart Disease -Is there a Link Beyond Risk Factors? Daniel Lackland, MD

ESCMID Online Lecture Library. by author

Clinical Research on Lifestyle Interventions to Treat Obesity and Asthma in Primary Care Jun Ma, M.D., Ph.D.

Detection and staging of recurrent prostate cancer is still one of the important clinical problems in prostate cancer. A rise in PSA or biochemical

MANCHESTER Lung Cancer Screening Program Dartmouth-Hitchcock Manchester 100 Hitchcock Way Manchester, NH (603)

Your Life Your Health Cariodmetabolic Risk Syndrome Part VII Inflammation chronic, low-grade By James L. Holly, MD The Examiner January 25, 2007

7. Prostate cancer in PSA relapse

The Trials and Tribulations of the CRM: the DFO Experience

Butler Memorial Hospital Community Health Needs Assessment 2013

The 4Kscore blood test for risk of aggressive prostate cancer

Prostatectomy, pelvic lymphadenect. Med age 63 years Mean followup 53 months No other cancer related therapy before recurrence. Negative.

Pulmonary interstitium. Interstitial Lung Disease. Interstitial lung disease. Interstitial lung disease. Causes.

Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Early Prostate Cancer: Questions and Answers. Key Points

The Berlin definition of Severe ARDS includes assessment of which of the following?

Testimony of. Dr. James Crapo. April 26, 2005

Prostate Health Index Literature

Prognostic impact of uric acid in patients with stable coronary artery disease

From AARC Protocol Committee; Subcommittee Adult Critical Care Version 1.0a (Sept., 2003), Subcommittee Chair, Susan P. Pilbeam

Appendix G - Identification and Selection of Studies

Birla Sun Life Insurance Riders. Birla Sun Life Insurance Company Limited

Prostate Cancer Screening. Dr. J. McCracken, Urologist

Biostatistics and Epidemiology within the Paradigm of Public Health. Sukon Kanchanaraksa, PhD Marie Diener-West, PhD Johns Hopkins University

Screening for Prostate Cancer

Saving healthcare costs by implementing new genetic risk tests for early detection of cancer and prevention of cardiovascular diseases

Stress is linked to exaggerated cardiovascular reactivity. 1) Stress 2) Hostility 3) Social Support. Evidence of association between these

Diseases. Inflammations Non-inflammatory pleural effusions Pneumothorax Tumours

PSA screening in asymptomatic men the debate continues keyword: psa

An Update on Lung Cancer Diagnosis

Examples of good screening tests include: mammography for breast cancer screening and Pap smears for cervical cancer screening.

Individual patient data meta-analysis of continuous diagnostic markers

Us TOO University Presents: Understanding Diagnostic Testing

African Americans & Cardiovascular Diseases

The Sepsis Puzzle: Identification, Monitoring and Early Goal Directed Therapy

MANAGEMENT OF LIPID DISORDERS: IMPLICATIONS OF THE NEW GUIDELINES

Supplemental Material. Paradoxical association of enhanced cholesterol efflux with increased incident cardiovascular risks

Specialty Excellence Award and America s 100 Best Hospitals for Specialty Care Methodology Contents

Trauma Audit & Research Network. CORE Screens user guide

4/4/2013. Mike Rizo, Pharm D, MBA, ABAAHP THE PHARMACIST OF THE FUTURE? METABOLIC SYNDROME AN INTEGRATIVE APPROACH

Population Health Management Program

Ventilation Perfusion Relationships

Clinically Actionable Biomarkers in Rheumatoid Arthritis

Page 1. Name: 1) Choose the disease that is most closely related to the given phrase. Questions 10 and 11 refer to the following:

L Lang-Lazdunski, A Bille, S Marshall, R Lal, D Landau, J Spicer

National Emphysema Treatment Trial (NETT) Consent for Screening and Patient Registry

Understanding CA 125 Levels A GUIDE FOR OVARIAN CANCER PATIENTS. foundationforwomenscancer.org

NCCN Prostate Cancer Early Detection Guideline

Guide to Biostatistics

PCA3 DETECTION TEST FOR PROSTATE CANCER DO YOU KNOW YOUR RISK OF HAVING CANCER?

Risk Factors for Fire Fighter Cardiovascular Disease

Coronary Artery Disease leading cause of morbidity & mortality in industrialised nations.

Advances in Diagnostic and Molecular Testing in Prostate Cancer

Comparing Functional Data Analysis Approach and Nonparametric Mixed-Effects Modeling Approach for Longitudinal Data Analysis

Early Detection Research Network: Validation Infrastructure. Jo Ann Rinaudo, Ph.D. Division of Cancer Prevention

Chronic Critical Illness: Can it be prevented? Carmen C Polito, MD Pulmonary & Critical Care Medicine Emory University Atlanta, GA cpolito@emory.

A Report on the Health and Cultural Status of Alaska Native Elders

Progress and Prospects in Ovarian Cancer Screening and Prevention

Marilyn Borkgren-Okonek, APN, CCNS, RN, MS Suburban Lung Associates, S.C. Elk Grove Village, IL

Lecture one lung pathology 4 th year MBBS. Dr Asgher Khan Demonstrator of pathology Rawalpindi Medical College Rwp.

Hindrik Vondeling Department of Health Economics University of Southern Denmark Odense, Denmark

PSA Screening and the USPSTF Understanding the Controversy

Estimated New Cases of Leukemia, Lymphoma, Myeloma 2014

Evaluation of Predictive Models

BUNDLES IN 2013: SURVIVING SEPSIS CAMPAIGN

Using Medicare Hospitalization Information and the MedPAR. Beth Virnig, Ph.D. Associate Dean for Research and Professor University of Minnesota

Overview. Geriatric Overview. Chapter 26. Geriatrics 9/11/2012

co-sponsored by the Health & Physical Education Department, the Health Services Office, and the Student Development Center

Prostate cancer statistics

TO SCREEN OR NOT TO SCREEN: THE PROSTATE CANCER

Prostate Cancer Screening in Taiwan: a must

Epi procolon The Blood Test for Colorectal Cancer Screening

Science Highlights. To PSA or not to PSA: That is the Question.

GT-020 Phase 1 Clinical Trial: Results of Second Cohort

chronic leukemia lymphoma myeloma differentiated 14 September 1999 Pre- Transformed Ig Surface Surface Secreted Myeloma Major malignant counterpart

C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters.

Term Critical Illness Insurance

Transcription:

An Application to Diagnosis and Prognosis of ARDS Division of Cancer Biostatistics Department of Biostatistics, Vanderbilt University School of Medicine Cancer Biostatistics Workshop April 17th, 2009

Acknowledgement Vanderbilt Clinical Proteomics Program Lorraine Ware Richard Fremont Addison May Biostatistics Dean Billheimer William Wu

ARDS Acute Respiratory Distress Syndrome Respiratory failure Caused by sepsis, pneumonia, pulmonary aspiration, trauma May cause hypoxemia and multiple organ failure 200,000 / year in US 30% to 50% mortality Diagnosis is based on clinical definition Bilateral infiltrates PaO 2 / FiO 2 < 300 Acute Lung Injury (ALI) PaO 2 / FiO 2 < 200 ARDS No evidence of heart failure as cause Bernard et al. Am J Respir Crit Care Med 1994; 149: 818-824

Biological markers in ALI/ARDS plasma / serum, pulmonary edema fluid, urine Markers of inflammation (IL6, IL8) oxidant stress lung epithelial / endothelial injury (RAGE, SPD, CC16, VWF) collagen deposition and altered coagulation and fibrinolysis (PCPIII) Rationale for studying a panel of biomarkers No single marker has been sufficient to accurately differentiate patients with ALI/ARDS from those at risk. ARDS is a complex syndrome characterized by injury to multiple cell types, inflammation and dysregulated coagulation and fibrinolysis

Study design : diagnostic panel of biomarkers Trauma prospective cohort study with 1,020 critically injured trauma patients ALI/ARDS incidence 20 to 30% Nested case control study: 85 patients with clear chest X ray 105 patients with ALI/ARDS selection criteria include availability of blood sample / CXR within 72 hours of study enrollment Goal is to develop a model that distinguishes ALI/ARDS patients from control patients, and more importantly, choose useful biomarkers from a panel of 21. VALID study (Validation of Acute Lung Injury Markers for Diagnosis): prospective ICU study with 2550 patients.

Study design : prognosis in ALI/ARDS 528 patients from the ARDS clinical trials network multi-center trial (the ALVEOLI study) 8 biomarkers (selected based on prior demonstration of their association with adverse outcomes in ALVEOLI study) Goal is to determine the prognostic value of a panel of plasma markers for death in patients with ARDS

Sensitivity and Specificity TRUTH + a b + TEST True Positive False Positive (Prediction) c d False Negative True Negative Sensitivity = a = P[Test + Truth + ] a + c Specificity = d = P[Test Truth ] b + d

Example: PCA3 and prostate cancer Benign Malignant 0 100 200 300 400

Example: PCA3 and prostate cancer Malignant group has elevated PCA3 levels. The cutoff of 35 was suggested. Malignant Benign PCA3 > 35 108 160 PCA3 35 45 423 Sensitivity = P[Test + Truth + ] = 108 108 + 45 = 0.71 Specificity = P[Test Truth ] = 423 160 + 423 = 0.73

Receiver Operating Characteristic In the previous example, if the cutoff point is not 35, sensitivity and specificity would be different. Malignant Benign PCA3 > 10 136 405 PCA3 10 17 178 Malignant Benign PCA3 > 150 13 11 PCA3 150 140 572 Sensitivity = 0.89 Specificity = 0.31 Sensitivity = 0.08 Specificity = 0.98

PCA3 and prostate cancer: ROC curve ROC curve is a plot of sensitivity and 1-specificity. 1.0 ROC curve 0.8 Sensitivity 0.6 0.4 0.2 0.0 0.0 0.2 0.4 0.6 0.8 1.0 1 Specificity

AUC: area under curve The plot of sensitivity and 1-specificity (P[false negative]) generally does not help you with choosing the optimal cutoff, but it can be used to assess the usefulness of the predictor.

ROC curve ROC curve 1.0 0.8 Sensitivity 0.6 0.4 The blue line corresponds to a perfect prediction. (AUC = 1) The red line corresponds to the worst case (coin flip). (AUC = 0.5) In this example, AUC (aka C-statistic) of PCA3 is 0.742. 0.2 0.0 PCA3 (AUC = 0.742) PSA (AUC = 0.506) 0.0 0.2 0.4 0.6 0.8 1.0 1 Specificity

Interpreting AUC The average value of sensitivity for all possible values of specificity. The average value of specificity for all possible values of sensitivity. The probability that the randomly selected condition + patient has a test result indicating greater suspicion than a randomly selected condition - patient. There is a connection between AUC and a Wilcoxon 2-sample rank-sum test.

Discrimination and calibration So far, we have looked at the model s ability to discriminate cases from controls. Usefulness of a model should also be judged on calibration. A model that assigns a predicted probability of 0.51 to all cases and 0.49 to all controls enjoys perfect discrimination, but it is not calibrated well. Predicted and actual probabilities are about the same in a well calibrated model. e.g., about 30% of people with 30% probability of death die.

Calibration Proportion dead 1.0 0.8 0.6 0.4 0.2 0.0 0.0 0.2 0.4 0.6 0.8 1.0 Predicted Probability

Calibration Dead 1.0 0.8 Proportion Dead 0.6 0.4 0.2 Alive 0.0 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 Predicted Probability of Death

Predicting death from ARDS We would like to develop a model with a few predictors that predicts death from ARDS fairly well. N = 528, (alive 384, dead 144) Consider the following models: Full model with 13 predictors Clinical variables only Biomarkers only Best model Some additional considerations: Interactions? Nonlinear effects?

Interactions? Spearman ρ 2 0.35 0.30 0.25 0.20 0.15 0.10 0.05 0.00 causetrauma log(spd) causemultran causeother causepneum causesepsis log(pai1) a.a pstat nfailsx1 nfailsx2+ log(protc) log(il6) log(il8) log(tnfr) log(icam) log(vwf) mortdead age

Nonlinear effects? dead Probability of death alive 20 30 40 50 60 70 80 90 Age

ROC Curves from various models ROC curves (Models without APACHE III) Sensitivity 0.0 0.2 0.4 0.6 0.8 1.0 Full (0.823) Biomarkers (0.756) ARMA (0.757) Reduced (0.811) Full model has 13 predictors. Biomarkers-only model has 8 predictors. Clinical-variables-only model has 5 predictors. Reduced model has 7 predictors (4 biomarkers). 0.0 0.2 0.4 0.6 0.8 1.0 1 Specificity

How to reduce from the full model? Not all decisions were statistical! ( We want to use 5 biomarkers at most... ) Univariable contribution suggested by backward elimination. (Bootstrap?) Computed AUC for all possible combinations of predictors

AUC 8,191 Area under ROC curve for various models n=13 n=78 n=286 n=715 n=1287 n=1716 n=1716 n=1287 n=715 n=286 n=78 n=13 n=1 0.814 0.808 0.80 0.801 0.790 0.777 1 Age 0.75 2 Age + IL8 0.734 3 Age + IL8 + SPD AUC 0.70 4 Age + IL8 + SPD + Organ failures 5 Age + IL8 + SPD + o.f. + PAI1 0.65 0.663 6 Age + IL8 + SPD + o.f. + PAI1 + Alveolar-arterial O 2 difference 7 Age + IL8 + SPD + o.f. + PAI1 + AA + TNFR 0.60 1 2 3 4 5 6 7 8 9 10 11 12 13 Number of variables in the model

So which model? We would like to have a model that is as good as the full model with fewer predictors. Is the reduced model good enough? How can we compare the models? Reclassification Table Cook NR, Use and misuse of the receiver operating characteristic curve in risk prediction. Circulation, 2007; 115: 928-935.

Reclassification table from Cook (2007) Close TABLE 3. Comparison of Observed and Predicted Risks Among Women in the Women's Health Study

Some improvements on reclassification table Put the predicted probabilities from Model 1 into a few categories like low risk, medium risk, high risk. Put the predicted probabilities from Model 2 into the same categories. Look at the cases that Model 2 reclassified into better categories (i.e., if the sample is from dead group, categorizing into medium risk is better than low risk ; if the sample is from alive group, medium risk is better than high risk ). Is Model 2 an improvement from Model 1?

Biomarker-only model as an example With a logisitic regression model, we compute the predicted probability of death for each patient. 1.0 From the model with 8 biomarkers 0.8 Predicted probability of death 0.6 0.4 0.2 Cutoff = 0.2 Sensitivity = 0.76 Specificity = 0.58 Cutoff = 0.4 Sensitivity = 0.47 Specificity = 0.86 0.0 alive dead

Biomarker-only Model and Final Model Biomarker Only Final Model Blue = Dead 0.0 0.2 0.4 1.0 0.0 0.2 0.4 1.0 Biomarker Only Final Model Blue = Alive 0.0 0.2 0.4 1.0 0.0 0.2 0.4 1.0

Problems with Reclassification Table Limited means of evaluating improvement in performance of the models Requires pre-set categories Beyond Reclassification Tables Statisticians should seek new ways, beyond the ROC curve, to evaluate clinically useful new markers for their ability to improve upon current models... [Greenland and O Malley 2005] People with and without events should be considered separately. Reclassification improvement = Proportion of upward movement in event group + Proportion of downward movement in non-event group We should compute the reclassification improvement without preset categories. Just imagine every single person is in his/her own category. Compare predicted probabilities of event.

Integrated Discrimination Improvement [Pencina et al. 2007] IDI is a measure of improvement of prediction that does not require pre-set categories. Compute the differences (new - old) in predicted probabilities of the event (e.g., death) in the event group and average them. If the new model is better, this will be positive. Compute the difference (new - old) in predicted probabilities of the event in the non-event group and average them. If the new model is better, this will be negative. IDI is the differences of these two measures.

More on IDI IDI does not require cutoffs. IDI takes magnitudes (predicted probabilities of an event) into account. Asymptotic distribution of IDI is simple, and testing H 0 : IDI = 0 (no improvement) is simple when sample size is large. IDI seems much more sensitive than AUC. Example in Pencina et al. Conventional model for coronary heart disease risk includes sex, diabetes, smoking, age, systolic blood pressure. New model includes HDL cholesterol. N = 3264 AUC increases from 0.762 to 0.774 (p-value = 0.092). IDI estimate is 0.009 (p-value = 0.008)

AUC and IDI Both AUC and IDI are a form of average sensitivity. AUC weights more heavily sensitivity corresponding to small cutoffs (large sensitivity). IDI uses a uniform weight. Assuming no change in specificity and a uniform increase in sensitivity by 0.01 at every cutoff point, both AUC and IDI will go up by 0.01. Is increase of 0.01 in average sensitivity meaningful?

Choosing the model Choosing the final model often requires decisions not based on statistical significance. (Difference between two models may be statistically significant with IDI but not with AUC.) The model should have good predictive ability (good sensitivity and specificity large AUC.) The model should be simple.

Nomogram from the final model Points 0 10 20 30 40 50 60 70 80 90 100 Age 18 29 39 50 65 75 80 86 98 316 745 7315 IL8 (day 0) 40 16 99 212 415 588 1045 SPD (day 0) 50 32 12 1 Number of organ failures at baseline 0 2+ 141 250 322 517 PAI1 (day 0) 61 13 Alveolar arterial O2 difference 75 116 208 286 426 589 TNFR (day 0) 4283 8484 17584 26696 48288 Total Points 0 50 100 150 200 250 300 350 Probability of Death 0.05 0.1 0.2 0.4 0.6 0.8 0.9 0.95

Fremont, Koyama, Calfee, Wu, Dossett, Bossert, Mitchell, Wickersham, Bernard, Matthay, May, Ware. Acute lung injury in patients with traumatic injuries: utility of a panel of biomarkers for diagnosis and pathogenesis under revision Ware, Koyama, Billheimer, Wu, Bernard, Thompson, Brower, Standiford, Martin, Mattheay, NHLBI ARDS Clinical Trials Network. Prognostic and pathogenetic value of combining clinical and biochemical indices in patients with acute lung injury under review Blum, Koyama, M Koma, Iturregui, Martinez-Ferrer, Uwamariya, Smith, Clark, Bhowmick. Chemokine markers predict biochemical recurrence of prostate cancer following prostatectomy. Clinical Cancer Research, 2008, 14(23): 7790-7797.

An Application to Diagnosis and Prognosis of ARDS Division of Cancer Biostatistics Department of Biostatistics, Vanderbilt University School of Medicine Cancer Biostatistics Workshop April 17th, 2009