Survival tree and meld to predict long term survival in liver. transplantation waiting list

Similar documents
Evaluation and Prognosis of Patients with Cirrhosis

Effect of Pretransplant Serum Creatinine on the Survival Benefit of Liver Transplantation

Model-Based Recursive Partitioning for Detecting Interaction Effects in Subgroups

PREVENTION OF HCC BY HEPATITIS C TREATMENT. Morris Sherman University of Toronto

Binary Logistic Regression

Methods for Interaction Detection in Predictive Modeling Using SAS Doug Thompson, PhD, Blue Cross Blue Shield of IL, NM, OK & TX, Chicago, IL

Data mining on the EPIRARE survey data

Questions and Answers for Transplant Candidates about the New Kidney Allocation System

A New Priority Policy for Patients with Hepatocellular Carcinoma Awaiting Liver Transplantation Within the Model for End-Stage Liver Disease System

Center Activity (07/01/ /30/2010) Center Region United States Tables for More Information

{ram.gopal,

Benchmarking Open-Source Tree Learners in R/RWeka

Statistical Data Mining. Practical Assignment 3 Discriminant Analysis and Decision Trees

Estimated GFR Based on Creatinine and Cystatin C

How To Find Seasonal Differences Between The Two Programs Of The Labor Market

Handling missing data in large data sets. Agostino Di Ciaccio Dept. of Statistics University of Rome La Sapienza

Where Will my New Kidney Come From?

BURDEN OF LIVER DISEASE IN BRAZIL

A Study to Predict No Show Probability for a Scheduled Appointment at Free Health Clinic

Acute on Chronic Liver Failure: Current Concepts. Disclosures

Chapter 5 Analysis of variance SPSS Analysis of variance

Dealing with Missing Data

Liver Transplantation for Hepatocellular Carcinoma. John P. Roberts, MD Chief, Division of Transplant Service University of California, San Francisco

D-optimal plans in observational studies

Cirrhosis and HCV. Jonathan Israel M.D.

TACKLING SIMPSON'S PARADOX IN BIG DATA USING CLASSIFICATION & REGRESSION TREES

ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS

Approach to Abnormal Liver Tests

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Multivariate Analysis of Variance. The general purpose of multivariate analysis of variance (MANOVA) is to determine

Linear Models in STATA and ANOVA

Least Squares Estimation

*6816* 6816 CONSENT FOR DECEASED KIDNEY DONOR ORGAN OPTIONS

EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER. Copyr i g ht 2013, SAS Ins titut e Inc. All rights res er ve d.

Statistics Graduate Courses

Survival Analysis of the Patients Diagnosed with Non-Small Cell Lung Cancer Using SAS Enterprise Miner 13.1

Management of hepatitis C: pre- and post-liver transplantation. Piyawat Komolmit Bangkok

After the Cure: Long-Term Management of HCV Liver Disease Norah A. Terrault, MD, MPH

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.

The Probit Link Function in Generalized Linear Models for Data Mining Applications

Introduction to General and Generalized Linear Models

How does a kidney transplant differ from dialysis?

Surveillance for Hepatocellular Carcinoma

Data, Measurements, Features

Linear Threshold Units

Regression Modeling Strategies

THE BENEFITS OF LIVING DONOR KIDNEY TRANSPLANTATION. feel better knowing

INDIAN STATISTICAL INSTITUTE announces Training Program on Statistical Techniques for Data Mining & Business Analytics

Organizing Your Approach to a Data Analysis

Impact of MELD-Based Allocation on End-Stage Renal Disease After Liver Transplantation

Therapy of decompensated cirrhosis Pre-transplant for HBV and HCV

partykit: A Toolkit for Recursive Partytioning

Negative Binomials Regression Model in Analysis of Wait Time at Hospital Emergency Department

BACKGROUND MEDIA INFORMATION Fast facts about liver disease

II. DISTRIBUTIONS distribution normal distribution. standard scores

Chi-square test Fisher s Exact test

STATISTICA Formula Guide: Logistic Regression. Table of Contents

Decision Trees from large Databases: SLIQ

LOGISTIC REGRESSION ANALYSIS

SAS Software to Fit the Generalized Linear Model

Hepatocellular Carcinoma Treatment Decision Tree

User Guide. A. Program Summary B. Waiting List Information C. Transplant Information

The Natural History of Hepatitis C Cirrhosis After Liver Transplantation

In Proceedings of the Eleventh Conference on Biocybernetics and Biomedical Engineering, pages , Warsaw, Poland, December 2-4, 1999

This can dilute the significance of a departure from the null hypothesis. We can focus the test on departures of a particular form.

Multivariate Logistic Regression

Tips for surviving the analysis of survival data. Philip Twumasi-Ankrah, PhD

Handling attrition and non-response in longitudinal data

David Axelrod, MD, MBA Associate Professor of Surgery Section Chief Solid Organ Transplantation Dartmouth Hitchcock Medical Center

NP/PA Clinical Hepatology Fellowship Summary of Year-Long Curriculum

23/06/2014. Nutrition in Chronic Liver Disease. Objectives. Question #1

Multivariate Normal Distribution

ENHANCED CONFIDENCE INTERPRETATIONS OF GP BASED ENSEMBLE MODELING RESULTS

The State of the Liver in the Adult Patient after Fontan Palliation

IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA

Package dsmodellingclient

Elevated heart rate at twelve months after heart transplantation is an independent predictor of long term mortality

Association Between Variables

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Patterns of abnormal LFTs and their differential diagnosis

training programme in pharmaceutical medicine Clinical Data Management and Analysis

Learning Example. Machine learning and our focus. Another Example. An example: data (loan application) The data and the goal

Machine learning of patient similarity: a case study on predicting survival in cancer patient after locoregional chemotherapy.

MULTIPLE REGRESSION WITH CATEGORICAL DATA

Does referral from an emergency department to an. alcohol treatment center reduce subsequent. emergency room visits in patients with alcohol

Additional sources Compilation of sources:

5. A full binary tree with n leaves contains [A] n nodes. [B] log n 2 nodes. [C] 2n 1 nodes. [D] n 2 nodes.

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

Strategies for Identifying Students at Risk for USMLE Step 1 Failure

Social Media Mining. Data Mining Essentials

Early mortality rate (EMR) in Acute Myeloid Leukemia (AML)

Case Study in the Management of Patients with Hepatocellular Carcinoma

Gerry Hobbs, Department of Statistics, West Virginia University

MEDICAL DECISION SUPPORT IN CLINICAL RECORD MANAGEMENT SYSTEMS. Stefano Rizzi DEIS - Facoltˆ di Ingegneria, Universitˆ di Bologna, Italy

E3: PROBABILITY AND STATISTICS lecture notes

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Missing data and net survival analysis Bernard Rachet

Learning Gaussian process models from big data. Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu

Transcription:

Survival tree and to predict long term survival in liver transplantation waiting list Emília Matos do Nascimento a, Basilio de Bragança Pereira a,b, *, Samanta Teixeira Basto b, Joaquim Ribeiro Filho b a Federal University of Rio de Janeiro, COPPE - Postgraduate School of Engineering, Rio de Janeiro, Brazil b Federal University of Rio de Janeiro, School of Medicine and HUCFF - University Hospital Clementino Fraga Filho, Rio de Janeiro, Brazil Abstract Background: Many authors have described MELD as a predictor of short-term mortality in the liver transplantation waiting list. However MELD score accuracy to predict long term mortality has not been statistically evaluated. Objective: The aim of this study is to analyze the MELD score as well as other variables as a predictor of long-term mortality using a new model: the Survival Tree analysis. Study Design and Setting: The variables obtained at the time of liver transplantation list enrollment and considered in this study are: sex, age, blood type, body mass index, etiology of liver disease, hepatocellular carcinoma, waiting time for transplant and MELD. Mortality on the waiting list is the outcome. Exclusion, transplantation or still in the transplantation list at the end of the study are censored data. Results: The graphical representation of the survival trees showed that the most statistically significant cut off is related to MELD score at point 6. Conclusion: The results are compatible with the cut off point of MELD indicated in the clinical literature. Keywords: Survival tree; Conditional inference trees; Recursive partitioning; MELD; Liver transplantation waiting list; Long term mortality prediction * Corresponding author. Tel.: +55--56594 E-mail address: basilio@hucff.ufrj.br (B.B. Pereira)

What is New - MELD score cut off to predict long term mortality in liver transplantation waiting list was statistically evaluated for the first time. - Survival Analysis Tree and MELD was used to predict long term mortality.. Introduction The Model for End-Stage Liver Disease (MELD) score was described as a short term mortality index used to predict three month mortality in patients who underwent transjugular intrahepatic portosystemic shunt (TIPS) insertion []. It was subsequently applied to allocate liver grafts in liver transplantation list in the United States and several countries, since February []. Many countries use subjective local criteria or UNOS based policy to allocate liver grafts according to liver disease severity [3]. In Brazil, liver transplantation waiting list was organized according to a chronological system until June, 6 [4]. The liver transplantation waiting list time varies significantly among various centers but usually reflect a gap between the donor liver pool and the demand for transplant [5]. The longer waiting time results in a higher mortality rate [6]. It is important to identify those patients with the worst outcome. There are several factors related to liver transplantation waiting list mortality as age, gender, blood type and disease etiology [7]. Many authors have described MELD as an independent tool related to short term mortality in the transplantation waiting list and tried to determine a threshold to assess prognosis and mortality in this setting [8,9]. However MELD score accuracy to predict long term mortality has not been statistically evaluated in the past. The aim of this study was to analyze the MELD score as a predictor of long term mortality using a Survival Analysis Tree and to establish a MELD cut off point that better predicts this long term mortality. Cut off points of other covariates are also evaluated and their interactions with MELD is also analyzed. The data base and methods are presented in section. Section 3 presents the recursive partitioning method. The results and conclusion are presented in sections 4 and 5 respectively.

. Data base and methods From November 997 to July 6, all patients in the liver transplantation waiting list were evaluated for inclusion in the study. Patients with incomplete data for MELD calculation were excluded and 59 were included. Data were obtained from the patient inclusion registration form and from the hospital s internal system of patient registration (Medtrack) and organized in excel for posterior analysis. The variables obtained at the time of liver transplantation list enrollment and considered in this study are: sex, age, blood type, body mass index, etiology of liver disease, hepatocellular carcinoma, waiting time for transplant (in days) and MELD. The formula for the MELD score [] is 3.8*log e (bilirubin[mg/dl]) +.*log e (INR) + 9.6*log e (creatinine [mg/dl]) + 6.4*(etiology: if cholestatic or alcoholic, otherwise). From the 59 patients in the data base, 6% were male. The mean age was 5±3 years old. The most frequent etiology for liver disease was chronic hepatitis C (47%), alcoholic liver disease (7%) and cryptogenic (%). Regarding general outcome, 36% died, and 64% are censored, from which 8% left the transplant list, 4% had been submitted to a liver transplant, and 4% are still in list. The statistical approach used is the Survival Tree developed by Hothorn et al. []. The implementation was done using R [] packages []. 3. Recursive partitioning A learning set L consists of m covariates X = (X,..., X m ) of a sample space χ = χ... χ m and a response Y of a sample space Y. Let it be a learning set L used to form a predictor ϕ(x, L), i.e., if the input is x the answer y will be predicted by ϕ(x, L). So, the conditional distribution D(Y X) of the response given covariates X depends on a function f of the covariates D(Y X) = D(Y X,..., X m ) = D (Y f(x,..., X m )), with the restriction that the partition is based on the regression relationships so that the covariate space χ = partitioned in r disjoint cells B,..., B r. U r k= Bk are 3

The regression model will be fitted based on a learning sample L n composed of n independent and identically distributed observations. Hothorn et al.[] used regression models describing the conditional distribution of a response Y given the status of m covariates through the tree-structured recursive partitioning and formulated a generic algorithm for recursive binary partitioning for a given learning sample L n using non-negative valued case weights w = (w,..., w n ). Each node of a tree is represented by a vector of non-zero case weights if the corresponding observations are elements of the node and zero otherwise. The association between the response Y and covariates X j, j =,..., m is measured by the following linear statistics T j T pjq ( i j ji ( i n ) ) R n ( L, w) = vec w g ( X ) h Y, = ( Y,..., Y ) n i where g j : X j R pj is a non-random transformation of the covariate X j. h: Y x Y n R q is the influence function that depends on the responses (Y,...,Y n ) in a permutation symmetric way. vec is the operator that convert a p j x q matrix into a p j q column vector by column-wise combination. The distribution of T j (L n, w) under the partial hypotheses H j depends on the joint distribution of Y and X j, which is unknown under almost all practical circumstances. This principle leads to test procedures known as permutation tests. The majority of the algorithms for the construction of classification or regression trees algorithm follow a general rule []: ) Partition the observations by univariate splits in a recursive way. ) Fit a constant model in each cell of the resulting partition. Hothorn et al. [] implemented the conditional inference trees which embed recursive binary partitioning into the well defined theory of permutation tests developed by Strasser and Weber [3]. These are the steps of the algorithm [4]: ) Test the global null hypothesis of independence between any of the input variables and the response (which may be multivariate as well). Stop if this hypothesis cannot be rejected. Otherwise 4

select the input variable with strongest association to the response. This association is measured by a p-value corresponding to a test for the partial null hypothesis of a single input variable and the response. ) Implement a binary split in the selected input variable. 3) Recursively repeats steps ) and ). The algorithm stops if the global null hypothesis of independence between the response Y and any of the m covariates cannot be reject at a pre-specified nominal level α. Otherwise the association between the response and each of the m covariates is measured by test statistics or P-values that indicate the deviations from the partial hypotheses H j. 4. Results This section presents a graphical representation of the survival tree for the 59 patients in the liver transplantation waiting list using R [] packages []. P-values correspond to the log-rank test. In Figure one can observe the MELD cut off at point 6. This survival tree also presents some other cut offs statistically significant. p <. 6 > 6 p =.3 5 p =.4 > 3 > 3 Node 3 (n = ) Node 4 (n = 6) Node 6 (n = 4) Node 7 (n = 8).8.8.8.8.6.6.6.6.4.4.4.4 6 4 6 4 6 4 6 4 Figure Survival Tree (MELD) 5

Figure shows that the first important cut off is related to MELD at point 6 and also presents the interaction with age where the cut off corresponds to 33. years. p <. 6 > 6 p =.6 5 age p =. > 33.6 > 33.6 Node 3 (n = ) Node 4 (n = 6) Node 6 (n = ) Node 7 (n = 6).8.8.8.8.6.6.6.6.4.4.4.4 6 4 6 4 6 4 6 4 Figure Survival Tree (Interaction between MELD and age) Figure 3 shows again that the principal cut off is corresponding to MELD at point 6 and also the relevant interaction with hepatocellular carcinoma (HCC). p <. 6 > 6 p =.6 7 p =.47 > 4 hcc p =.5 3 > 3 yes no Node 3 (n = ) Node 5 (n = 3) Node 6 (n = 47) Node 8 (n = 4) Node 9 (n = 8).8.8.8.8.8.6.6.6.6.6.4.4.4.4.4 6 4 6 4 6 4 6 4 6 4 Figure 3 Survival Tree (Interaction between MELD and HCC) 6

Finally, Figure 4 presents a decision cut off with three variables: MELD (cut off at point 6), age, and hepatocellular carcinoma (HCC). The other variables in the data base did not show any interaction with MELD. p <. 6 > 6 p =.9 7 age p =.3 > 4 hcc p =. 33.6 > 33.6 yes no Node 3 (n = ) Node 5 (n = 3) Node 6 (n = 47) Node 8 (n = ) Node 9 (n = 6).8.8.8.8.8.6.6.6.6.6.4.4.4.4.4 6 4 6 4 6 4 6 4 6 4 Figure 4 Survival Tree (Interaction between MELD, age and HCC) 5. Conclusion The optimal cut off point to MELD score in the clinical literature is around 6 or 7 [8,5]. This has been confirmed in all survival trees presented in this paper where the cut off point to MELD score based on the data is 6. Our statistical results reinforce the cut off point indicated in the clinical literature. Authors' contributions Nascimento EM (M.Sc.) was responsible for the implementation of the R packages; Nascimento EM (M.Sc.) and Pereira B de B (Ph.D.) analyzed the data; Basto ST (M.D.) and Ribeiro Filho J (M.D.) collected and organized the data. All authors read and approved the final manuscript. 7

References [ ] Kamath PS, Wiesner RH, Malinchoc M et al. A model to predict survival in patients with endstage liver disease. Hepatology ; 33(): 464-7. [ ] Edwards EB, Harper AM. Application of a continuous disease severity score to the OPTN liver waiting list. Clin Transpl : 9-4. [ 3] Fink MA, Angus PW, Gow PJ et al. Liver transplant recipient selection: MELD vs. clinical judgment. Liver Transpl 5; (6): 6-6. [ 4] Ministério da Saúde. MS - Portaria nº.6/6, in Diário Oficial da União (6). [ 5] Zapata R, Innocenti F, Sanhueza E et al. Clinical characterization and survival of adult patients awaiting liver transplantation in Chile. Transplant Proc 4; 36(6): 669-7. [ 6] Ransford R, Gunson B, Mayer D, Neuberger J, Christensen E. Effect on outcome of the lengthening waiting list for liver transplantation. Gut ; 47(3): 44-3. [ 7] Fink MA, Berry SR, Gow PJ et al. Risk factors for liver transplantation waiting list mortality. J Gastroenterol Hepatol 7; (): 9-4. [ 8] Adler M, DeGendt E, Vereerstraeten P et al. Value of the MELD score for the assessment of preand post-liver transplantation survival. Transplant Proc 5; 37(6): 863-4. [ 9] Lee YM, Fernandes M, DaCosta M et al. The MELD score may help to determine optimum time for liver transplantation. Transplant Proc 4; 36(): 357-9. [] Hothorn T, Hornik K, Zeileis A. Unbiased Recursive Partitioning: A Conditional Inference Framework. Journal of Computational and Graphical Statistics 6; 5(3): 65-74. [] R Development Core Team (8). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-95-7-. Available from: http://www.r-project.org. [] Everitt, B.S.; Hothorn, T. A Handbook of Statistical Analyses Using R. Boca Raton, FL: Chapman & Hall/CRC Press; 6. 75 p. [3] Strasser H, Weber C. On the Asymptotic Theory of Permutation Statistics. Vienna: Vienna University of Economics and Business Administration; Report No. 7. Adaptive Information Systems and Modelling in Economics and Management Science. [Internet] 999 [cited 8 Jun 8

7]. 8 p. Available from: http://epub.wu-wien.ac.at/dyn/virlib/wp/eng/mediate/epub-wu- _94c.pdf?ID=epub-wu-_94c [4] Hothorn T, Hornik K, Zeileis A. A computational framework for conditional inference with an application to unbiased recursive partitioning. [Internet] 5 [cited 8 Jun 7]. Available from: http://www.imbe.med.uni-erlangen.de/~hothorn/talks/statgenomicssemth.pdf. [5] Degré D, Bourgeois N, Boon N et al. Aminopyrine breath test compared to the MELD and Child-Pugh scores for predicting mortality among cirrhotic patients awaiting liver transplantation. Transpl Int 4; 7(): 3-8. 9