Missing data and net survival analysis Bernard Rachet


1 Workshop on Flexible Models for Longitudinal and Survival Data with Applications in Biostatistics, Warwick, July 2015. Missing data and net survival analysis. Bernard Rachet
2 General context
Population-based, routine data: cancer registry data and clinical data (tumour, treatment, comorbidity).
Cancer survival and the roles played by patient, tumour and healthcare factors.
(Very) large data sets, but incomplete information, which we have handled using a multiple imputation procedure with Rubin's rules.
3 Preliminary results of ongoing work
4 Multiple imputation procedure
Under the Missing At Random (MAR) assumption:
1. Impute the missing data from $f(Y_M \mid Y_O)$ to give K complete data sets
2. Fit the substantive model to each of the K data sets, to obtain K estimates of the parameters and estimates of their variance
3. Combine them using Rubin's rules
5 Multiple imputation steps
Imputation: incomplete data to K completed data sets. Analysis: K completed data sets to K analysis results. Pooling: K analysis results to final results.
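6 Pooling K estimates: Rubin's rules
Given K completed data sets, there are K estimates $\hat\beta_k$ with variances $\hat\sigma_k^2$, $k = 1, \dots, K$.
Pooled estimate: $\hat\beta_{MI} = \frac{1}{K} \sum_{k=1}^{K} \hat\beta_k$
Total variance: $\hat V_{MI} = \hat W + \left(1 + \frac{1}{K}\right) \hat B$
Within-imputation variance: $\hat W = \frac{1}{K} \sum_{k=1}^{K} \hat\sigma_k^2$
Between-imputation variance: $\hat B = \frac{1}{K-1} \sum_{k=1}^{K} (\hat\beta_k - \hat\beta_{MI})^2$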
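As a rough illustration of Rubin's rules (pooled estimate as the mean of the K estimates, total variance as within-imputation variance plus (1 + 1/K) times between-imputation variance), here is a minimal Python sketch. The function name `pool_rubin` and the input numbers are made up for the example; they are not from the talk.

```python
def pool_rubin(estimates, variances):
    """Pool K estimates and their variances with Rubin's rules.

    Returns (pooled estimate, within variance, between variance, total variance).
    """
    k = len(estimates)
    if k < 2 or k != len(variances):
        raise ValueError("need K >= 2 estimates and one variance per estimate")
    pooled = sum(estimates) / k
    within = sum(variances) / k
    between = sum((e - pooled) ** 2 for e in estimates) / (k - 1)
    total = within + (1 + 1 / k) * between
    return pooled, within, between, total


# Hypothetical log excess hazard ratios from K = 3 imputed data sets.
pooled, within, between, total = pool_rubin([0.50, 0.60, 0.55],
                                            [0.010, 0.012, 0.011])
```

With only three imputations the between-imputation term is poorly estimated; the studies described later in the talk use 30 imputations.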
7 Multiple imputation procedure: congeniality
1. Imputation model congenial with substantive model
2. Given the substantive model $f(Y \mid X)$, $f(Y \mid X)\, g(X)$ is a congenial imputation model if both $f$ and $g$ are correctly specified
3. Valid inference (under MAR) if $f(Y \mid X)\, g(X)$ (approximately) represents data structure and substantive model
8 Concepts and measures of interest
Aims: prognosis of a cancer and impact at population level.
Concepts: excess hazard, excess hazard ratio, net survival, crude probabilities of death from cancer and other causes.
Relative survival data setting: population-based data; expected mortality hazard from life tables, by single year of age and sex, and calendar year, geography, deprivation.
9 Nur et al: settings
Population-based cohort of colorectal cancer patients.
Complete information on age, sex, follow-up time, vital status, deprivation, comorbidity and surgical treatment.
Tumour stage, morphology and grade: 45% incomplete data.
Relative survival data setting: $\lambda(x) = \lambda_P(x) + \exp(x\beta)$
Substantive model: generalised linear model (Dickman et al, Stat Med 2005). Output: excess hazard ratio (+ Ederer2 relative survival).
Link function: $\log(\mu_j - d_{Pj}) = \log(y_j) + x\beta$, with $\log(y_j)$ as offset.
$d_j \sim \text{Poisson}(\mu_j)$; $\mu_j = \lambda_j y_j$; $y_j$: person-time at risk; $d_{Pj}$: expected number of deaths (life tables).
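The link function on this slide implies $\mu_j = d_{Pj} + y_j \exp(x\beta)$. The sketch below writes out the corresponding Poisson log-likelihood in plain Python, only to make the structure concrete; it is not the fitting routine used in the paper, and the function name and the single-stratum data are hypothetical.

```python
import math


def excess_hazard_loglik(beta, strata):
    """Poisson log-likelihood with mu_j = d_Pj + y_j * exp(x_j . beta).

    strata: iterable of (deaths, expected_deaths, person_time, covariates).
    """
    total = 0.0
    for deaths, expected, person_time, x in strata:
        linear = sum(b * xi for b, xi in zip(beta, x))
        mu = expected + person_time * math.exp(linear)
        total += deaths * math.log(mu) - mu - math.lgamma(deaths + 1)
    return total


# One hypothetical stratum: 10 deaths observed, 4 expected from life tables,
# 100 person-years at risk, intercept-only model.
strata = [(10, 4.0, 100.0, [1.0])]
# The likelihood peaks where mu equals the observed deaths,
# i.e. excess rate (10 - 4) / 100 = 0.06 per person-year.
beta_hat = [math.log(0.06)]
loglik_at_mle = excess_hazard_loglik(beta_hat, strata)
```

In practice the model would be fitted with GLM software using a user-defined link; the hand-written likelihood is only meant to show where the expected deaths enter.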
10 Data description
Table columns: variable, category, patients (No., %).
Stage: I, II, III, IV, Missing (39.5%)
Morphology: Adenocarcinoma, Mucinous and serous, Other, Neoplasm NOS (11.6%)
Grade: I, II, III/IV, Missing (25.0%)
Missing information associated with: older ages; more deprived categories; less treatment with curative intent; higher probability of death.
11 Missing information in several variables
Multiple imputation using Full Conditional Specification (chained equations; van Buuren, 1999).
Same basic assumptions as in multiple imputation.
Assumes a joint (multivariate) distribution exists without specifying its form:
$f(Y_{i,1}, Y_{i,2}, \dots, Y_{i,p}) = f(Y_{i,p} \mid Y_{i,1}, \dots, Y_{i,p-1}) \, f(Y_{i,p-1} \mid Y_{i,1}, \dots, Y_{i,p-2}) \cdots f(Y_{i,2} \mid Y_{i,1}) \, f(Y_{i,1})$
Imputation model (joint model for the data): $Y \sim N(\beta, \Omega)$
Gibbs sampler to: 1. estimate the parameters in the joint imputation model; 2. impute the missing data.
Multivariate problem split into a series of univariate problems.
12 Imputation models
Outcomes: ordinal regression for stage and grade; polytomous regression for morphology.
Covariables: the other two covariables with incomplete information; sex, age, deprivation, comorbidity, treatment, cancer site; vital status; follow-up time (years) as a piecewise function (0, 0.5, 1, 2, 3, 4, 5, 5+); time-dependent effects (categorical) for deprivation and age.
Substantive (excess hazard) model includes all these variables, with (binary) time-dependent effects.
13 Results
Table columns: variable, category, patients (No., %), data after imputation (%).
Stage: I, II, III, IV, Missing (39.5%)
Morphology: Adenocarcinoma, Mucinous and serous, Other, Neoplasm NOS (11.6%)
Grade: I, II, III/IV, Missing (25.0%)
Missing information associated with: older ages; more deprived categories; less treatment with curative intent; higher probability of death.
14 Results
[Table: excess hazard ratios (EHR) with 95% CI by stage (I, II, III, IV, Missing), for complete-case analysis and for multiple imputation, by period since diagnosis over which the EHR was estimated: five years, first year, second to fifth years.]
Other results, indicator approach: systematically underestimates variance of EHRs; overestimates EHRs for tumour morphology; underestimates EHRs for age and deprivation; does not identify time-dependent effects.
15 Stage-specific survival
[Figure: relative survival (%) by years since diagnosis. Before imputation: stages I, II, III, IV and missing. After imputation: stages I, II, III, IV.]
16 Limitations
Tutorial paper: no systematic evaluation.
Relatively simple substantive model: piecewise model, categorical variables.
Further recent methodological developments in multiple imputation and in net survival (flexible modelling).
More systematic evaluation: simulations.
17 Concepts and measures of interest
Excess hazard: $\lambda_E(t) = \lambda_O(t) - \lambda_P(t)$, with
$\hat\lambda_O(t)\,dt = \frac{dN^w(t)}{Y^w(t)}$; $\hat\lambda_P(t)\,dt = \frac{\sum_{i=1}^{n} Y_i^w(t)\, \lambda_{Pi}(t)\,dt}{Y^w(t)}$
Weights: $w_i(t) = \frac{1}{S_{Pi}(t)}$, where $S_{Pi}(t)$ is the expected probability of surviving up to $t$.
Net survival: $S_E(t) = \exp\left(-\int_0^t \lambda_E(u)\,du\right)$
Crude mortality: $F_C(t) = \int_0^t S_O(u)\, \lambda_E(u)\,du$
18 Modelling approach
Flexible multivariable excess hazard model.
Excess hazard: time-dependent and non-linear effects (splines).
Variables affecting both mortality processes (cancer and other causes of death) included in the model.
Net survival is the mean of individual net survival functions predicted by the model.
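To make the relation between excess hazard, net survival and crude mortality concrete, here is a small numerical sketch in Python. It takes an excess hazard and an expected hazard on a time grid, treats the overall hazard as their sum, and integrates with the trapezoidal rule. It illustrates the definitions for a single homogeneous group only, not the weighted estimator above, and the constant hazards are invented.

```python
import math


def cumulative_trapezoid(times, values):
    """Running trapezoidal integral of values over times, starting at 0."""
    out = [0.0]
    for i in range(1, len(times)):
        step = times[i] - times[i - 1]
        out.append(out[-1] + 0.5 * step * (values[i] + values[i - 1]))
    return out


def net_survival_and_crude(times, excess_hazard, expected_hazard):
    """Return (net survival, crude probability of cancer death) on the grid."""
    cum_excess = cumulative_trapezoid(times, excess_hazard)
    cum_expected = cumulative_trapezoid(times, expected_hazard)
    net = [math.exp(-h) for h in cum_excess]
    # Overall survival uses the total hazard: excess plus expected.
    overall = [math.exp(-(he + hp)) for he, hp in zip(cum_excess, cum_expected)]
    integrand = [s * lam for s, lam in zip(overall, excess_hazard)]
    crude = cumulative_trapezoid(times, integrand)
    return net, crude


# Hypothetical constant hazards per year over 5 years of follow-up.
times = [i * 0.01 for i in range(501)]
net, crude = net_survival_and_crude(times, [0.10] * 501, [0.05] * 501)
```

The crude probability is always below 1 minus net survival, because some patients die of other causes before the cancer can kill them.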
19 Multiple imputation procedure: congeniality
1. Imputation model congenial with substantive model
2. Given the substantive model $f(Y \mid X)$, $f(Y \mid X)\, g(X)$ is a congenial imputation model if both $f$ and $g$ are correctly specified
3. Valid inference (under MAR) if $f(Y \mid X)\, g(X)$ (approximately) represents data structure and substantive model
4. Problematic within net survival setting and with non-linear and time-dependent effects
20 Falcaro et al, 2015: study settings
Data: 44,461 men diagnosed with colorectal cancer, followed up to 2009.
Age at diagnosis (continuous), tumour stage (4 categories), deprivation (5 categories).
Missing stage: 30%, with $R_i = 1$ if stage is missing, under three mechanisms:
MCAR: $\text{logit} \Pr(R_i = 1 \mid Z_i) = \delta_0$
MAR on X: $\text{logit} \Pr(R_i = 1 \mid Z_i) = \alpha_0 + \alpha_1 (\text{age}_i - 60)$
MAR: $\text{logit} \Pr(R_i = 1 \mid Z_i) = \gamma_0 + \gamma_1 (\text{age}_i - 60) + \gamma_2 T_i + \gamma_3 D_i$
100 simulated data sets per scenario.
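The three missingness mechanisms can be sketched as follows in Python. This is an illustration only: the coefficient values are invented (apart from choosing the MCAR intercept to give 30% missing), and reading $T_i$ as survival time and $D_i$ as the death indicator is an assumption, not something stated on the slide.

```python
import math
import random


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def missing_probability(mechanism, age, surv_time, died, coef):
    """Pr(R = 1), i.e. probability that stage is missing, for one patient."""
    if mechanism == "MCAR":
        lp = coef["delta0"]
    elif mechanism == "MAR_X":
        lp = coef["alpha0"] + coef["alpha1"] * (age - 60)
    elif mechanism == "MAR":
        # surv_time and died stand in for T_i and D_i (assumed meaning).
        lp = (coef["gamma0"] + coef["gamma1"] * (age - 60)
              + coef["gamma2"] * surv_time + coef["gamma3"] * died)
    else:
        raise ValueError("unknown mechanism: " + mechanism)
    return expit(lp)


def simulate_missingness(records, mechanism, coef, seed=2015):
    """records: list of (age, surv_time, died); returns a list of 0/1 flags R."""
    rng = random.Random(seed)
    return [1 if rng.random() < missing_probability(mechanism, *rec, coef) else 0
            for rec in records]


# Hypothetical coefficients; delta0 = logit(0.3) gives 30% missing under MCAR.
coef = {"delta0": math.log(0.3 / 0.7),
        "alpha0": -1.0, "alpha1": 0.03,
        "gamma0": -1.0, "gamma1": 0.03, "gamma2": -0.1, "gamma3": 0.5}
records = [(55 + i % 30, 1.0 + i % 10, i % 2) for i in range(5000)]
flags = simulate_missingness(records, "MCAR", coef)
mcar_rate = sum(flags) / len(flags)
```

In a real simulation the intercepts of the two MAR models would be tuned so that the overall proportion missing is also about 30%.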
21 Distribution on fully observed data and empirical expected distribution in remaining complete records
22 Substantive model
Flexible log cumulative excess hazard model: $\ln \Lambda_E(t \mid x_i) = s_1(\ln t; \gamma_1, k_1) + \beta x_i + s_2(\text{age}_i; \gamma_2, k_2)$
Flexible functions: restricted cubic splines.
Baseline excess hazard: 5 df, 4 internal knots and 2 boundary knots.
Age (continuous): 3 df, 2 internal knots.
Covariables: deprivation and stage.
Aims: estimate effect of stage (log EHR) and stage-specific net survival at 1, 5 and 10 years since diagnosis.
23 Imputation models
Outcome (stage): ordinal or multinomial logistic regression.
Covariables: survival time and log(survival time), or Nelson-Aalen estimate of the cumulative hazard; event indicator; age (splines defined as in the substantive model); deprivation (dummy variables).
30 imputations.
Net survival: Rubin's rules applied on $\log(-\log S_E(t))$ to obtain approximate normality, then back-transformed.
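Pooling net survival on the log(-log) scale can be sketched as below. The variance on the transformed scale is obtained here with the delta method, and the interval uses a normal quantile rather than Rubin's t reference distribution; both choices are simplifying assumptions of this sketch, and the input values are invented.

```python
import math


def pool_net_survival(survivals, std_errors, z=1.96):
    """Pool K net survival estimates on the log(-log S) scale.

    Returns (pooled survival, lower limit, upper limit) after back-transformation.
    """
    k = len(survivals)
    thetas = [math.log(-math.log(s)) for s in survivals]
    # Delta method: Var[log(-log S)] is approximately Var(S) / (S * log S)^2.
    variances = [(se / (s * math.log(s))) ** 2
                 for s, se in zip(survivals, std_errors)]
    theta_mi = sum(thetas) / k
    within = sum(variances) / k
    between = sum((t - theta_mi) ** 2 for t in thetas) / (k - 1)
    half_width = z * math.sqrt(within + (1 + 1 / k) * between)

    def back(theta):
        return math.exp(-math.exp(theta))

    # A larger theta means a lower survival, so the limits swap on back-transformation.
    return back(theta_mi), back(theta_mi + half_width), back(theta_mi - half_width)


# Hypothetical 5-year net survival estimates from three imputed data sets.
pooled_s, lower_s, upper_s = pool_net_survival([0.60, 0.62, 0.58],
                                               [0.02, 0.02, 0.02])
```

Working on this scale keeps the back-transformed interval inside (0, 1), which pooling the survival probabilities directly would not guarantee.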
24 Multiple imputation strategy
Strategy | Functional form | How survival is modelled in the imputation
MI_ologit_surv | Ordinal logistic | Survival time and log survival time
MI_ologit_na | Ordinal logistic | Nelson-Aalen estimate of cumulative hazard
MI_mlogit_surv | Multinomial logistic | Survival time and log survival time
MI_mlogit_na | Multinomial logistic | Nelson-Aalen estimate of cumulative hazard
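Two of the strategies use the Nelson-Aalen estimate of the cumulative hazard as a covariate in the imputation model. A minimal Python sketch of that estimator, evaluated at each patient's own follow-up time, is given below; the function name and the toy data are illustrative only.

```python
def nelson_aalen(times, events):
    """Nelson-Aalen cumulative hazard at each subject's own follow-up time.

    times: follow-up times; events: 1 if death observed, 0 if censored.
    """
    cumulative = 0.0
    hazard_at = {}
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        cumulative += deaths / at_risk
        hazard_at[t] = cumulative
    return [hazard_at[t] for t in times]


# Four hypothetical patients: deaths at 1, 2 and 3 years, one censored at 2 years.
na_values = nelson_aalen([1, 2, 2, 3], [1, 1, 0, 1])
```

The quadratic loop is fine for a sketch; for registry-sized data a survival library's estimator would be the practical choice.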
25 Results: bias in log excess hazard ratio estimates for stage (reference stage 1), 100 replications. Poor results with ordered logit even under MCAR scenario.
26 Stage-specific net survival at 1 year, 100 replications
27 Results: bias in stage-specific net survival estimates at 1 year, 100 replications
28 Comments
Promising results, even though the parameter estimated in the substantive model (here excess hazard) does not correspond to the final outcome of interest (net survival).
Limitations: no time-dependent effects of stage.
Which joint model?
Which variables in the imputation models? Vital status; Nelson-Aalen estimates of cumulative hazard; interactions with time since diagnosis (age at diagnosis, deprivation ...); other relevant interactions (tumour stage, region ...); other factors (treatment variables, comorbidities, hospital volume, surgeon's experience ...).
29 Limitations and challenges: preliminary study
Simulated data set: colon cancer, 12,048 men followed up for at least 5 years.
Baseline excess hazard: 5 df, 4 internal knots.
Covariables: stage, deprivation, age.
Time-dependent effects of stage: 2 df, 1 internal knot for each higher stage.
Non-linear effects of age: 3 df, 2 internal knots.
Substantive model: $\ln \Lambda_E(t \mid x_i) = s_1(\ln t; \gamma_1, k_1) + \beta x_i + s_2(\text{age}_i; \gamma_2, k_2) + s_{3j}(\text{stage}_j \times t; \gamma_3, k_3)$
Missing stage simulated as in previous example: 100 data sets per scenario, with 30% missing stage. Focus on MAR here.
30 Limitations and challenges: preliminary study
[Figure: net survival function by time (years) and stage, complete data versus MAR.]
Simulation of missingness mechanisms as in previous example. Same imputation model was applied (multinomial, Nelson-Aalen).
31 Results: excess hazard ratios for stage. Tumour stage 2 (reference stage 1). [Figure: true EHR, complete-case EHRs and imputed EHRs by time since diagnosis (years).]
32 Results: excess hazard ratios for stage. Tumour stage 3 (reference stage 1). [Figure: true EHR, complete-case EHRs and imputed EHRs by time since diagnosis (years).]
33 Results: excess hazard ratios for stage. Tumour stage 4 (reference stage 1). [Figure: true EHR, complete-case EHRs and imputed EHRs by time since diagnosis (years).]
34 Results: stage-specific net survival. [Figure: net survival by tumour stage against time since diagnosis (years).]
35 Results: stage-specific net survival. [Figure: net survival by tumour stage against time since diagnosis (years).]
36 Results: stage-specific net survival. [Figure: net survival by tumour stage against time since diagnosis (years).]
37 Results: stage-specific net survival. [Figure: net survival by tumour stage against time since diagnosis (years).]
38 Conclusion and development
Why MI?
Strength: clear division between imputation and analysis stages; both efficiency and MAR plausibility increased.
Challenge: incompatibility between imputation and substantive models leads to asymptotically biased estimates.
Define joint model for flexible excess hazard models.
Multiple imputation by fully conditional specification with the substantive-model-compatible algorithm (SMC-FCS): Bartlett JW et al. Statistical Methods in Medical Research 2015.
39 References
Little RJA, Rubin DB. Statistical Analysis with Missing Data. New York: John Wiley & Sons.
Van Buuren S, Boshuizen HC, Knook DL. Multiple imputation of missing blood pressure covariates in survival analysis. Stat Med 1999; 18:
White IR, Royston P. Imputing missing covariate values for the Cox model. Stat Med 2009; 28:
Nur U, Shack LG, Rachet B, Carpenter JR, Coleman MP. Modelling relative survival in the presence of incomplete data: a tutorial. Int J Epidemiol 2010; 39:
Carpenter JR, Kenward MG. Multiple imputation and its application. Chichester: John Wiley & Sons.
Falcaro M, Nur U, Rachet B, Carpenter JR. Estimating excess hazard ratios and net survival when covariate data are missing: strategies for multiple imputation. Epidemiology 2015; 26:
Bartlett JW, Seaman SR, White IR, Carpenter JR. Multiple imputation of covariates by fully conditional specification: accommodating the substantive model. Stat Methods Med Res 2015; 24:
More informationIntroduction to Analysis Methods for Longitudinal/Clustered Data, Part 3: Generalized Estimating Equations
Introduction to Analysis Methods for Longitudinal/Clustered Data, Part 3: Generalized Estimating Equations Mark A. Weaver, PhD Family Health International Office of AIDS Research, NIH ICSSC, FHI Goa, India,
More informationChecking proportionality for Cox s regression model
Checking proportionality for Cox s regression model by Hui Hong Zhang Thesis for the degree of Master of Science (Master i Modellering og dataanalyse) Department of Mathematics Faculty of Mathematics and
More informationMissing data are ubiquitous in clinical research.
Advanced Statistics: Missing Data in Clinical Research Part 1: An Introduction and Conceptual Framework Jason S. Haukoos, MD, MS, Craig D. Newgard, MD, MPH Abstract Missing data are commonly encountered
More informationIncorrect Analyses of Radiation and Mesothelioma in the U.S. Transuranium and Uranium Registries Joey Zhou, Ph.D.
Incorrect Analyses of Radiation and Mesothelioma in the U.S. Transuranium and Uranium Registries Joey Zhou, Ph.D. At the Annual Meeting of the Health Physics Society July 15, 2014 in Baltimore A recently
More informationSP10 From GLM to GLIMMIXWhich Model to Choose? Patricia B. Cerrito, University of Louisville, Louisville, KY
SP10 From GLM to GLIMMIXWhich Model to Choose? Patricia B. Cerrito, University of Louisville, Louisville, KY ABSTRACT The purpose of this paper is to investigate several SAS procedures that are used in
More informationMultinomial and Ordinal Logistic Regression
Multinomial and Ordinal Logistic Regression ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,
More informationModule 14: Missing Data Stata Practical
Module 14: Missing Data Stata Practical Jonathan Bartlett & James Carpenter London School of Hygiene & Tropical Medicine www.missingdata.org.uk Supported by ESRC grant RES 189250103 and MRC grant G0900724
More informationTravel Distance to Healthcare Centers is Associated with Advanced Colon Cancer at Presentation
Travel Distance to Healthcare Centers is Associated with Advanced Colon Cancer at Presentation Yan Xing, MD, PhD, Ryaz B. Chagpar, MD, MS, Y Nancy You MD, MHSc, Yi Ju Chiang, MSPH, Barry W. Feig, MD, George
More information7.1 The Hazard and Survival Functions
Chapter 7 Survival Models Our final chapter concerns models for the analysis of data which have three main characteristics: (1) the dependent variable or response is the waiting time until the occurrence
More informationMissing data in randomized controlled trials (RCTs) can
EVALUATION TECHNICAL ASSISTANCE BRIEF for OAH & ACYF Teenage Pregnancy Prevention Grantees May 2013 Brief 3 Coping with Missing Data in Randomized Controlled Trials Missing data in randomized controlled
More informationSampling Error Estimation in DesignBased Analysis of the PSID Data
Technical Series Paper #1105 Sampling Error Estimation in DesignBased Analysis of the PSID Data Steven G. Heeringa, Patricia A. Berglund, Azam Khan Survey Research Center, Institute for Social Research
More informationEfficient and Practical Econometric Methods for the SLID, NLSCY, NPHS
Efficient and Practical Econometric Methods for the SLID, NLSCY, NPHS Philip Merrigan ESGUQAM, CIRPÉE Using Big Data to Study Development and Social Change, Concordia University, November 2103 Intro Longitudinal
More informationAnalysis of Longitudinal Data with Missing Values.
Analysis of Longitudinal Data with Missing Values. Methods and Applications in Medical Statistics. Ingrid Garli Dragset Master of Science in Physics and Mathematics Submission date: June 2009 Supervisor:
More informationKomorbide brystkræftpatienter kan de tåle behandling? Et registerstudie baseret på Danish Breast Cancer Cooperative Group
Komorbide brystkræftpatienter kan de tåle behandling? Et registerstudie baseret på Danish Breast Cancer Cooperative Group Lotte Holm Land MD, ph.d. Onkologisk Afd. R. OUH Kræft og komorbiditet  alle skal
More informationProbability Calculator
Chapter 95 Introduction Most statisticians have a set of probability tables that they refer to in doing their statistical wor. This procedure provides you with a set of electronic statistical tables that
More informationI L L I N O I S UNIVERSITY OF ILLINOIS AT URBANACHAMPAIGN
Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANACHAMPAIGN Linear Algebra Slide 1 of
More information11. Analysis of Casecontrol Studies Logistic Regression
Research methods II 113 11. Analysis of Casecontrol Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
More informationMultiply imputing missing values in data sets with. generalised linear models
Multiply imputing missing values in data sets with mixed measurement scales using a sequence of generalised linear models Min Lee Robin Mitra School of Mathematics University of Southampton, Southampton,
More informationPortfolio Using Queuing Theory
Modeling the Number of Insured Households in an Insurance Portfolio Using Queuing Theory JeanPhilippe Boucher and Guillaume CouturePiché December 8, 2015 Quantact / Département de mathématiques, UQAM.
More informationAn Introduction to Generalized Linear Mixed Models Using SAS PROC GLIMMIX
An Introduction to Generalized Linear Mixed Models Using SAS PROC GLIMMIX Phil Gibbs Advanced Analytics Manager SAS Technical Support November 22, 2008 UC Riverside What We Will Cover Today What is PROC
More informationMultilevel Modelling of medical data
Statistics in Medicine(00). To appear. Multilevel Modelling of medical data By Harvey Goldstein William Browne And Jon Rasbash Institute of Education, University of London 1 Summary This tutorial presents
More informationGoodness of fit assessment of item response theory models
Goodness of fit assessment of item response theory models Alberto Maydeu Olivares University of Barcelona Madrid November 1, 014 Outline Introduction Overall goodness of fit testing Two examples Assessing
More informationMissing values in data analysis: Ignore or Impute?
ORIGINAL ARTICLE Missing values in data analysis: Ignore or Impute? Ng Chong Guan 1, Muhamad Saiful Bahri Yusoff 2 1 Department of Psychological Medicine, Faculty of Medicine, University Malaya 2 Medical
More informationLecture 3: Linear methods for classification
Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,
More informationMaster programme in Statistics
Master programme in Statistics Björn Holmquist 1 1 Department of Statistics Lund University Cramérsällskapets årskonferens, 20100325 Master programme Vad är ett Master programme? Breddmaster vs Djupmaster
More information