Relative survival an introduction and recent developments



Similar documents
Competing-risks regression

Are Patients Diagnosed With Breast Cancer Before Age 50 Years Ever Cured? Hermann Brenner and Timo Hakulinen

Introduction. Survival Analysis. Censoring. Plan of Talk

Missing data and net survival analysis Bernard Rachet

Finnish Cancer Registry Institute for Statistical and Epidemiological Cancer Research. Survival ratios of cancer patients by area in Finland

Measures of Prognosis. Sukon Kanchanaraksa, PhD Johns Hopkins University

III. INTRODUCTION TO LOGISTIC REGRESSION. a) Example: APACHE II Score and Mortality in Sepsis

Tips for surviving the analysis of survival data. Philip Twumasi-Ankrah, PhD

13. Poisson Regression Analysis

Topic 3 - Survival Analysis

Modelling spousal mortality dependence: evidence of heterogeneities and implications

Travel Distance to Healthcare Centers is Associated with Advanced Colon Cancer at Presentation

Lecture 2 ESTIMATING THE SURVIVAL FUNCTION. One-sample nonparametric methods

Fraction of normal remaining life span: a new method for expressing survival in cancer

Mortality Assessment Technology: A New Tool for Life Insurance Underwriting

Section 8» Incidence, Mortality, Survival and Prevalence

SUMAN DUVVURU STAT 567 PROJECT REPORT

11. Analysis of Case-control Studies Logistic Regression

Multinomial and Ordinal Logistic Regression

An Application of the G-formula to Asbestos and Lung Cancer. Stephen R. Cole. Epidemiology, UNC Chapel Hill. Slides:

Module 14: Missing Data Stata Practical

Hormones and cardiovascular disease, what the Danish Nurse Cohort learned us

Journal of Statistical Software

LOGISTIC REGRESSION ANALYSIS

Incorrect Analyses of Radiation and Mesothelioma in the U.S. Transuranium and Uranium Registries Joey Zhou, Ph.D.

Case-control studies. Alfredo Morabia

The American Cancer Society Cancer Prevention Study I: 12-Year Followup

Table 16a Multiple Myeloma Average Annual Number of Cancer Cases and Age-Adjusted Incidence Rates* for

Temporal Trends in Demographics and Overall Survival of Non Small-Cell Lung Cancer Patients at Moffitt Cancer Center From 1986 to 2008

Singapore Cancer Registry Annual Registry Report Trends in Cancer Incidence in Singapore National Registry of Diseases Office (NRDO)

Cancer in Primary Care: Prostate Cancer Screening. How and How often? Should we and in which patients?

Indices of Morbidity and Mortality. Sukon Kanchanaraksa, PhD Johns Hopkins University

Number. Source: Vital Records, M CDPH

Prostate cancer statistics

Prevalence odds ratio or prevalence ratio in the analysis of cross sectional data: what is to be done?

Introduction to Event History Analysis DUSTIN BROWN POPULATION RESEARCH CENTER

Analysis of Population Cancer Risk Factors in National Information System SVOD

Guide to Biostatistics

Effect of Risk and Prognosis Factors on Breast Cancer Survival: Study of a Large Dataset with a Long Term Follow-up

Institut für Soziologie Eberhard Karls Universität Tübingen

Early mortality rate (EMR) in Acute Myeloid Leukemia (AML)

Handling missing data in Stata a whirlwind tour

Butler Memorial Hospital Community Health Needs Assessment 2013

Chapter 2: Health in Wales and the United Kingdom

Multivariate Logistic Regression

Unit 12 Logistic Regression Supplementary Chapter 14 in IPS On CD (Chap 16, 5th ed.)

SAS Software to Fit the Generalized Linear Model

How to set the main menu of STATA to default factory settings standards

Discussion Section 4 ECON 139/ Summer Term II

Exercise Answers. Exercise B 2. C 3. A 4. B 5. A

Published Ahead of Print on October 3, 2011 as /JCO J Clin Oncol by American Society of Clinical Oncology INTRODUCTION

Skin Cancer: The Facts. Slide content provided by Loraine Marrett, Senior Epidemiologist, Division of Preventive Oncology, Cancer Care Ontario.

PSA Testing 101. Stanley H. Weiss, MD. Professor, UMDNJ-New Jersey Medical School. Director & PI, Essex County Cancer Coalition. weiss@umdnj.

especially with continuous

4/8/13. Pre-test Audience Response. Prostate Cancer Screening and Treatment of Prostate Cancer: The 2013 Perspective

Web portal for information on cancer epidemiology in the Czech Republic

Cancer in Ireland 2013: Annual report of the National Cancer Registry

Likelihood of Cancer

Duration Analysis. Econometric Analysis. Dr. Keshab Bhattarai. April 4, Hull Univ. Business School

From the help desk: hurdle models

Generalized Linear Models

Epidemiology, Staging and Treatment of Lung Cancer. Mark A. Socinski, MD

PSA Screening and the USPSTF Understanding the Controversy

Pricing the Critical Illness Risk: The Continuous Challenge.

SAS and R calculations for cause specific hazard ratios in a competing risks analysis with time dependent covariates

The Landmark Approach: An Introduction and Application to Dynamic Prediction in Competing Risks

Time varying (or time-dependent) covariates

Methods for Meta-analysis in Medical Research

VI. Introduction to Logistic Regression

National Life Tables, United Kingdom:

CHILDHOOD CANCER INCIDENCE UPDATE:

General and Abdominal Adiposity and Risk of Death in Europe

7.1 The Hazard and Survival Functions

ANNEX 2: Assessment of the 7 points agreed by WATCH as meriting attention (cover paper, paragraph 9, bullet points) by Andy Darnton, HSE

Cancer research in the Midland Region the prostate and bowel cancer projects

Big data size isn t enough! Irene Petersen, PhD Primary Care & Population Health

MULTIPLE REGRESSION EXAMPLE

R 2 -type Curves for Dynamic Predictions from Joint Longitudinal-Survival Models

Statistics fact sheet

The first three steps in a logistic regression analysis with examples in IBM SPSS. Steve Simon P.Mean Consulting

La sopravvivenza dei tumori maligni nei confronti internazionali: anticipazione della diagnosi o modifica del decorso naturale della malattia?

Alabama s Rural and Urban Counties

CHILDHOOD CANCER SURVIVOR STUDY Analysis Concept Proposal

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

Cancer Survival - How Long Do People Survive?

Lecture 1: Introduction to Epidemiology

Southern NSW Local Health District: Our Population s Health

Adequacy of Biomath. Models. Empirical Modeling Tools. Bayesian Modeling. Model Uncertainty / Selection

Basic Study Designs in Analytical Epidemiology For Observational Studies

Randomized trials versus observational studies

Biostatistics and Epidemiology within the Paradigm of Public Health. Sukon Kanchanaraksa, PhD Marie Diener-West, PhD Johns Hopkins University

Summary Measures (Ratio, Proportion, Rate) Marie Diener-West, PhD Johns Hopkins University

Comments on the RF fields epidemiology section pages in SCENIHR approved at the 4 th plenary of 12 December 2013

Quantifying Life expectancy in people with Type 2 diabetes

Critical Illness insurance AC04 diagnosis rates

Development of the nomolog program and its evolution

Munich Cancer Registry

Dealing with Missing Data

Supplementary Online Content

Using Stata for Categorical Data Analysis

Transcription:

Relative survival an introduction and recent developments Paul W. Dickman Department of Medical Epidemiology and Biostatistics Karolinska Institutet, Stockholm, Sweden paul.dickman@ki.se 11 December 2008 EpiStat, Gothenburg, 11 December 2008 Outline Introduction to relative survival and why it is often preferred over cause-specific survival for the study of cancer patient survival using data collected by population-based cancer registries. Is relative survival a useful measure for conditions other than cancer? Modelling relative survival. Poisson regression. Flexible parametric models. Cure models. Estimating survival in the presence of competing risks. EpiStat, Gothenburg, 11 December 2008 1 Introduction to relative survival Interest is typically in net mortality (mortality associated with a diagnosis of cancer) [1]. Cause-specific mortality is often used to estimate net mortality only those deaths which can be attributed to the cancer in question are considered to be events. cause-specific mortality = number of deaths due to cancer person time at risk EpiStat, Gothenburg, 11 December 2008 2

Potential disadvantages of cause-specific survival Using cause-specific mortality requires that reliably coded information on cause of death is available. Even when cause of death information is available to the cancer registry via death certificates, it is often vague and difficult to determine whether or not cancer is the primary cause of death. How do we classify, for example, deaths due to treatment complications? Consider a man diagnosed with prostate cancer and treated with estrogen who dies following a myocardial infarction. Do we classify this death as due entirely to prostate cancer or due entirely to other causes? Welch et al. [2] studied deaths among surgically treated cancer patients that occurred within one month of diagnosis. They found that 41% of deaths were not attributed to the coded cancer. EpiStat, Gothenburg, 11 December 2008 3 Relative survival Can instead estimate excess mortality: the difference between observed (all-cause) and expected mortality. excess = observed expected mortality mortality mortality Relative survival is the survival analog of excess mortality the relative survival ratio is defined as the observed survival in the patient group divided by the expected survival of a comparable group from the general population. It is usual to estimate the expected survival proportion from nationwide (or statewide) population life tables stratified by age, sex, calendar time, and, where applicable, race [3]. Although these tables include the effect of deaths due to the cancer being studied, Ederer et al. [4] showed that this does not, in practice, affect the estimated survival proportions. EpiStat, Gothenburg, 11 December 2008 4 A major advantage of relative survival (excess mortality) is that information on cause of death is not required, thereby circumventing problems with the inaccuracy [5] or nonavailability of death certificates. We obtain a measure of the excess mortality experienced by patients diagnosed with cancer, irrespective of whether the excess mortality is directly or indirectly attributable to the cancer. Deaths due to treatment complications or suicide are examples of deaths which may be considered indirectly attributable to cancer. EpiStat, Gothenburg, 11 December 2008 5

Cervical cancer diagnosed in New Zealand 1994 2001 Life table estimates of patient survival Women diagnosed 1994-2001 with follow-up to the end of 2002 Interval- Interval- Effective specific Cumulative Cumulative specific Cumulative number observed observed expected relative relative I N D W at risk survival survival survival survival survival 1 1559 209 0 1559.0 0.86594 0.86594 0.98996 0.87472 0.87472 2 1350 125 177 1261.5 0.90091 0.78014 0.98192 0.90829 0.79450 3 1048 58 172 962.0 0.93971 0.73310 0.97362 0.94772 0.75296 4 818 32 155 740.5 0.95679 0.70142 0.96574 0.96459 0.72630 5 631 23 148 557.0 0.95871 0.67246 0.95766 0.96679 0.70218 6 460 10 130 395.0 0.97468 0.65543 0.94972 0.98284 0.69013 7 320 5 129 255.5 0.98043 0.64261 0.94198 0.98848 0.68219 8 186 3 134 119.0 0.97479 0.62641 0.93312 0.98405 0.67130 9 49 1 48 25.0 0.96000 0.60135 0.91869 0.97508 0.65457 EpiStat, Gothenburg, 11 December 2008 6 Issues with relative survival The central issue in estimating relative survival is defining a comparable group from the general population and estimating expected survival. If not all of the excess mortality is due to the cancer then the relative survival ratio will underestimate net survival (overestimate excess mortality). For example, patients diagnosed with smoking-related cancers will experience excess mortality, compared to the general population, due to both the cancer and other smoking related conditions. Should the patients be a selected group from the general population, for example, with respect to social class, the national population might not be an appropriate comparison group. EpiStat, Gothenburg, 11 December 2008 7 Statistical cure The life table is a useful tool for describing the survival experience of the patients over a long follow-up period. In particular, an interval-specific relative survival ratio equal to one indicates that, during the specified interval, mortality in the patient group was equivalent to that of the general population. The attainment and maintenance of an interval-specific RSR of one indicates that there is no excess mortality due to cancer and the patients are assumed to be statistically cured. An individual is considered to be medically cured if he or she no longer displays symptoms of the disease. Statistical cure applies at a group, rather than individual, level. EpiStat, Gothenburg, 11 December 2008 8

r 1.1 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 Cancer of the stomach 1 2 3 4 5 6 7 8 9 10 Annual follow-up interval Figure 1: Plots of the annual (interval-specific) relative survival ratios (r) for males and females diagnosed with cancer of the stomach in Finland 1985 1994 and followed up to the end of 1995. EpiStat, Gothenburg, 11 December 2008 9 r 1.1 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 Cancer of the breast 1 2 3 4 5 6 7 8 9 10 Annual follow-up interval Figure 2: Plots of the annual (interval-specific) relative survival ratios (r) for females diagnosed with cancer of the breast in Finland 1985 1994 and followed up to the end of 1995. EpiStat, Gothenburg, 11 December 2008 10 Plots of the interval-specific RSR are also useful for assessing the quality of follow-up. If the interval-specific RSR levels out at a value greater than 1, this generally indicates that some deaths have been missed in the follow-up process. An interval-specific relative survival ratio of unity is generally not achieved for smoking-related cancers, such as cancer of the lung and kidney. Compared to the general population, these patients are subject to excess mortality due to the cancer in addition to excess mortality due to other conditions caused by smoking, such as cardiovascular disease. We ll return to these concepts later when we discuss cure models. EpiStat, Gothenburg, 11 December 2008 11

Interpreting relative survival estimates The cumulative relative survival ratio can be interpreted as the proportion of patients alive after i years of follow-up in the hypothetical situation where the cancer in question is the only possible cause of death. 1-RSR can be interpreted as the proportion of patients who will die of cancer within i years of follow-up in the hypothetical situation where the cancer in question is the only possible cause of death. We do not live in this hypothetical world. Estimates of the proportion of patients who will die of cancer in the presence of competing risks can also be made. Cronin and Feuer (2000) [6] extended the theory of competing risks to relative survival; their method is implemented in our Stata command strs. EpiStat, Gothenburg, 11 December 2008 12 Cumulative probability of death in men with localized prostate cancer over the age of 70. EpiStat, Gothenburg, 11 December 2008 13 Cause-specific probability of death in women with regional breast cancer EpiStat, Gothenburg, 11 December 2008 14

Estimating relative survival using a period approach In 1996 Hermann Brenner suggested estimating cancer patient survival using a period, rather than cohort, approach [7]. Time at risk is left truncated at the start of the period window and right censored at the end. This suggestion was initially met with scepticism although studies based on historical data [8] have shown that period analysis provides very good predictions of the prognosis of newly diagnosed patients; and highlights temporal trends in patient survival sooner than cohort methods. EpiStat, Gothenburg, 11 December 2008 15 Subject 1 Diagnosis Death or Censoring Start and Stop at Risk Times Standard Period (0, 2) Subject 2 (0, 4) Subject 3 Subject 4 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 Year (0, 6) (0, 3) EpiStat, Gothenburg, 11 December 2008 16 Subject 1 Diagnosis Death or Censoring Start and Stop at Risk Times Standard Period (0, 2) Subject 2 (0, 4) Subject 3 Subject 4 Period of Interest 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 Year (0, 6) (0, 3) EpiStat, Gothenburg, 11 December 2008 17

Subject 1 Diagnosis Death or Censoring Start and Stop at Risk Times Standard Period (0, 2) (0, 2) Subject 2 (0, 4) (2, 4) Subject 3 Subject 4 1995 1996 1997 1998 1999 2000 2001 2002 Period of Interest Year 2003 2004 2005 (0, 6) (0, 3) (5, 6) (, ) EpiStat, Gothenburg, 11 December 2008 18 Cervical cancer diagnosed in New Zealand 1994 2001 Life table estimates of patient survival Women diagnosed 1994-2001 with follow-up to the end of 2002 Interval- Interval- Effective specific Cumulative Cumulative specific Cumulative number observed observed expected relative relative I N D W at risk survival survival survival survival survival 1 1559 209 0 1559.0 0.86594 0.86594 0.98996 0.87472 0.87472 2 1350 125 177 1261.5 0.90091 0.78014 0.98192 0.90829 0.79450 3 1048 58 172 962.0 0.93971 0.73310 0.97362 0.94772 0.75296 4 818 32 155 740.5 0.95679 0.70142 0.96574 0.96459 0.72630 5 631 23 148 557.0 0.95871 0.67246 0.95766 0.96679 0.70218 6 460 10 130 395.0 0.97468 0.65543 0.94972 0.98284 0.69013 7 320 5 129 255.5 0.98043 0.64261 0.94198 0.98848 0.68219 8 186 3 134 119.0 0.97479 0.62641 0.93312 0.98405 0.67130 9 49 1 48 25.0 0.96000 0.60135 0.91869 0.97508 0.65457 EpiStat, Gothenburg, 11 December 2008 19 Age-standardisation of relative survival This is used primarily when interest is in obtaining estimates of relative survival for descriptive purposes and is of less interest when focus is on modelling. The problem is more complex than age-standardisation of, for example, incidence rates since the age-distribution of the patients changes during follow-up. Which weights do we use? See the papers by Brenner et al. [9, 10, 11] EpiStat, Gothenburg, 11 December 2008 20

EpiStat, Gothenburg, 11 December 2008 21 Applying relative survival to diseases other than cancer In order to interpret excess mortality as mortality due to the disease of interest we need to accurately estimate expected mortality (the mortality that would have been observed in the absence of the disease). General population mortality rates may not satisfy this criteria. Excess mortality (compared to the general population) may nevertheless still be of interest. Recent applications in cardiovascular disease[12] and HIV/AIDS[13]. Nelson et al. Relative survival: what can cardiovascular disease learn from cancer? Eur Heart J. 2008;29:941-7. Bhaskaran et al. Changes in the risk of death after HIV seroconversion compared with mortality in the general population. JAMA 2008;300:51-9. EpiStat, Gothenburg, 11 December 2008 22 Overview of approaches to modelling prognosis of cancer patients Cox proportional hazards model for cause-specific mortality Poisson regression model for cause-specific mortality Similarity of Poisson regression and Cox regression Poisson regression for excess mortality Alternative approaches to modelling excess mortality Fine splitting and modelling the baseline hazard using splines or fractional polynomials Flexible parametric models Cure models for relative survival EpiStat, Gothenburg, 11 December 2008 23

Example: survival of patients diagnosed with colon carcinoma in Finland Patients diagnosed with colon carcinoma in Finland 1984 95. Potential follow-up to end of 1995; censored after 10 years. Outcome is death due to colon carcinoma (i.e., cause-specific mortality). Interest is in the effect of clinical stage at diagnosis (distant metastases vs no distant metastases). How might we specify a statistical model for these data? EpiStat, Gothenburg, 11 December 2008 24 Smoothed empirical hazards (cancer specific mortality rates) sts graph, by(distant) hazard 0.2.4.6.8 distant = 0 distant = 1 0 2 4 6 8 10 Time since diagnosis in years EpiStat, Gothenburg, 11 December 2008 25 The Cox proportional hazards model The intercept in the Cox model [14], the hazard (event rate) for individuals with all covariates z at the reference level, can be thought of as an arbitrary function of time 1, often called the baseline hazard and denoted by λ 0 (t). The hazard at time t for individual with other covariate values is a multiple of the baseline λ(t x) =λ 0 (t)exp(β x). Alternatively ln[λ(t x)] = ln[λ 0 (t)] + β x. 1 time t can be defined in many ways, e.g., attained age, time-on-study, calendar time, etc. EpiStat, Gothenburg, 11 December 2008 26

Smoothed empirical hazards on log scale sts graph, by(distant) hazard yscale(log).2.4.6.8 distant = 0 distant = 1 0 2 4 6 8 10 Time since diagnosis in years EpiStat, Gothenburg, 11 December 2008 27 Cox model to estimate the cause-specific mortality rate ratio. stcox distant failure _d: status == 1 analysis time _t: (exit-origin)/365.25 origin: time dx note: trim>10 trimmed No. of subjects = 13208 Number of obs = 50666 No. of failures = 7122 Time at risk = 44013.26215 LR chi2(1) = 5544.65 Log likelihood = -61651.446 Prob > chi2 = 0.0000 ---------------------------------------------------------------------- _t Haz. Ratio Std. Err. z P> z [95% Conf. Interval] --------+------------------------------------------------------------- distant 6.557777.1689328 73.00 0.000 6.234895 6.897381 ---------------------------------------------------------------------- EpiStat, Gothenburg, 11 December 2008 28 Fitted hazards from Cox model stcurve, hazard at1(distant=0) at2(distant=1) Smoothed hazard function 0.2.4.6.8 distant=0 distant=1 0 2 4 6 8 10 Time since diagnosis in years EpiStat, Gothenburg, 11 December 2008 29

Fitted hazards from Cox model on log scale stcurve, hazard at1(distant=0) at2(distant=1) yscale(log).2.4.6.8 Smoothed hazard function distant=0 distant=1 0 2 4 6 8 10 Time since diagnosis in years EpiStat, Gothenburg, 11 December 2008 30 Regression models commonly applied in epidemiology Linear regression Logistic regression Poisson regression μ = β 0 + β 1 X ( ) π ln = β 0 + β 1 X 1 π ln(λ) =β 0 + β 1 X In each case β 1 is the effect per unit of X, measured as a change in the mean (linear regression); the change in the log odds (logistic regression); the change in the log rate (Poisson regression). Cox model ln[λ(t)] = ln[λ 0 (t)] + β 1 X EpiStat, Gothenburg, 11 December 2008 31 Fitted values: Poisson regression, cause-specific mortality Mortality rate per 1000 person years 0 200 400 600 0 2 4 6 8 10 Time since diagnosis in years No distant mets Distant mets EpiStat, Gothenburg, 11 December 2008 32

Estimated effect of distant metastases while controlling for time since diagnosis. xi: glm dead i.fu distant, family(poisson) lnoff(risktime) eform -------------------------------------------------------------------------- OIM dead IRR Std. Err. z P> z [95% Conf. Interval] ---------+---------------------------------------------------------------- _Ifu_1.4551995.0100359-35.70 0.000.4359484.4753006 _Ifu_2.2660856.0076698-45.93 0.000.2514698.2815508 _Ifu_3.1763302.0063845-47.93 0.000.1642505.1892983 _Ifu_4.1190439.0054356-46.61 0.000.108853.1301888 _Ifu_5.0751727.0045158-43.08 0.000.0668231.0845656 _Ifu_6.0466152.0037402-38.21 0.000.0398319.0545538 _Ifu_7.0269519.0030126-32.33 0.000.0216493.0335532 _Ifu_8.0183221.002654-27.61 0.000.0137936.0243373 _Ifu_9.0116515.0022895-22.66 0.000.0079273.0171253 distant 6.504972.112855 107.93 0.000 6.287499 6.729967 risktime (exposure) -------------------------------------------------------------------------- EpiStat, Gothenburg, 11 December 2008 33. stcox distant Recall the estimate from Cox model No. of subjects = 13208 Number of obs = 50666 No. of failures = 7122 Time at risk = 44013.26215 LR chi2(1) = 5544.65 Log likelihood = -61651.446 Prob > chi2 = 0.0000 ---------------------------------------------------------------------- _t Haz. Ratio Std. Err. z P> z [95% Conf. Interval] --------+------------------------------------------------------------- distant 6.557777.1689328 73.00 0.000 6.234895 6.897381 ---------------------------------------------------------------------- EpiStat, Gothenburg, 11 December 2008 34 Modelling excess mortality (relative survival) Thehazardattimesincediagnosist for persons diagnosed with cancer is modelled as the sum of the known baseline hazard, λ (t), and the excess hazard due to a diagnosis of cancer, ν(t) [15, 16, 17, 18, 19]. λ(t) =λ (t)+ν(t) It is common to assume that the excess hazards are piecewise constant and proportional. Provides estimates of relative excess risk. The model can be estimated in the framework of generalised linear models using standard statistical software (e.g., SAS, Stata, R) [15]. Non-proportional excess hazards are common but can be incorporated by introducing follow-up time by covariate interaction terms. EpiStat, Gothenburg, 11 December 2008 35

Modelling excess mortality using Poisson regression The model can be written as ln(μ j d j)=ln(y j )+xβ, (1) where μ j = E(d j ), d j the expected number of deaths, and y j person-time. This implies a generalised linear model with outcome d j, Poisson error structure, link ln(μ j d j ), and offset ln(y j). Such models have previously been described by Breslow and Day (1987) [20, pp. 173 176] and Berry (1983) [18]. The usual regression diagnostics (residuals, influence statistics) and method for assessing model fit for generalised linear models can be utilised. EpiStat, Gothenburg, 11 December 2008 36 Poisson regression for the colon carcinoma data When we stset the data we specify all deaths as events.. stset exit, fail(status==1 2) origin(dx) scale(365.25) id(id) We use strs to estimate relative survival for each combination of relevant predictor variables and save the results to a file.. strs using popmort, br(0(1)10) mergeby(_year sex _age) > by(sex distant agegrp year8594) notables save(replace) We then fit the Poisson regression model using the resulting output file (which contains the observed (d) and expected (d_star) numbers of deaths for each life table interval along with person-time at risk (y)). EpiStat, Gothenburg, 11 December 2008 37. use grouped, clear. xi: glm d i.end distant, fam(pois) link(rs d_star) lnoffset(y) eform -------------------------------------------------------------------------- d ExpB Std. Err. z P> z [95% Conf. Interval] ---------+---------------------------------------------------------------- _Iend_2.6485116.0222687-12.61 0.000.606302.6936598 _Iend_3.372179.0202302-18.18 0.000.3345676.4140186 _Iend_4.2468263.0196903-17.54 0.000.2110997.2885992 _Iend_5.2160604.0210312-15.74 0.000.1785334.2614753 _Iend_6.1505581.0215428-13.23 0.000.1137389.1992964 _Iend_7.0773745.0191536-10.34 0.000.0476308.1256921 _Iend_8.0628595.0191633-9.08 0.000.0345839.114253 _Iend_9.0333979.018285-6.21 0.000.0114208.0976658 _Iend_10.0728102.0235883-8.09 0.000.0385859.1373902 distant 8.18794.2588833 66.50 0.000 7.69594 8.711393 -------------------------------------------------------------------------- We estimate that excess mortality is 8.2 times higher for patients with distant metastases at diagnosis compared to patients without distant metastases at diagnosis. EpiStat, Gothenburg, 11 December 2008 38

Fitted values from the excess mortality model Excess mortality rate per 1000 person years 0 200 400 600 800 1000 0 2 4 6 8 10 Time since diagnosis in years No distant mets Distant mets EpiStat, Gothenburg, 11 December 2008 38 Can adjust for additional variables.. xi: glm d i.end sex i.agegrp year8594 distant, fam(pois) > link(rs d_star) lnoffset(y) eform i.end _Iend_1-10 (naturally coded; _Iend_1 omitted) i.agegrp _Iagegrp_0-3 (naturally coded; _Iagegrp_0 omitted) ------------------------------------------------------------------------ d ExpB Std. Err. z P> z [95% Conf. Interval] -----------+------------------------------------------------------------ _Iend_2.6582263.022388-12.30 0.000.6157772.7036016 [output omitted] _Iend_10.07477.0243443-7.97 0.000.0394989.1415367 sex.9878062.0272241-0.45 0.656.9358634 1.042632 _Iagegrp_1 1.046824.0680002 0.70 0.481.9216818 1.188959 _Iagegrp_2 1.17649.070505 2.71 0.007 1.046109 1.32312 _Iagegrp_3 1.549778.0950402 7.14 0.000 1.374262 1.74771 year8594.8909376.0238832-4.31 0.000.8453358.9389994 distant 8.008541.2490144 66.91 0.000 7.535056 8.511779 ------------------------------------------------------------------------ EpiStat, Gothenburg, 11 December 2008 39 References [1] Dickman PW, Adami HO. Interpreting trends in cancer patient survival. Journal of Internal Medicine 2006;260:103 17. [2] Welch HG, Black WC. Are deaths within 1 month of cancer-directed surgery attributed to cancer? J Natl Cancer Inst 2002;94:1066 70. [3] Berkson J, Gage RP. Calculation of survival rates for cancer. Proceedings of Staff Meetings of the Mayo Clinic 1950;25:270 286. [4] Ederer F, Axtell LM, Cutler SJ. The relative survival rate: A statistical methodology. National Cancer Institute Monograph 1961;6:101 121. [5] Percy CL, Stanek E, Gloeckler L. Accuracy of cancer death certificates and its effect on cancer mortality statistics. American Journal of Public Health 1981;71:242 250. [6] Cronin K, Feuer E. Cumulative cause-specific mortality for cancer patients in the presence of other causes: a crude analogue of relative survival. Stat Med Jul 2000;19:1729 40. [7] Brenner H, Gefeller O. An alternative approach to monitoring cancer patient survival. Cancer 1996;78:2004 2010. [8] Brenner H, Gefeller O, Hakulinen T. Period analysis for up-to-date cancer survival data: theory, empirical evaluation, computational realisation and applications. European Journal of Cancer 2004;40:326 35. EpiStat, Gothenburg, 11 December 2008 40

[9] Brenner H, Arndt V, Gefeller O, Hakulinen T. An alternative approach to age adjustment of cancer survival rates. Eur J Cancer 2004;40:2317 2322. [10] Brenner H, Hakulinen T. Age adjustment of cancer survival rates: methods, point estimates and standard errors. Br J Cancer 2005;93:372 375. [11] Brenner H, Hakulinen T. On crude and age-adjusted relative survival rates. J Clin Epidemiol 2003;56:1185 91. [12] Nelson CP, Lambert PC, Squire IB, Jones DR. Relative survival: what can cardiovascular disease learn from cancer? Eur Heart J Apr 2008;29:941 947. [13] Bhaskaran K, Hamouda O, Sannes M, Boufassa F, Johnson AM, Lambert PC, et al.. Changes in the risk of death after HIV seroconversion compared with mortality in the general population. JAMA Jul 2008;300:51 59. [14] Cox DR. Regression models and life tables (with discussion). Journal of the Royal Statistical Society Series B 1972;34:187 220. [15] Dickman PW, Sloggett A, Hills M, Hakulinen T. Regression models for relative survival. Stat Med 2004;23:51 64. [16] Estève J, Benhamou E, Croasdale M, Raymond L. Relative survival and the estimation of net survival: Elements for further discussion. Statistics in Medicine 1990;9:529 538. [17] Hakulinen T, Tenkanen L. Regression analysis of relative survival rates. Applied Statistics 1987;36:309 317. EpiStat, Gothenburg, 11 December 2008 41 [18] Berry G. The analysis of mortality by the subject-years method. Biometrics 1983; 39:173 184. [19] Pocock S, Gore S, Kerr G. Long term survival analysis: the curability of breast cancer. Stat Med 1982;1:93 104. [20] Breslow NE, Day NE. Statistical Methods in Cancer Research: Volume II - The Design and Analysis of Cohort Studies. IARC Scientific Publications No. 82. Lyon: IARC, 1987. EpiStat, Gothenburg, 11 December 2008 42