De ntons and Censorng. Survval Analyss We begn by consderng smple analyses but we wll lead up to and take a look at regresson on explanatory factors., as n lnear regresson part A. The mportant d erence between survval analyss and other statstcal analyses whch you have so far encountered s the presence of censorng. Ths actually renders the survval functon of more mportance n wrtng down the models. We begn wth a remnder of some de ntons. T denotes the postve random varable representng tme to event of nterest. Cumulatve Dstrbuton functon s F (t) = Pr(T t) wth probablty densty functon f(t) = F 0 (t): Survval functon s S(t) = P (T > t) = F (t) Note: we use S(t) = F (t) throughout. Hazard functon Pr(t T < t + tjt t) h(t) = lm t!0 t {If T s dscrete and postve nteger-valued then h(t) = Pr(T = tjt t) = Pr(T = t)=s(t ):g Cumulatve hazard functon H(t) = Z t 0 h(s)ds We have the followng relatons between these functons: () h(t) = S(t) S(t + t) lm t!0 ts(t) = S 0 (t) S(t) = d (log S) dt () () S(t) = exp( H(t)); snce S(0) = f(t) = h(t)s(t)
.2 Censorng and truncaton Rght censorng occurs when a subject leaves the study before an event occurs, or the study ends before the event has occurred. For example, we consder patents n a clncal tral to study the e ect of treatments on stroke occurrence. The study ends after 5 years. Those patents who have had no strokes by the end of the year are censored. If the patent leaves the study at tme t e ; then the event occurs n (t e ; ) : Left censorng s when the event of nterest has already occurred before enrolment. Ths s very rarely encountered. Truncaton s delberate and due to study desgn. Rght truncaton occurs when the entre study populaton has already experenced the event of nterest (for example: a hstorcal survey of patents on a cancer regstry). Left truncaton occurs when the subjects have been at rsk before enterng the study (for example: lfe nsurance polcy holders where the study starts on a xed date, event of nterest s age at death). Generally we deal wth rght censorng & sometmes left truncaton. Two types of ndependent rght censorng: Type I : completely random dropout (eg emgraton) and/or xed tme of end of study no event havng occurred. Type II: study ends when a xed number of events amongst the subjects has occurred..3 Lkelhood and Censorng If the censorng mechansm s ndependent of the event process, then we have an easy way of dealng wth t. Suppose that T s the tme to event and that C s the tme to the censorng event. Assume that all subjects may have an event or be censored, say for subject one of a par of observatons et ; ec may be observed: Then snce we observe the mnmum tme we would have the followng expresson for the lkelhood (usng ndependence) L = Y Y f(et )S C (et ) S(ec )f C (ec ) et < ec ec <et Now de ne the followng random varable: f T < C = 0 f T > C For each subject we observe t = mn et ; ec and ; observatons from a contnuous random varable and a bnary random varable. In terms of these L becomes L = Y h(t ) S(t ) Y h C (t ) S C (t ) 2
where we have used densty = hazard survval functon. NB If the censorng mechansm s ndependent (sometmes called non-nformatve) then we can gnore the second product on the rght as t gves us no nformaton about the event tme. In the remander of the course we wll assume that the censorng mechansm s ndependent..4 Data Demographc v. tral data The tme to event can lterally be the age, eg n a lfe nsurance polcy. In a clncal tral t wll more typcally be tme from admsson to the tral. Sldes show ve patents A, B, C, D, E from a Sydney hosptal plot study, concernng treatment of bladder cancer. Each patent has ther own zero tme, the tme at whch the patent entered the study (accrual tme). For each patent we record tme to event of nterest or censorng tme, whchever s the smaller, and the status, = f the event occurs and = 0 f the patent s censored. 3
2 Non-parametrc estmators Remnder: (nforms the argument below) If there are observatons x ; : : : ; x n from a random sample then we de ne the emprcal dstrbuton functon bf (x) = n # fx : x xg Ths s approprate f no censorng occurs. However f censorng occurs ths has to be taken nto account. We measure the par (X; ) where X = mn(t; C) and s as before f T < C = 0 f T > C Suppose that the observatons are (x ; ) for = ; 2 : : : ; n: L = Y f(x ) S(x ) = Y f(x ) ( F (x )) What follows s a heurstc argument allowng us to nd an estmator for S, the survval functon, whch n the lkelhood sense s the best that we can do. Suppose that there are falure tmes (0 <) < t < : : : < t < : : : : Let s ; s 2 ; ; s c be the censorng tmes wthn the nterval [t ; t + ) and suppose that there are d falures at tme t (allowng for ted falure tmes). Then the lkelhood functon becomes L = Y fal = Y fal f(t ) d Y Yc k= (F (t ) F (t )) d Y ( F (s k )) Yc k=! ( F (s k )) where we wrte f(t ) = F (t ) F (t ); the d erence n the cdf at tme t and the cdf mmedately before t. Snce F (t ) s an ncreasng functon, and assumng that t takes xed values at the falure tme ponts, we make F (t ) and F (s k ) as small as possble n order to maxmse the lkelhood. That means we take F (t ) = F (t ) and F (s k ) = F (t ). Ths maxmses L by consderng the cdf F (t) to be a step functon and therefore to come from a dscrete dstrbuton, wth falure tmes as the actual falure tmes whch occur. Then L = Y Y (F (t ) F (t )) d ( fal F (t )) c! 4
So we have showed that amongst all cdf s wth xed values F (t ) at the falure tmes t ; then the dscrete cdf has the maxmum lkelhood, amongst those wth d falures at t and c censorngs n the nterval [t ; t + ): Let us consder the dscrete case and let Pr (fal at t jsurvved to t ) = h Then Y S (t ) = F (t ) = ( h j ); Y f(t ) = h ( h j ) Fnally we have L = Y t h d ( h n d ) where n s the number at rsk at tme t. Ths s usually referred to as the number n the rsk set. Note n + + c + d = n 5
2. Kaplan-Meer estmator Ths estmator for S(t) uses the mle estmators for h. Takng logs l = X d log h + X (n d ) log( h ) D erentate wth respect to h @l @h = d h n d h = 0 =) b h = d n So the Kaplan-Meer estmator s bs(t) = Y tt d n where n = #fn rsk set at t g; d = #fevents at t g: Note that c = #fcensored n [t ; t + )g: If there are no censored observatons before the rst falure tme then n 0 = n = #fn studyg: Generally we assume t 0 = 0: 2.2 Nelson-Aalen estmator and new estmator of S The Nelson-Aalen estmator for the cumulatve hazard functon s 0 bh(t) = X d @= X bh A n tt tt Ths s natural for a dscrete estmator, as we have smply summed the estmates of the hazards at each tme, nstead of ntegratng, to get the cummulatve hazard. Ths correspondngly gves an estmator of S of the form es(t) = exp bh(t) 0 = exp @ X d n tt It s not d cult to show by comparng the functons x; exp( x) on the nterval 0 x, that S(t) e S(t): b A 6
Invented data set Suppose that we have 0 observatons n the data set wth falure tmes as follows: 2; 5; 5; 6+; 7; 2; 4+; 4+; 4+; 4+ Here + ndcates a censored observaton. Then we can calculate both estmators for S(t) at all tme ponts. It s consdered unsafe to extrapolate much beyond the last tme pont, 4, even wth a large data set. 7
2.3 Con dence Intervals We need to nd con dence ntervals (pontwse) for the estmators of S(t) at each tme pont. We d erentate the log-lkelhood and use lkelhood theory, l = X d log h + X (n d ) log( h ); n @ d erentated twce to nd the Hessan matrx 2 l @h @h j o. Note that snce l s a sum of functons of each ndvdual hazard the Hessan must be dagonal. n The estmators ch ; h c 2 ; : : : ; h c o n are asymptotcally unbased and are asymptotcally jontly normally dstrbuted wth approxmate varance I, where the nformaton matrx s gven by @ 2 l I = E : @h @h j Snce the Hessan s dagonal, the covarances are all asymptotcally zero, and coupled wth asymptotc normalty, ths ensures that all pars h b ; h b j are asymptotcally ndependent. @ 2 l @h 2 = d h 2 + n d ( h ) 2 We use the observed nformaton J and so replace h n the above by ts estmator bh = d n : Hence we have var h b d (n d ) n 3 : 2.3. Establshng Greenwood s formula Remnder: method If the random varaton of Y around s small (for example f s the mean of Y and vary has order n ), we use: Takng expectatons g(y ) g() + (Y )g 0 () + 2 (Y )2 g 00 () + : : : E(g(Y )) = g() + O n var(g(y)) = g 0 () 2 vary + o n Dervaton of Greenwood s formula for var( S(t)) b log S(t) b = X log h b t t 8
But var b h d (n d ) n 3 so that, gven g(h ) = log ( h ) ; and b h P! h g 0 (h ) = ( h ) we have var log h b ( h ) 2 var b h ( d (n d ) d n ) 2 n 3 = d n (n d ) Snce h b ; h b j are asymptotcally ndependent we can put all ths together to get var log bs(t) = X d n (n d ) t t Let Y = log S b and note that we need var e Y the delta-method. Fnally we have Greenwood s formula var bs(t) S(t) b X 2 d n (n d ) : t t e Y 2 vary, agan usng 9
Applyng ths to the same sort of argument to the Nelson-Aalen estmator and ts extenson to the survval functon we also see var b H(t) X t t d (n d ) n 3 and vars(t) e = var exp( H(t) b e H 2 X t t 2 X es(t) t t d (n d ) n 3 d (n d ) n 3 Clearly these estmates are only reasonable f each n s su cently large, snce they rely heavly on asymptotc calculatons. 2.4 Actuaral estmator The actuaral estmator s a further estmator for S(t):It s gven as S (t) = Y d n t t 2 c The ntervals between consecutve falure tmes are usually of constant length, and t s generally used by actuares and demographers followng a cohort from brth to death. Age wll normally be the tme varable and hence the unt of tme s year. 0
3 Models: accelerated lfe model, proportonal hazards model We generally wll have heterogeneous data where parameter estmates wll be dependent on covarates measured for partcpants n a study. For example age or sex may have an e ect on tme to event. A smple example would be where partcpants fall nto two groups such as treatment v. control, smoker v. non-smoker. There are two popular general classes of model as n the headng above - AL and PH. 3. Accelerated Lfe models Suppose there are (several) groups, labelled by ndex : The accelerated lfe model has a survval curve for each group de ned by S (t) = S 0 ( t) where S 0 (t) s some baselne survval curve and s a constant spec c to group. If we plot S aganst log t, = ; 2; : : : ; k, then we expect to see a horzontal shft as S (t) = S 0 (e log +log t ) : 3.. Medans and Quantles Note too that each group has a d erent medan lfetme, snce, f S 0 (m) = 0:5; S ( m ) = S 0 ( m ) = 0:5; gvng a medan for group of m. Smlarly f the 00% quantle of the baselne survval functon s t, then the 00% quantle of group s t. 3.2 Proportonal Hazards models In ths model we assume that the hazards n the varous groups are proportonal so that h (t) = h 0 (t) where h 0 (t) s the baselne hazard. Hence we see that Takng logs twce we get S (t) = S 0 (t) log ( log S (t)) = log + log ( log S 0 (t)) So f we plot the RHS of the above equaton aganst ether t or log t we expect to see a vertcal shft between groups.
3.2. Plots Takng both models together t s clear that we should plot log log S b (t) aganst log t as then we can check for AL and PH n one plot. Generally b S wll be calculated as the Kaplan-Meer estmator for group, and the survval functon estmator for each group wll be plotted on the same graph. () If the accelerated lfe model s plausble we expect to see a horzontal shft between groups. () If the proportonal hazards model s plausble we expect to see a vertcal shft between groups. 3.3 AL parametrc models There are several well-known parametrc models whch have the accelerated lfe property. These models also allow us to take account of contnuous covarates such as blood pressure. Name S(t) h(t) Webull exp( (t) ) t log-logstc t +(t) +(t) log-normal - log t+log exponental exp( t) 2
Densty functon Name f(t) = hs Webull t e (t) log-logstc log-normal t p 2 2 exp t (+(t) ) 2 2 (log t + log ) 2 2 exponental e t Remarks: () Exponental s a submodel of Webull wth = () log-normal s derved from a normal dstrbuton wth mean log and varance 2. In ths dstrbuton = has the same role as n the Webull and log-logstc. () The shape parameter s : The scale parameter s : Shape n the hazard functon h(t) s mportant. Webull h monotonc ncreasng > h monotonc decreasng < log-normal h! 0 as t! 0; ; one mode only log-logstc see problem sheet 5. Comments: a) to get a "bathtub" shape we mght use a mxture of Webull s. Ths gves hgh ntal probablty of an event, a perod of low hazard rate and then ncreasng hazard rate for larger values of t. b) to get an nverted "bathtub" shape we may have a mxture of log-logstcs, or possbly a sngle log-normal or sngle log-logstc. To check for approprate parametrc model (gven AL checked) There are some dstrbutonal ways of testng for say Webull v. log-logstc etc., but they nvolve generalsed F-dstrbutons and are not n general use. We can do a smple test for Webull v. exponental as ths smply means testng a null hypothess =, and the exponental s a sub-model of the Webull model. Hence we can use the lkelhood rato statstc whch nvolves 2 log b L web 2 log b L exp 2 (); asymptotcally. 3.3. Plots for parametrc models However most studes use plots whch gve a rough gude from shape. We should use a straghtlne t as ths s the t whch the human eye spots easly. ) Exponental - S = e t ; plot log S v. log t 2) Webull - S = e (t) ; plot log ( log S) v. log t 3) log-logstc - S = +(t) ; plot see problem sheet 5 3
4) log-normal - S = log t+log, plot ( S) v. log t or equvalently (S) v. log t In each of the above we would estmate S wth the Kaplan-Meer estmator bs(t), and use ths to construct the plots. 3.3.2 Regresson n parametrc AL models In general studes each observaton wll have measured explanatory factors such as age, smokng status, blood pressure and so on. We need to ncorporate these nto a model usng some sort of generalsed regresson. It s usual to do so by makng a functon of the explanatory varables. For each observaton (say ndvdual n a clncal tral) we set the scale parameter = (:x), where :x s a lnear predctor composed of a vector x of known explanatory varables (covarates) and an unknown vector of parameters whch wll be estmated. The most common lnk functon s log = :x, equvalently = e :x. The dea s to mrror ordnary lnear regresson and nd a baselne dstrbuton whch does not depend on ; smlar to lookng at the error term n least squares regresson. To gve a dervaton we wll restrct to the Webull dstrbuton, but smlar arguments work for all AL parametrc models. We have S(t) = e (t) = Pr(T > t) = Pr(log T > log t) = Pr( (log T + log ) > (log t + log )) Now let Y = (log T + log ) and y = (log t + log ) : Hence we have Pr(Y > y) = S Y (y) = S(t) = e (t) = exp( e y ) log T = log + Y; where S Y (y) = exp( e y ) The dstrbuton of Y s ndependent of the parameters and : And n the case of the Webull dstrbuton ts dstrbuton s called the extreme value dstrbuton and s as above. In general we wll wrte log T = log + Y for all AL parametrc models, and Y has a dstrbuton n each case whch s ndependent of the model parameters. 4
Name S(t) Y Webull exp( (t) ) log T = log + Y log-logstc +(t) log T = log + Y log-normal - log t+log as before = ; for the log-normal. log T = Name S Y (y) dstrbuton Webull exp( e y ) extreme value log-logstc +e y logstc dstrbuton log-normal (y) N(0,) log + Y 3.3.3 Wth real data (assumng rght censorng only) Censorng s assumed to be ndependent mechansm and s sometmes referred to as non-nformatve. The shape parameter s assumed to be the same for each observaton n the study. There are often very many covarates measured for each subject n a study. A row of data wll have perhaps:- response - event tme t, status (= f falure, =0 f censored) covarates - age, sex, systolc blood pressure, treatment, and so a mxture of categorcal varables and contnuous varables amongst the covarates. Suppose that Webull s a good t. Then S(t) = e (t) and = e :x :x = b 0 + b x age + b 2 x sex + b 3 x sbp + b 4 x trt where b 0 s the ntercept and all regresson coe cents b are to be estmated, as well as estmatng : Note ths model assumes that s the same for each subject. We have not shown, but could have, nteracton terms such as x age x trt. Ths nteracton would allow a d erent e ect of age accordng to treatment group. Suppose subject j has covarate vector x j and so scale parameter Ths gves a lkelhood j = e :xj : L(; ) = Y j j t j j e ( j t j) = Y j e :xj t j j e (e :x j t j) : 5
We can now look for mle s for and all components of the vector ; gvng estmators b, b together wth ther standard errors ( = p q varb; var b j ) calculated from the observed nformaton matrx (see problem sheet 5). As already noted we can test for = usng 2 log b L web 2 log b L exp 2 (), asymptotcally. Packages allow for Webull, log-logstc and log-normal models, sometmes others. In recent years, a sem-parametrc model has been developed n whch the baselne survval functon S 0 s modelled non-parametrcally, and each subject has tme t scaled to j t. Ths model s beyond the scope of ths course. 6