Duration Analysis Econometric Analysis Dr. Keshab Bhattarai Hull Univ. Business School April 4, 2011 Dr. Bhattarai (Hull Univ. Business School) Duration April 4, 2011 1 / 27
What is Duration Analysis?
Many economic questions ask how long a state will last, given that it has already lasted for some time.
The duration of such an event is a random variable $T$. Duration analysis studies which factors determine the probability that the event ends within a period $t$ ($T \le t$), the probability that it survives beyond $t$ ($T > t$), and the probability of transition, or hazard rate, between $t$ and $t + \Delta$.
Duration models have been used to study the duration or probability of termination of strikes, unemployment, marriage, disaster spells, heart attacks and other spells of illness, the likelihood of bankruptcy of a firm, technological breakthroughs, and the probability of retaining championship titles in sports.
Example of Duration Analysis
The main question is: if an event has lasted so far, how much longer will it last, or what is its rate of survival in the next period?
For instance, the manager of a company would like to know how long a certain machine will last given that it has been running so far.
A life insurance company would be interested in the probability that an individual with a certain medical record or physical characteristics dies before $t + \Delta$ years, given that the person has survived up to $t$ years.
A union leader or a management negotiator would be interested in the probability that a strike is called off given that it has already continued for $t$ periods.
Duration Density
The starting point of duration analysis is the cumulative distribution function of the duration $T$, which gives the probability that a spell starting in the initial state at time 0 ends by period $t$:
\[ F(t) = \Pr(T \le t) = \int_0^t f(s)\,ds \tag{1} \]
More interesting is the survival function:
\[ S(t) = 1 - F(t) = \Pr(T > t) \tag{2} \]
The probability of transition from one state to another (from unemployment to employment, life to death, working condition to breakdown) is given by the hazard rate, or probability of termination.
Survival and Duration
[Figure: Estimated Survival Function. Vertical axis: Survival; horizontal axis: Duration.]
Hazard Rate
The hazard rate is
\[ \lambda(t) = \lim_{\Delta \to 0} \frac{F(t + \Delta) - F(t)}{\Delta \, S(t)} = \frac{f(t)}{S(t)} \tag{3} \]
\[ f(t) = S(t)\,\lambda(t) \tag{4} \]
The hazard function is linked to the survival function as
\[ \lambda(t) = -\frac{d \log[1 - F(t)]}{dt} = \frac{F'(t)}{1 - F(t)} = \frac{f(t)}{S(t)} \tag{5} \]
It is possible to derive the distribution function of duration by integrating the hazard function.
Duration and Survival Functions
It is possible to derive the distribution function of duration by integrating the hazard function. Since $F(0) = 0$,
\[ \int_0^s \lambda(t)\,dt = -\log[1 - F(s)] + \log[1 - F(0)] = -\log[1 - F(s)] \tag{6} \]
\[ F(s) = 1 - \exp\left( -\int_0^s \lambda(t)\,dt \right) \tag{7} \]
Therefore modelling the hazard function is the main element in duration models.
Proportional hazard model:
\[ \lambda(t, x_i) = \lambda_0(t) \exp(x_i'\beta) \tag{8} \]
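The link between the hazard and the distribution function in (6) and (7) can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the function name `cdf_from_hazard`, the trapezoid rule, and the linearly increasing hazard used as a test case are all illustrative assumptions.

```python
import math


def cdf_from_hazard(hazard, s, steps=10_000):
    """Return F(s) = 1 - exp(-integral_0^s hazard(t) dt), using the trapezoid rule."""
    if s <= 0:
        return 0.0
    width = s / steps
    # Trapezoid rule: half weight on the two end points, full weight inside.
    total = 0.5 * (hazard(0.0) + hazard(s))
    for i in range(1, steps):
        total += hazard(i * width)
    integrated_hazard = total * width
    return 1.0 - math.exp(-integrated_hazard)


def linear_hazard(t):
    """Hypothetical hazard lambda(t) = 0.5 * t; its integral up to s is 0.25 * s**2."""
    return 0.5 * t


approx = cdf_from_hazard(linear_hazard, 2.0)
exact = 1.0 - math.exp(-0.25 * 2.0 ** 2)
print(approx, exact)
```

For a constant hazard the same function reproduces the exponential CDF, which is one way to see why the hazard alone is enough to characterise the whole duration distribution.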
Main Points in the Duration Analysis
An important element is modelling duration dependence, which describes how the hazard rate depends on the time already spent in a state.
There is positive duration dependence if the longer the time spent in a given state, the higher the probability of leaving it soon. For instance, the longer a light bulb has worked, the higher the probability that it fails next period.
Negative duration dependence implies that the longer the time spent in a given state, the lower the probability of leaving it soon. For instance, the longer the job search lasts, the less chance an unemployed person has of finding a job.
Absence of duration dependence is observed if duration has no impact on the hazard rate, but this case is less appealing than positive or negative duration dependence.
Main Points in Duration Analysis
Duration dependence: $d\lambda(t)/dt > 0$ indicates positive duration dependence and $d\lambda(t)/dt < 0$ indicates negative duration dependence, whereas $d\lambda(t)/dt = 0$ indicates no duration dependence.
There are a number of ways of modelling the hazard function. The following models are the more popular ones in the literature:
1 exponential
2 Weibull
3 log-normal
4 log-logistic
5 gamma
See Wooldridge (2002, Chapter 20) and Greene (2008, Chapter 25); see also the Dixon-Bihan (2011) paper at http://editorialexpress.com/conference/res2011/program/res2011.html
[Figure: Hazard Functions]
Exponential hazard model
Here $T$ has an exponential distribution, $F(t) = 1 - \exp(-\lambda t)$.
This distribution has no memory: the hazard rate does not depend on duration, it is constant, $\lambda(t) = \lambda$.
The density is $f(t) = \lambda \exp(-\lambda t)$ for $t > 0$.
\[ \lambda(t) = -\frac{d \log[1 - F(t)]}{dt} = -\frac{d \log S(t)}{dt} \tag{9} \]
\[ \ln S(t) = k - \int_0^t \lambda \, ds = k - \lambda t \tag{10} \]
\[ S(t) = K \exp(-\lambda t) \tag{11} \]
where $K = e^{k}$, and $S(0) = 1$ implies $K = 1$.
Estimation of $\lambda$ is simple: the expected duration is $E(t) = 1/\lambda$ and the maximum likelihood estimator of $\lambda$ is $\hat{\lambda} = 1/\bar{t}$.
The integrated hazard function is written as $\Lambda(t) = \int_0^t \lambda(s)\,ds$, so that $S(t) = \exp(-\Lambda(t))$ or $\Lambda(t) = -\ln S(t)$.
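The estimator $\hat{\lambda} = 1/\bar{t}$ can be illustrated with simulated spells. This is a minimal Python sketch under stated assumptions: the data are simulated with a hypothetical true hazard of 0.2, all spells are completed (no censoring), and the function names are illustrative rather than taken from the slides.

```python
import math
import random


def exponential_mle(durations):
    """MLE of the constant hazard with complete spells: 1 / mean duration."""
    return len(durations) / sum(durations)


def exponential_survival(t, lam):
    """S(t) = exp(-lambda * t)."""
    return math.exp(-lam * t)


random.seed(0)
true_lambda = 0.2  # hypothetical value used only to simulate data
sample = [random.expovariate(true_lambda) for _ in range(20_000)]

lam_hat = exponential_mle(sample)
print("estimated hazard:", lam_hat)
print("estimated expected duration:", 1.0 / lam_hat)
print("estimated S(5):", exponential_survival(5.0, lam_hat))
```

With three completed spells of 2, 4 and 6 periods the mean duration is 4, so the estimated hazard is 0.25 per period.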
CDF:
\[ F(t) = 1 - e^{-ht} \tag{12} \]
Density:
\[ f(t) = F'(t) = h e^{-ht} \tag{13} \]
Survival function:
\[ S(t) = 1 - F(t) = 1 - \left( 1 - e^{-ht} \right) = e^{-ht} \tag{14} \]
Hazard function:
\[ h(t) = \frac{f(t)}{S(t)} = \frac{h e^{-ht}}{e^{-ht}} = h \tag{15} \]
See examples in STATA and LIMDEP: Spell, duration, aps, BHPS, recid_jw.
[Figure: Exponential Hazard Model]
Weibull
The CDF of $T$ is given by $F(t) = 1 - \exp(-\lambda t^{\alpha})$, where $\lambda$ and $\alpha$ are positive parameters, and the density is given by
\[ f(t) = \alpha \lambda t^{\alpha - 1} \exp(-\lambda t^{\alpha}) \tag{16} \]
The hazard function is
\[ \lambda(t) = \frac{f(t)}{S(t)} = \frac{\alpha \lambda t^{\alpha - 1} \exp(-\lambda t^{\alpha})}{\exp(-\lambda t^{\alpha})} = \alpha \lambda t^{\alpha - 1} \]
When $\alpha = 1$, the Weibull distribution reduces to the exponential distribution with $\lambda(t) = \lambda$.
If $\alpha > 1$, the hazard $\lambda(t) = \alpha \lambda t^{\alpha - 1}$ is monotonically increasing, which shows positive duration dependence.
If $\alpha < 1$, the hazard $\lambda(t) = \alpha \lambda t^{\alpha - 1}$ is continuously decreasing, which gives negative duration dependence.
Thus the Weibull distribution is better suited to capturing the duration variable and transitions between states when the hazard is monotonically increasing or decreasing.
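The role of $\alpha$ can be seen by evaluating the Weibull hazard at a few durations. The sketch below is illustrative only: the parameter values and the helper `duration_dependence` are assumptions, not part of the original slides.

```python
def weibull_hazard(t, lam, alpha):
    """Weibull hazard: alpha * lambda * t**(alpha - 1)."""
    return alpha * lam * t ** (alpha - 1)


def duration_dependence(alpha):
    """Classify duration dependence from the Weibull shape parameter."""
    if alpha > 1:
        return "positive"
    if alpha < 1:
        return "negative"
    return "none"


# Hypothetical parameter values chosen only for illustration.
for alpha in (0.5, 1.0, 2.0):
    hazards = [weibull_hazard(t, 0.5, alpha) for t in (1.0, 2.0, 4.0)]
    print(alpha, duration_dependence(alpha), hazards)
```

With $\alpha = 1$ the hazard equals $\lambda$ at every duration, which is the exponential special case.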
[Figure: Weibull Hazard Model]
Log Normal
Log-normal distributions of durations give non-monotonic hazard functions: the hazard rate first increases with duration and then decreases. This type of model is well suited to bankruptcy rates.
When $\log T$ follows a normal distribution with mean $m$ and variance $\sigma^2$, the density of $T$ is given by
\[ f(t) = \frac{1}{\sigma t} \, \phi\!\left( \frac{\log t - m}{\sigma} \right) \tag{17} \]
and the survivor function is
\[ S(t) = 1 - \Phi\!\left( \frac{\log t - m}{\sigma} \right) \]
with $\Phi$ denoting the CDF and $\phi$ the density of a standard normal. Using $\lambda(t) = f(t)/S(t)$, the hazard function is
\[ \lambda(t) = \frac{f(t)}{S(t)} = \frac{1}{t} \, \frac{\frac{1}{\sigma} \phi\!\left( \frac{\log t - m}{\sigma} \right)}{1 - \Phi\!\left( \frac{\log t - m}{\sigma} \right)} \tag{18} \]
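The rise and fall of the log-normal hazard in (18) can be verified by direct evaluation. This is a minimal Python sketch using only the standard library; the values $m = 0$ and $\sigma = 1$ and the function names are assumptions made for illustration.

```python
import math


def std_normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_survival(z):
    """1 - Phi(z), computed with erfc for accuracy in the upper tail."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def lognormal_hazard(t, m, sigma):
    """Hazard f(t) / S(t) when log T is normal with mean m and std. dev. sigma."""
    z = (math.log(t) - m) / sigma
    density = std_normal_pdf(z) / (sigma * t)
    return density / std_normal_survival(z)


# Hypothetical parameters m = 0, sigma = 1.
for t in (0.1, 0.5, 1.0, 5.0, 20.0):
    print(t, lognormal_hazard(t, 0.0, 1.0))
```

For these parameter values the hazard at $t = 1$ is higher than at both $t = 0.1$ and $t = 20$, which is the non-monotonic shape described above.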
[Figure: Log Normal Hazard Models]
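Log logistic
The log-logistic hazard function is
\[ \lambda(t) = \frac{f(t)}{S(t)} = \frac{\gamma \alpha t^{\alpha - 1}}{1 + \gamma t^{\alpha}} \tag{19} \]
where $\alpha$ and $\gamma$ are positive parameters. Integrating the hazard gives
\[ \int_0^t \lambda(s)\,ds = \int_0^t \frac{\gamma \alpha s^{\alpha - 1}}{1 + \gamma s^{\alpha}}\,ds = \log(1 + \gamma t^{\alpha}) = -\log\left[ (1 + \gamma t^{\alpha})^{-1} \right] \tag{20} \]
Using the condition $F(t) = 1 - \exp\left( -\int_0^t \lambda(s)\,ds \right)$ derived above,
\[ F(t) = 1 - (1 + \gamma t^{\alpha})^{-1} \quad \text{for } t > 0 \tag{21} \]
Differentiating with respect to $t$ gives
\[ f(t) = \alpha \gamma t^{\alpha - 1} (1 + \gamma t^{\alpha})^{-2} \tag{22} \]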
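The chain from hazard (19) to CDF (21) to density (22) can be checked numerically: the hazard should equal $f(t)/S(t)$, and a finite-difference derivative of $F$ should match $f$. The sketch below assumes hypothetical parameter values $\gamma = 0.4$ and $\alpha = 1.5$; the function names are illustrative.

```python
def loglogistic_hazard(t, gamma, alpha):
    """Hazard (19): gamma * alpha * t**(alpha - 1) / (1 + gamma * t**alpha)."""
    return gamma * alpha * t ** (alpha - 1) / (1.0 + gamma * t ** alpha)


def loglogistic_cdf(t, gamma, alpha):
    """CDF (21): 1 - 1 / (1 + gamma * t**alpha)."""
    return 1.0 - 1.0 / (1.0 + gamma * t ** alpha)


def loglogistic_pdf(t, gamma, alpha):
    """Density (22): alpha * gamma * t**(alpha - 1) / (1 + gamma * t**alpha)**2."""
    return alpha * gamma * t ** (alpha - 1) / (1.0 + gamma * t ** alpha) ** 2


gamma, alpha, t = 0.4, 1.5, 2.0  # hypothetical values
survival = 1.0 - loglogistic_cdf(t, gamma, alpha)
step = 1e-6
numeric_pdf = (loglogistic_cdf(t + step, gamma, alpha)
               - loglogistic_cdf(t - step, gamma, alpha)) / (2.0 * step)

print(loglogistic_hazard(t, gamma, alpha), loglogistic_pdf(t, gamma, alpha) / survival)
print(numeric_pdf, loglogistic_pdf(t, gamma, alpha))
```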
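GAMMA and Summary
\[ f(t) = \frac{a^{v} t^{v - 1} \exp(-at)}{\Gamma(v)}, \quad \text{where } \Gamma(v) = \int_0^{\infty} \exp(-t)\, t^{v - 1}\, dt \tag{23} \]
Summary of popular distributions for duration models
Exponential: $S(t) = \exp(-\Lambda(t)) = \exp(-\lambda t)$; $\lambda(t) = \lambda$; $F(t) = 1 - \exp(-\lambda t)$; $f(t) = \lambda \exp(-\lambda t)$
Log-normal: $S(t) = 1 - \Phi\!\left( \frac{\log t - m}{\sigma} \right)$; $\lambda(t) = \frac{1}{t} \, \frac{\frac{1}{\sigma} \phi\left( \frac{\log t - m}{\sigma} \right)}{1 - \Phi\left( \frac{\log t - m}{\sigma} \right)}$
Log-logistic: $F(t) = 1 - (1 + \gamma t^{\alpha})^{-1}$; $f(t) = \alpha \gamma t^{\alpha - 1} (1 + \gamma t^{\alpha})^{-2}$
Weibull: $S(t) = \exp(-\lambda t^{\alpha})$; $\lambda(t) = \alpha \lambda t^{\alpha - 1}$; $F(t) = 1 - \exp(-\lambda t^{\alpha})$; $f(t) = \alpha \lambda t^{\alpha - 1} \exp(-\lambda t^{\alpha})$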
Estimation of Hazard Models
Issues: flow versus stock sampling, and left versus right truncation.
Log-linear models: the parameters of the above models, $\theta = (\lambda, \gamma)$, can be estimated by maximum likelihood, where uncensored observations contribute the density and censored observations contribute the survival function:
\[ \ln L = \sum_{\text{uncensored}} \ln f(t \mid \theta) + \sum_{\text{censored}} \ln S(t \mid \theta) \tag{24} \]
It is easily estimated by the BHHH (Berndt-Hall-Hall-Hausman, 1974) estimator (see Greene, pp. 938-951).
Proportional hazard models: since $f(t) = \lambda(t) S(t)$, the log-likelihood can be written in terms of the hazard,
\[ \ln L = \sum_{\text{uncensored}} \ln \lambda(t \mid \theta) + \sum_{\text{all observations}} \ln S(t \mid \theta) \tag{25} \]
\[ \lambda(t) = e^{\beta(t, \theta)} \lambda_0(t_i) \tag{26} \]
where $\lambda(t)$ is proportional to the baseline hazard function $\lambda_0(t_i)$.
Empirical implementation: STATA10; Greene (2000), Chapter 20, using LIMDEP.
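For the exponential model the censored log-likelihood in (24) has a closed-form maximiser: the number of completed spells divided by the total observed time. The following Python sketch illustrates this under stated assumptions: the four durations and censoring flags are made-up illustrative data, and the function names are not from the slides.

```python
import math


def exponential_loglik(lam, durations, censored):
    """Log-likelihood (24) for the exponential model with right censoring."""
    loglik = 0.0
    for t, is_censored in zip(durations, censored):
        if is_censored:
            loglik += -lam * t                  # ln S(t) for a censored spell
        else:
            loglik += math.log(lam) - lam * t   # ln f(t) for a completed spell
    return loglik


def exponential_mle_censored(durations, censored):
    """Closed-form MLE: completed spells / total time at risk."""
    completed = sum(1 for is_censored in censored if not is_censored)
    return completed / sum(durations)


# Made-up data: two completed spells and two right-censored spells.
durations = [2.0, 5.0, 3.0, 10.0]
censored = [False, False, True, True]

lam_hat = exponential_mle_censored(durations, censored)
print("lambda_hat:", lam_hat)
print("log-likelihood at lambda_hat:", exponential_loglik(lam_hat, durations, censored))
```

Ignoring the censoring flags and using $1/\bar{t}$ would treat censored spells as completed and so overstate the hazard, which is why the censored observations enter (24) through $S(t)$ rather than $f(t)$.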
Estimation of Duration in STATA
See the log files hazard and hazard1 from the Annual Population Survey.

streg tpben31 tpben32 tpben33 tpben34 tpben35 tpben36 self1 self2 self3 self4 sex ethas ethbl gross99 grsexp, dist(weibull)
    failure _d: 1 (meaning all fail)
    analysis time _t: durun
streg tpben31 tpben32 tpben33 tpben34 tpben35 tpben36 self1 self2 self3 self4 sex ethas ethbl gross99 grsexp, dist(exponential)
streg tpben31 tpben32 tpben33 tpben34 tpben35 tpben36 self1 self2 self3 self4 sex ethas ethbl gross99 grsexp, dist(gompertz)
streg tpben31 tpben32 tpben33 tpben34 tpben35 tpben36 self1 self2 self3 self4 sex ethas ethbl gross99 grsexp, dist(lognormal)
Estimation of Duration in STATA

          _t | Haz. Ratio   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
     tpben31 |   .9708788    .0020447   -14.03   0.000    .9668796    .9748946
     tpben32 |   .9733548    .0020842   -12.61   0.000    .9692784    .9774483
     tpben33 |   1.004645    .0029922     1.56   0.120    .9987971    1.010527
     tpben34 |   .9886277    .0086266    -1.31   0.190    .9718636    1.005681
     tpben35 |   .9717948    .0358969    -0.77   0.439    .9039247    1.044761
     tpben36 |  (omitted)
       self1 |   1.020205    .0044089     4.63   0.000      1.0116    1.028883
       self2 |   .9845701    .0131169    -1.17   0.243    .9591941    1.010617
       self3 |   .9888829    .0183726    -0.60   0.547    .9535209    1.025556
       self4 |   1.028848    .0244766     1.20   0.232    .9819762    1.077957
         sex |   1.531057    .0346324    18.83   0.000    1.464662    1.600462
       ethas |   .9877391    .0038554    -3.16   0.002    .9802115    .9953245
       ethbl |   .9862598    .0051397    -2.65   0.008    .9762374    .9963851
     gross99 |  (omitted)
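The columns of the table above are linked: a hazard ratio is the exponential of the underlying coefficient, and, assuming the usual delta-method standard error for the ratio, the reported z statistic can be recovered from the hazard ratio and its standard error. The Python sketch below checks this for two rows of the table; the function names are illustrative, and the interpretation of `sex` depends on how that dummy is coded, which the slides do not state.

```python
import math


def coefficient_from_hazard_ratio(hazard_ratio):
    """Underlying coefficient: beta = ln(hazard ratio)."""
    return math.log(hazard_ratio)


def z_from_hazard_ratio(hazard_ratio, std_err_of_ratio):
    """z = beta / se(beta), with se(beta) = se(ratio) / ratio (delta method)."""
    coef = math.log(hazard_ratio)
    coef_se = std_err_of_ratio / hazard_ratio
    return coef / coef_se


def percent_change_in_hazard(hazard_ratio):
    """Percentage change in the hazard for a one-unit increase in the covariate."""
    return 100.0 * (hazard_ratio - 1.0)


# Rows "sex" and "tpben31" from the table above.
print(z_from_hazard_ratio(1.531057, 0.0346324))   # table reports 18.83
print(z_from_hazard_ratio(0.9708788, 0.0020447))  # table reports -14.03
print(percent_change_in_hazard(1.531057))
```

Read this way, the `sex` row says the hazard of ending the spell is about 53 percent higher when the dummy equals one, other covariates held fixed, while ratios below one such as `tpben31` indicate a lower hazard and hence longer expected durations.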
LIMDEP Commands

/*=========================================
Example 20.17. Log-Linear Survival Models for Strike Duration
*/=========================================
Read ; Nobs = 62 ; Nvar = 2 ; Names = T,Prod $
T   Prod
7   0.01138
14  0.01138
52  0.01138
37  0.02299
? Only the first four of the 62 observations are shown on the slide
? Four survival models for duration
Create ; logt = Log(T) $
Surv ; Lhs = logT ; Rhs = One ; Model = Exponential ; Plot $
Surv ; Lhs = logT ; Rhs = One ; Model = Weibull ; Plot $
Surv ; Lhs = logT ; Rhs = One ; Model = Logistic ; Plot $
Surv ; Lhs = logT ; Rhs = One ; Model = Normal ; Plot $
Chesher A. (1984) Improving the efficiency of Probit estimators, Review of Economic Studies, 66:3:523-527.
Elbers C. and G. Ridder (1982) True and spurious duration dependence: the identifiability of proportional hazard model, Review of Economic Studies, 49:3:403-409.
Greene W. (2008) Econometric Analysis, Prentice Hall, 6th edition.
Greene W.H. (1998) LIMDEP Version 7: User Manual, Econometric Software Inc.
Hausman J.A. (1978) Specification Tests in Econometrics, Econometrica, 46:6:1251-1271.
Heckman J.J. (1979) Sample Selection Bias as a Specification Error, Econometrica, 47:1:153-161.
Imbens G.W. and T. Lancaster (1994) Combining Micro and Macro Data in Microeconometric Models, Review of Economic Studies, 61:4:655-680.
Kiefer N. (1988) Economic duration data and hazard functions, Journal of Economic Literature, 26:647-679.
Lancaster T. (1979) Econometric Methods for Duration of Unemployment, Econometrica, 47:4:939-56.
Lancaster T. (1990) Econometric Analysis of Transition Data, Blackwell.
Lancaster T. and A. Chesher (1983) The Estimation of Models of Labour Market Behaviour, Review of Economic Studies, 50:4:609-624.
Orme C. (1989) On the uniqueness of the maximum likelihood estimator in the truncated regression models, Econometric Reviews, 8:2:217-222.
Staiger D. and J.H. Stock (1997) Instrumental Variables Regression with Weak Instruments, Econometrica, 65:3:557-586.
Verbeek M. (2004) A Guide to Modern Econometrics, Wiley.
Wooldridge J.M. (2002) Econometric Analysis of Cross Section and Panel Data, MIT Press.