Duration Analysis. Econometric Analysis. Dr. Keshab Bhattarai. April 4, 2011. Hull Univ. Business School




Duration Analysis Econometric Analysis Dr. Keshab Bhattarai Hull Univ. Business School April 4, 2011 Dr. Bhattarai (Hull Univ. Business School) Duration April 4, 2011 1 / 27

What is Duration Analysis?
Many economic questions ask how long a state will last given that it has already lasted for some time. The duration of such a spell is a random variable $T$. Duration analysis studies which factors determine the probability that a spell ends by period $t$ ($T \le t$), the probability that it survives beyond period $t$ ($T > t$), and the probability of transition, the hazard rate, between $t$ and $t + \Delta$. Duration models have been used to study the termination of strikes, unemployment, marriages, disaster spells, heart attacks and other spells of illness, the likelihood of bankruptcy of a firm, technological breakthroughs, and the probability of retaining championship titles in sports.

Example of Duration Analysis
The main question is: if a state has existed so far, how long will it last, or what is the rate of survival into the next period? For instance, the manager of a company would want to know how long a certain machine will last given that it has been running so far. A life insurance company would be interested in the probability of death of an individual with a certain medical record or physical characteristics between $t$ and $t + \Delta$ years, given that the person has survived up to $t$ years. A union leader or a management negotiator would be interested in the probability that a strike is called off given that it has continued for $t$ periods.

Duration Density
The starting point of duration analysis is the cumulative distribution function of the duration $T$, which gives the probability that a spell starting in the initial state at time 0 has ended by period $t$:
\[ F(t) = \Pr(T \le t) = \int_0^t f(s)\,ds \tag{1} \]
More interesting is the survival function:
\[ S(t) = 1 - F(t) = \Pr(T > t) \tag{2} \]
The probability of transition from one state to another (from unemployment to employment, from life to death, from working condition to breakdown) is given by the hazard rate, or probability of termination.
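The distribution and survival functions above can be estimated directly from a sample of spells. The following is a minimal sketch in Python (used here only for illustration; the slides themselves work in STATA and LIMDEP). It assumes that every spell is completed, so there is no censoring, and the `spells` values are hypothetical.

```python
def empirical_cdf(durations, t):
    """Estimate F(t) as the share of spells that had ended by time t."""
    return sum(1 for d in durations if d <= t) / len(durations)


def empirical_survival(durations, t):
    """Estimate S(t) = 1 - F(t), the share of spells still running after t."""
    return 1.0 - empirical_cdf(durations, t)


# Hypothetical completed unemployment spells, in weeks.
spells = [2, 5, 5, 8, 12, 20, 31, 40]

for t in (5, 8, 20):
    print(t, empirical_cdf(spells, t), empirical_survival(spells, t))
```

With censored spells this simple share is no longer appropriate, which is why the likelihood-based methods later in the slides treat censored and uncensored observations separately.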

Survival and Duration
[Figure: Estimated Survival Function. Vertical axis: Survival; horizontal axis: Duration.]

Hazard Rate
The hazard rate is
\[ \lambda(t) = \lim_{\Delta \to 0} \frac{F(t+\Delta) - F(t)}{\Delta \, S(t)} = \frac{f(t)}{S(t)} \tag{3} \]
\[ f(t) = S(t)\,\lambda(t) \tag{4} \]
The hazard function is linked to the survival function as
\[ \lambda(t) = -\frac{d \log[1 - F(t)]}{dt} = \frac{F'(t)}{1 - F(t)} = \frac{f(t)}{S(t)} \tag{5} \]
The duration distribution can therefore be derived by integrating the hazard function.
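The limit that defines the hazard rate can be checked numerically. This is a small Python sketch under stated assumptions: `hazard_from_cdf` is a hypothetical helper that replaces the limit by a small finite step, and the exponential CDF with rate 0.3 is chosen only because its hazard is known to be constant.

```python
import math


def hazard_from_cdf(cdf, t, delta=1e-6):
    """Approximate [F(t + delta) - F(t)] / (delta * S(t)) for a small delta."""
    survival = 1.0 - cdf(t)
    return (cdf(t + delta) - cdf(t)) / (delta * survival)


rate = 0.3  # illustrative value


def exponential_cdf(t):
    """F(t) = 1 - exp(-rate * t)."""
    return 1.0 - math.exp(-rate * t)


# The approximation should be close to the constant hazard 0.3 at any t.
for t in (0.5, 2.0, 10.0):
    print(t, hazard_from_cdf(exponential_cdf, t))
```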

Duration and Survival Functions
Integrating the hazard function from 0 to $t$ gives
\[ \int_0^t \lambda(s)\,ds = -\log[1 - F(t)] + \log[1 - F(0)] = -\log[1 - F(t)] \tag{6} \]
because $F(0) = 0$. Solving for $F(t)$:
\[ F(t) = 1 - \exp\left( -\int_0^t \lambda(s)\,ds \right) \tag{7} \]
Modelling the hazard function is therefore the main element of duration models. The proportional hazard model is
\[ \lambda(t, x_i) = \lambda_0(t) \exp(x_i'\beta) \tag{8} \]
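The step from the hazard function back to the distribution function can be sketched numerically. The Python example below integrates a hazard with the trapezoid rule and applies $F(t) = 1 - \exp(-\int_0^t \lambda(s)\,ds)$. The helper name `cdf_from_hazard` and the linearly increasing hazard $0.2\,s$ are assumptions made for illustration; for that hazard the integral is $0.1\,t^2$, so the result can be compared with a closed form.

```python
import math


def cdf_from_hazard(hazard, t, steps=10_000):
    """Return F(t) = 1 - exp(-integral of the hazard from 0 to t)."""
    width = t / steps
    # Trapezoid rule: half weight on the two end points, full weight inside.
    total = 0.5 * (hazard(0.0) + hazard(t))
    for i in range(1, steps):
        total += hazard(i * width)
    return 1.0 - math.exp(-total * width)


def increasing_hazard(s):
    """Illustrative hazard that rises linearly with duration."""
    return 0.2 * s


numeric = cdf_from_hazard(increasing_hazard, 3.0)
closed_form = 1.0 - math.exp(-0.1 * 3.0 ** 2)
print(numeric, closed_form)
```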

Main Points in the Duration Analysis
An important element is modelling duration dependence, that is, how the hazard rate depends on the time already spent in a state.
Positive duration dependence: the longer the time spent in a given state, the higher the probability of leaving it soon. For instance, the longer a light bulb has worked, the higher the probability that it fails in the next period.
Negative duration dependence: the longer the time spent in a given state, the lower the probability of leaving it soon. For instance, the longer a job search lasts, the lower the chance that an unemployed person finds a job.
Absence of duration dependence: the duration has no effect on the hazard rate. This case is less appealing than positive or negative duration dependence.

Main Points in Duration Analysis
Duration dependence: $d\lambda(t)/dt > 0$ indicates positive duration dependence and $d\lambda(t)/dt < 0$ indicates negative duration dependence, whereas $d\lambda(t)/dt = 0$ indicates no duration dependence.
There are a number of ways of modelling the hazard function:
1. exponential
2. Weibull
3. log-normal
4. logistic
5. gamma
These models are the more popular ones in the literature (see Wooldridge (2002), Chapter 20; Greene (2008), Chapter 25). See also the Dixon-Bihan (2011) paper at http://editorialexpress.com/conference/res2011/program/res2011.html

[Figure: Hazard Functions]

Exponential hazard model
Here $T$ has an exponential distribution, $F(t) = 1 - \exp(-\lambda t)$, with density $f(t) = \lambda \exp(-\lambda t)$ for $t > 0$. This distribution has no memory: the hazard rate does not depend on duration, it is constant, $\lambda(t) = \lambda$.
\[ \lambda(t) = -\frac{d \log[1 - F(t)]}{dt} = -\frac{d \log S(t)}{dt} \tag{9} \]
\[ \ln S(t) = k - \int_0^t \lambda(s)\,ds = k - \lambda t \tag{10} \]
\[ S(t) = K \exp(-\lambda t) \tag{11} \]
where $K = e^{k}$, and $S(0) = 1$ implies $K = 1$.
Estimation of $\lambda$ is simple: the expected duration is $E(t) = 1/\lambda$ and the maximum likelihood estimate of $\lambda$ is $1/\bar{t}$.
The integrated hazard function is written as $\Lambda(t) = \int_0^t \lambda(s)\,ds$, so that $S(t) = \exp(-\Lambda(t))$ or $\Lambda(t) = -\ln S(t)$.
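The estimator $1/\bar{t}$ and the memoryless property can be illustrated in a few lines. This is a hedged Python sketch: the four spell lengths are hypothetical, all spells are assumed to be completed (no censoring), and the function names are introduced here only for illustration.

```python
import math


def exponential_mle(durations):
    """Maximum likelihood estimate of the rate: 1 / (mean duration)."""
    return len(durations) / sum(durations)


def exponential_survival(t, rate):
    """S(t) = exp(-rate * t)."""
    return math.exp(-rate * t)


spells = [2.0, 4.0, 6.0, 8.0]        # hypothetical completed spells
rate_hat = exponential_mle(spells)    # mean duration is 5, so the rate is 0.2
expected_duration = 1.0 / rate_hat    # E(t) = 1 / rate

# Memorylessness: the chance of lasting another t periods does not depend
# on the s periods already survived, S(s + t) / S(s) = S(t).
s, t = 3.0, 2.0
conditional = exponential_survival(s + t, rate_hat) / exponential_survival(s, rate_hat)
print(rate_hat, expected_duration, conditional, exponential_survival(t, rate_hat))
```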

CDF:
\[ F(t) = 1 - e^{-ht} \tag{12} \]
Density:
\[ f(t) = F'(t) = h e^{-ht} \tag{13} \]
Survival function:
\[ S(t) = 1 - F(t) = 1 - \left(1 - e^{-ht}\right) = e^{-ht} \tag{14} \]
Hazard function:
\[ h(t) = \frac{f(t)}{S(t)} = \frac{h e^{-ht}}{e^{-ht}} = h \tag{15} \]
See examples in STATA and LIMDEP: Spell, duration, aps, BHPS, recid_jw.

[Figure: Exponential Hazard Model]

Weibull
The CDF of $T$ is given by $F(t) = 1 - \exp(-\lambda t^{\alpha})$, where $\lambda$ and $\alpha$ are positive parameters, and the density is given by
\[ f(t) = \alpha \lambda t^{\alpha - 1} \exp(-\lambda t^{\alpha}) \tag{16} \]
The hazard function is
\[ \lambda(t) = \frac{f(t)}{S(t)} = \frac{\alpha \lambda t^{\alpha - 1} \exp(-\lambda t^{\alpha})}{\exp(-\lambda t^{\alpha})} = \alpha \lambda t^{\alpha - 1} \]
When $\alpha = 1$, the Weibull distribution reduces to the exponential distribution with $\lambda(t) = \lambda$. If $\alpha > 1$, the hazard $\lambda(t) = \alpha \lambda t^{\alpha - 1}$ is monotonically increasing, which shows positive duration dependence. If $\alpha < 1$, the hazard is monotonically decreasing, which gives negative duration dependence. Thus the Weibull distribution is better suited to capturing the duration variable and the transition between states when the hazard is monotonically increasing or decreasing.
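The three Weibull cases can be seen by evaluating the hazard $\alpha \lambda t^{\alpha - 1}$ at a few durations. A minimal Python sketch follows; the parameter values are illustrative, and `duration_dependence` is a hypothetical helper that only restates the rule given above.

```python
def weibull_hazard(t, lam, alpha):
    """Weibull hazard: alpha * lam * t ** (alpha - 1)."""
    return alpha * lam * t ** (alpha - 1.0)


def duration_dependence(alpha):
    """Classify duration dependence from the Weibull shape parameter."""
    if alpha > 1.0:
        return "positive"
    if alpha < 1.0:
        return "negative"
    return "none"


for alpha in (0.5, 1.0, 2.0):
    hazards = [weibull_hazard(t, 0.5, alpha) for t in (1.0, 2.0, 4.0)]
    print(alpha, duration_dependence(alpha), hazards)
```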

[Figure: Weibull Hazard Model]

Log Normal
Log-normal distributions of durations give non-monotonic hazard functions: first the hazard rate increases with duration and then it decreases. This type of analysis is good for modelling bankruptcy rates. When $\log T$ follows a normal distribution with mean $m$ and variance $\sigma^2$, the density of $T$ is given by
\[ f(t) = \frac{1}{\sigma t}\, \phi\!\left( \frac{\log t - m}{\sigma} \right) \tag{17} \]
and the survivor function is
\[ S(t) = 1 - \Phi\!\left( \frac{\log t - m}{\sigma} \right) \]
with $\phi$ and $\Phi$ denoting the density and the CDF of a standard normal. The hazard function, using $\lambda(t) = f(t)/S(t)$, is
\[ \lambda(t) = \frac{f(t)}{S(t)} = \frac{\dfrac{1}{t}\,\dfrac{1}{\sigma}\, \phi\!\left( \dfrac{\log t - m}{\sigma} \right)}{1 - \Phi\!\left( \dfrac{\log t - m}{\sigma} \right)} \tag{18} \]
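The rise and fall of the log-normal hazard can be seen by evaluating the ratio $f(t)/S(t)$ at a few durations. The Python sketch below uses only the standard library; the standard normal tail probability $1 - \Phi(z)$ is computed with `math.erfc`. The parameter values $m = 0$, $\sigma = 1$ and the evaluation points are assumptions chosen for illustration.

```python
import math


def lognormal_hazard(t, m, sigma):
    """Hazard f(t) / S(t) when log T is normal with mean m and std. dev. sigma."""
    z = (math.log(t) - m) / sigma
    # Density: (1 / (sigma * t)) * phi(z), with phi the standard normal density.
    density = math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma * t)
    # Survivor function: 1 - Phi(z) = 0.5 * erfc(z / sqrt(2)).
    survival = 0.5 * math.erfc(z / math.sqrt(2.0))
    return density / survival


for t in (0.1, 0.5, 1.0, 5.0, 20.0):
    print(t, lognormal_hazard(t, 0.0, 1.0))
```

For these parameter values the hazard is higher at $t = 0.5$ than at either $t = 0.1$ or $t = 5$, which is the non-monotonic shape described above.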

[Figure: Log Normal Hazard Models]

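Log logistic
The log-logistic hazard function is
\[ \lambda(t) = \frac{f(t)}{S(t)} = \frac{\gamma \alpha t^{\alpha - 1}}{1 + \gamma t^{\alpha}} \tag{19} \]
where $\alpha$ and $\gamma$ are positive parameters. Integrating the hazard gives
\[ \int_0^t \lambda(s)\,ds = \int_0^t \frac{\gamma \alpha s^{\alpha - 1}}{1 + \gamma s^{\alpha}}\,ds = \log(1 + \gamma t^{\alpha}) = -\log\left[ (1 + \gamma t^{\alpha})^{-1} \right] \tag{20} \]
Using the condition $F(t) = 1 - \exp\left( -\int_0^t \lambda(s)\,ds \right)$ derived above:
\[ F(t) = 1 - (1 + \gamma t^{\alpha})^{-1} \quad \text{for } t > 0 \tag{21} \]
Differentiating with respect to $t$ gives:
\[ f(t) = \alpha \gamma t^{\alpha - 1} (1 + \gamma t^{\alpha})^{-2} \tag{22} \]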
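The log-logistic formulas can be checked against each other: the density should equal the hazard times the survivor function, and the survivor function should equal the exponential of minus the integrated hazard. A small Python sketch, with illustrative parameter values $\gamma = 0.5$, $\alpha = 2$ and function names introduced only for this example:

```python
import math


def loglogistic_hazard(t, gamma, alpha):
    """Hazard: gamma * alpha * t**(alpha - 1) / (1 + gamma * t**alpha)."""
    return gamma * alpha * t ** (alpha - 1.0) / (1.0 + gamma * t ** alpha)


def loglogistic_survival(t, gamma, alpha):
    """Survivor function: 1 - F(t) = 1 / (1 + gamma * t**alpha)."""
    return 1.0 / (1.0 + gamma * t ** alpha)


def loglogistic_pdf(t, gamma, alpha):
    """Density: alpha * gamma * t**(alpha - 1) / (1 + gamma * t**alpha)**2."""
    return alpha * gamma * t ** (alpha - 1.0) / (1.0 + gamma * t ** alpha) ** 2


def loglogistic_integrated_hazard(t, gamma, alpha):
    """Integrated hazard: log(1 + gamma * t**alpha)."""
    return math.log1p(gamma * t ** alpha)


g, a, t = 0.5, 2.0, 2.0
print(loglogistic_pdf(t, g, a),
      loglogistic_hazard(t, g, a) * loglogistic_survival(t, g, a))
print(loglogistic_survival(t, g, a),
      math.exp(-loglogistic_integrated_hazard(t, g, a)))
```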

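GAMMA and Summary
The gamma density is
\[ f(t) = \frac{a^{v} t^{v - 1} \exp(-at)}{\Gamma(v)}, \qquad \text{where } \Gamma(v) = \int_0^{\infty} \exp(-t)\, t^{v - 1}\,dt \tag{23} \]
Summary of popular distributions for duration models:
Exponential: $S(t) = \exp(-\Lambda(t)) = \exp(-\lambda t)$; $\lambda(t) = \lambda$; $F(t) = 1 - \exp(-\lambda t)$; $f(t) = \lambda \exp(-\lambda t)$.
Log normal: $S(t) = 1 - \Phi\!\left( \frac{\log t - m}{\sigma} \right)$; $\lambda(t) = \dfrac{\frac{1}{t} \frac{1}{\sigma} \phi\left( \frac{\log t - m}{\sigma} \right)}{1 - \Phi\left( \frac{\log t - m}{\sigma} \right)}$.
Log logistic: $F(t) = 1 - (1 + \gamma t^{\alpha})^{-1}$; $f(t) = \alpha \gamma t^{\alpha - 1} (1 + \gamma t^{\alpha})^{-2}$.
Weibull: $S(t) = \exp(-\lambda t^{\alpha})$; $\lambda(t) = \alpha \lambda t^{\alpha - 1}$; $F(t) = 1 - \exp(-\lambda t^{\alpha})$; $f(t) = \alpha \lambda t^{\alpha - 1} \exp(-\lambda t^{\alpha})$.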
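The gamma density can be evaluated with the standard library's `math.gamma`. The Python sketch below is illustrative only; it also checks the special case $v = 1$, where the gamma density collapses to the exponential density $a \exp(-at)$. The parameter values are assumptions.

```python
import math


def gamma_pdf(t, a, v):
    """Gamma density: a**v * t**(v - 1) * exp(-a * t) / Gamma(v)."""
    return a ** v * t ** (v - 1.0) * math.exp(-a * t) / math.gamma(v)


def exponential_pdf(t, rate):
    """Exponential density: rate * exp(-rate * t)."""
    return rate * math.exp(-rate * t)


# With v = 1 the two densities coincide.
print(gamma_pdf(2.0, 0.5, 1.0), exponential_pdf(2.0, 0.5))
# With v = 2 the density is a**2 * t * exp(-a * t).
print(gamma_pdf(2.0, 0.5, 2.0))
```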

Estimation of Hazard Models
Issue of flow versus stock sampling and left versus right truncation.
Log-linear models: the parameters of the above models, $\theta = (\lambda, \gamma)$, can be estimated by maximum likelihood, with separate contributions from uncensored and censored observations:
\[ \ln L = \sum_{\text{uncensored}} \ln f(t \mid \theta) + \sum_{\text{censored}} \ln S(t \mid \theta) \tag{24} \]
It is easily estimated by the BHHH (Berndt-Hall-Hall-Hausman, 1974) estimator (see Greene, pp. 938-951).
Proportional hazard models: since $f(t) = \lambda(t) S(t)$, the log-likelihood can be written in terms of the hazard,
\[ \ln L = \sum_{\text{uncensored}} \ln \lambda(t \mid \theta) + \sum_{\text{all}} \ln S(t \mid \theta) \tag{25} \]
\[ \lambda(t) = e^{\beta(t, \theta)} \lambda_0(t_i) \tag{26} \]
where $\lambda(t)$ is proportional to the baseline hazard function $\lambda_0(t_i)$.
Empirical implementation (STATA10; Greene (2000), Chapter 20; using LIMDEP).
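The censored log-likelihood can be made concrete for the simplest case. The Python sketch below assumes an exponential model with right censoring only; the durations and the `completed` flags are hypothetical. It evaluates the likelihood in both forms, density for uncensored spells plus survivor function for censored spells, and hazard for uncensored spells plus survivor function for all spells. For this model the maximum is reached at the number of completed spells divided by total time at risk.

```python
import math


def exponential_loglik(rate, durations, completed):
    """Sum of ln f(t) over completed spells and ln S(t) over censored spells."""
    total = 0.0
    for t, done in zip(durations, completed):
        log_survival = -rate * t
        if done:
            total += math.log(rate) + log_survival  # ln f = ln(rate) - rate * t
        else:
            total += log_survival                   # censored: only ln S
    return total


def exponential_loglik_hazard_form(rate, durations, completed):
    """Sum of ln(hazard) over completed spells plus ln S(t) over all spells."""
    return sum(completed) * math.log(rate) - rate * sum(durations)


def exponential_mle_censored(durations, completed):
    """Closed-form maximiser: completed spells divided by total time at risk."""
    return sum(completed) / sum(durations)


durations = [2.0, 4.0, 6.0, 8.0]        # hypothetical spell lengths
completed = [True, True, False, True]   # False marks a right-censored spell
rate_hat = exponential_mle_censored(durations, completed)  # 3 / 20 = 0.15
print(rate_hat, exponential_loglik(rate_hat, durations, completed))
```

A general-purpose optimiser such as BHHH is needed when no closed form exists; the exponential case is used here only because its maximiser can be verified by hand.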

Estimation of Duration in STATA
See the log files hazard and hazard1 from the Annual Population Survey.
* Declare the survival-time data first; the echoed settings on the slide are
* "failure _d: 1 (meaning all fail)" and "analysis time _t: durun"
stset durun
* Weibull hazard model
streg tpben31 tpben32 tpben33 tpben34 tpben35 tpben36 self1 self2 self3 self4 sex ethas ethbl gross99 grsexp, dist(weibull)
* Exponential hazard model
streg tpben31 tpben32 tpben33 tpben34 tpben35 tpben36 self1 self2 self3 self4 sex ethas ethbl gross99 grsexp, dist(exponential)
* Gompertz hazard model
streg tpben31 tpben32 tpben33 tpben34 tpben35 tpben36 self1 self2 self3 self4 sex ethas ethbl gross99 grsexp, dist(gompertz)
* Log-normal model
streg tpben31 tpben32 tpben33 tpben34 tpben35 tpben36 self1 self2 self3 self4 sex ethas ethbl gross99 grsexp, dist(lognormal)


Estimation of Duration in STATA
          _t | Haz. Ratio   Std. Err.        z    P>|z|     [95% Conf. Interval]
-------------+------------------------------------------------------------------
     tpben31 |   .9708788    .0020447   -14.03    0.000     .9668796    .9748946
     tpben32 |   .9733548    .0020842   -12.61    0.000     .9692784    .9774483
     tpben33 |   1.004645    .0029922     1.56    0.120     .9987971    1.010527
     tpben34 |   .9886277    .0086266    -1.31    0.190     .9718636    1.005681
     tpben35 |   .9717948    .0358969    -0.77    0.439     .9039247    1.044761
     tpben36 |  (omitted)
       self1 |   1.020205    .0044089     4.63    0.000       1.0116    1.028883
       self2 |   .9845701    .0131169    -1.17    0.243     .9591941    1.010617
       self3 |   .9888829    .0183726    -0.60    0.547     .9535209    1.025556
       self4 |   1.028848    .0244766     1.20    0.232     .9819762    1.077957
         sex |   1.531057    .0346324    18.83    0.000     1.464662    1.600462
       ethas |   .9877391    .0038554    -3.16    0.002     .9802115    .9953245
       ethbl |   .9862598    .0051397    -2.65    0.008     .9762374    .9963851
     gross99 |  (omitted)
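The hazard ratios in the output above can be converted into coefficients and percentage effects. The Python sketch below assumes that each reported ratio is $\exp(\beta)$ from a proportional hazard specification and that its standard error follows the delta method, so that the standard error of $\beta$ is the reported standard error divided by the ratio. Both are assumptions, although they do reproduce the reported z values for the two rows used here (tpben31 and sex). The helper name is hypothetical.

```python
import math


def summarize_hazard_ratio(hazard_ratio, std_err):
    """Recover the coefficient, its z statistic and the percentage effect."""
    coef = math.log(hazard_ratio)        # beta = ln(hazard ratio)
    coef_se = std_err / hazard_ratio     # delta method (assumed)
    return {
        "coef": coef,
        "z": coef / coef_se,
        "pct_change": 100.0 * (hazard_ratio - 1.0),
    }


tpben31 = summarize_hazard_ratio(0.9708788, 0.0020447)
sex = summarize_hazard_ratio(1.531057, 0.0346324)
print(tpben31)
print(sex)
```

Read this way, a one-unit increase in tpben31 is associated with a hazard about 2.9 per cent lower, and the sex indicator with a hazard about 53 per cent higher, holding the other covariates fixed.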

LIMDEP Commands
/*=========================================
Example 20.17. Log-Linear Survival Models for Strike Duration
*/=========================================
Read ; Nobs = 62 ; Nvar = 2 ; Names = T,Prod $
T Prod
7 0.01138
14 0.01138
52 0.01138
37 0.02299
? Four survival models for duration?
Create ; logt = Log(T) $
Surv ; Lhs = logT ; Rhs = One ; Model = Exponential ; Plot $
Surv ; Lhs = logT ; Rhs = One ; Model = Weibull ; Plot $
Surv ; Lhs = logT ; Rhs = One ; Model = Logistic ; Plot $
Surv ; Lhs = logT ; Rhs = One ; Model = Normal ; Plot $

References
Chesher A. (1984) Improving the efficiency of Probit estimators, Review of Economics and Statistics, 66:3:523-527.
Elbers C. and G. Ridder (1982) True and spurious duration dependence: the identifiability of the proportional hazard model, Review of Economic Studies, 49:3:403-409.
Greene W. (2008) Econometric Analysis, 6th edition, Prentice Hall.
Greene W. H. (1998) LIMDEP Version 7: User Manual, Econometric Software Inc.
Hausman J. A. (1978) Specification Tests in Econometrics, Econometrica, 46:6:1251-1271.
Heckman J. J. (1979) Sample Selection Bias as a Specification Error, Econometrica, 47:1:153-161.
Imbens G. W. and T. Lancaster (1994) Combining Micro and Macro Data in Microeconometric Models, Review of Economic Studies, 61:4:655-680.
Kiefer N. (1988) Economic duration data and hazard functions, Journal of Economic Literature, 26:647-679.
Lancaster T. (1979) Econometric Methods for the Duration of Unemployment, Econometrica, 47:4:939-956.
Lancaster T. (1990) The Econometric Analysis of Transition Data, Blackwell.
Lancaster T. and A. Chesher (1983) The Estimation of Models of Labour Market Behaviour, Review of Economic Studies, 50:4:609-624.
Orme C. (1989) On the uniqueness of the maximum likelihood estimator in truncated regression models, Econometric Reviews, 8:2:217-222.
Staiger D. and J. H. Stock (1997) Instrumental Variables Regression with Weak Instruments, Econometrica, 65:3:557-586.
Verbeek M. (2004) A Guide to Modern Econometrics, Wiley.
Wooldridge J. M. (2002) Econometric Analysis of Cross Section and Panel Data, MIT Press.