Limited Dependent Variable Models

1 Limited Dependent Variable Models
Econometrics 2 (Summer 2008)
Contents:
- Censored and truncated samples
- Sample selection bias and Mills ratio
- Truncated regression
- The Tobit I Model
- Interpretation, Tests, etc.
- The Selectivity Tobit II Model
- The Selectivity Tobit III Model
- The Double Hurdle Model

2 Censored and Truncated Samples
One may not always observe data on continuous dependent and explanatory variables over the entire population.
Examples:
- Wages are only observed for people with a job.
- The same holds for working hours.
- Individuals whose incomes fall short of some poverty line are recorded with an income equal to that poverty line; there may also exist an upper bound (highest taxable income).
- Only people with children may need child care, etc.
We define samples as either truncated or censored depending on the nature of the limitations of the data.

3 Censored and Truncated Samples
Truncated samples: A sample is truncated if data is only available on a subset of the whole population.
- In many samples, only people having a job are recorded.
- For truncated samples, data is simply not available to the researcher.
Censored samples: A sample is censored if data is re-coded for a subset of the population.
- Censored data can be characterized as a sample defect.
- Recall the example of minimum wages.
Further examples:
- A model of the demand for tickets to a concert.
- A model of desired labor supply, or the latent propensity to work.
- A model for labor supply (in hours) from a labor force survey.

4 Observability Criteria
Observability criteria from the dependent variable's perspective.
Consider an observed dependent variable $y$, a set of explanatory variables $x$, and a latent variable $y^*$.
Truncated samples:
T1: $y = y^*$ if $y^* > c$, not observed otherwise,
T2: $y = y^*$ if $y^* < d$, not observed otherwise,
T3: $y = y^*$ if $c < y^* < d$, not observed otherwise.
Censored samples:
C1: $y = y^*\,\mathbf{1}(y^* > c) + c\,\mathbf{1}(y^* \le c)$,
C2: $y = y^*\,\mathbf{1}(y^* < d) + d\,\mathbf{1}(y^* \ge d)$,
C3: $y = y^*\,\mathbf{1}(c < y^* < d) + c\,\mathbf{1}(y^* \le c) + d\,\mathbf{1}(y^* \ge d)$.
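
The criteria T1 to T3 and C1 to C3 can be illustrated with a small sketch. The function names `truncate` and `censor` are illustrative choices, not part of the slides; infinite default bounds give the one-sided cases.

```python
from math import inf
from typing import List

def truncate(y_star: List[float], c: float = -inf, d: float = inf) -> List[float]:
    """T1-T3: keep a latent value only if c < y* < d; drop it otherwise."""
    return [v for v in y_star if c < v < d]

def censor(y_star: List[float], c: float = -inf, d: float = inf) -> List[float]:
    """C1-C3: recode values at or below c to c, and at or above d to d."""
    return [min(max(v, c), d) for v in y_star]

latent = [-1.0, 0.5, 2.0, 3.5]
print(truncate(latent, c=0.0))        # T1: observations at or below 0 vanish
print(censor(latent, c=0.0))          # C1: observations at or below 0 become 0
print(censor(latent, c=0.0, d=3.0))   # C3: two-sided censoring
```

Note that truncation changes the sample size while censoring keeps every observation and only recodes the dependent variable.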

5 Sample Selection Bias
Consider the latent relationship $y^* = x'\beta + u$ with $E(u \mid x) = 0$.
Suppose that the observed dependent variable $y$ is truncated from below at zero (T1 above with $c = 0$). Then the conditional expectation is
$E(y \mid x, y > 0) = x'\beta + E(u \mid x, y > 0)$
$\qquad = x'\beta + E(u \mid u > -x'\beta)$
$\qquad = x'\beta + \alpha\,\psi(x'\beta)$. (1)
Suppose we are interested in estimating $y_i = x_i'\beta + u_i$ for $y_i$ truncated from below at zero.

6 Sample Selection Bias
Problem: the parameters $\hat\beta_c$ of the conditional model may not converge to the true parameters of the underlying latent relationship $y^* = x'\beta + u$.
In a sense, the model above suffers from omitted variable bias. The correct model should have been
$y_i = x_i'\beta + \alpha\,\psi(x_i'\beta) + u_i$.
To estimate $\beta$ from the correct model we need to know what form $\psi(x_i'\beta)$ should take. If we know it, least squares estimation is possible.

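7 Preliminaries: Conditional Normal Density
Let $u_i \sim N(0, \sigma^2)$. Then we may write the density $f(u)$ as
$f(u) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{u^2}{2\sigma^2}\right) = \frac{1}{\sigma}\,\frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{u}{\sigma}\right)^2\right) = \frac{1}{\sigma}\,\phi\left(\frac{u}{\sigma}\right)$.
Also,
$P(u > c) = P\left(\frac{u}{\sigma} > \frac{c}{\sigma}\right) = 1 - \Phi\left(\frac{c}{\sigma}\right) = \Phi\left(-\frac{c}{\sigma}\right)$.
Hence,
$f(u \mid u > c) = \frac{\frac{1}{\sigma}\phi(u/\sigma)}{1 - \Phi(c/\sigma)} = \frac{\frac{1}{\sigma}\phi(u/\sigma)}{\Phi(-c/\sigma)}$.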

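8 Preliminaries: Conditional Expectation for a Normal Random Variable
Let $u_i \sim N(0, \sigma^2)$ (homoscedastic!). Then the conditional expectation $E(u \mid u > c)$ may be derived as
$E(u \mid u > c) = \int_c^\infty u\, f(u \mid u > c)\, du$
$\qquad = \int_c^\infty u\, \frac{\frac{1}{\sigma}\phi(u/\sigma)}{1 - \Phi(c/\sigma)}\, du$
$\qquad = \frac{1}{1 - \Phi(c/\sigma)} \int_c^\infty \frac{u}{\sigma}\,\phi\left(\frac{u}{\sigma}\right) du$
$\qquad = \frac{\sigma}{1 - \Phi(c/\sigma)} \int_{c/\sigma}^\infty z\,\phi(z)\, dz$
$\qquad = \frac{\sigma}{1 - \Phi(c/\sigma)} \Big[-\phi(z)\Big]_{c/\sigma}^{\infty}$
$\qquad = \sigma\,\frac{\phi(c/\sigma)}{1 - \Phi(c/\sigma)}$.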
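
As a numerical cross-check of this derivation, the closed form can be compared with a direct midpoint-rule integration of $u\,f(u \mid u > c)$. This is only a sketch: the function names, the integration range of 12 standard deviations, and the step count are arbitrary choices made here.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal: pdf is phi, cdf is Phi

def truncated_mean(c: float, sigma: float) -> float:
    """Closed form: E(u | u > c) = sigma * phi(c/sigma) / (1 - Phi(c/sigma))."""
    a = c / sigma
    return sigma * STD.pdf(a) / (1.0 - STD.cdf(a))

def truncated_mean_numeric(c: float, sigma: float, steps: int = 20_000) -> float:
    """Midpoint-rule integral of u * f(u) over (c, c + 12 sigma), divided by P(u > c)."""
    dist = NormalDist(0.0, sigma)
    h = 12.0 * sigma / steps
    total = sum((c + (k + 0.5) * h) * dist.pdf(c + (k + 0.5) * h) for k in range(steps))
    return total * h / (1.0 - dist.cdf(c))

print(truncated_mean(0.5, 2.0), truncated_mean_numeric(0.5, 2.0))
```

The two numbers should agree to several decimal places, which supports the substitution $z = u/\sigma$ used in the derivation.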

9 Preliminaries: Inverse Mills Ratio
To determine $\alpha\psi$ we need to assume a distribution for $u$. Take the normal distribution ($\phi$, $\Phi$) with variance $\sigma^2$. Then, by straightforward integration,
$E[u \mid y > 0] = E[u \mid u > -x'\beta] = \int_{-x'\beta}^{\infty} u \left\{ \frac{\frac{1}{\sigma}\phi(u/\sigma)}{1 - \Phi(-x'\beta/\sigma)} \right\} du = \sigma\,\frac{\phi(z)}{\Phi(z)}$,
where $z = x'\beta/\sigma$.
Therefore $\alpha = \sigma$ and $\psi(x'\beta) = \phi(z)/\Phi(z) = \lambda(z)$, called the inverse Mills ratio. In the literature one sometimes finds the notation $\lambda(x'\beta)$.
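
A minimal sketch of the inverse Mills ratio and of the resulting regression function $E(y \mid x, y > 0) = x'\beta + \sigma\lambda(x'\beta/\sigma)$ under normality. The helper names are illustrative, and `xb` stands for a given value of $x'\beta$.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal: pdf is phi, cdf is Phi

def inverse_mills(z: float) -> float:
    """lambda(z) = phi(z) / Phi(z)."""
    return STD.pdf(z) / STD.cdf(z)

def truncated_regression_mean(xb: float, sigma: float) -> float:
    """E(y | x, y > 0) = x'beta + sigma * lambda(x'beta / sigma)."""
    return xb + sigma * inverse_mills(xb / sigma)

for xb in (-1.0, 0.0, 1.0, 3.0):
    print(xb, round(truncated_regression_mean(xb, sigma=1.0), 4))
```

The correction term shrinks as $x'\beta/\sigma$ grows, so the selection bias matters most for observations that are close to the truncation point.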

10 Truncated Regression Model
Recall: conditional distribution / density.
The conditional density of a random variable $u$ (recall that, given $x$, we only need to consider the distribution of the error term) with unconditional density $f(\cdot)$ and cumulative distribution $F(\cdot)$, for truncation from below at $c$, can be written as
$f(u \mid u > c) = \frac{f(u)}{P(u > c)} = \frac{f(u)}{1 - P(u \le c)} = \frac{f(u)}{1 - F(c)}$. (2)

11 Truncated Regression Model
Recall: this is still a density. It is clearly $\ge 0$, and integrating over the (limited) range gives
$\int_c^\infty f(u \mid u > c)\, du = \int_c^\infty \frac{f(u)}{1 - F(c)}\, du = \frac{1}{1 - F(c)} \int_c^\infty f(u)\, du = \frac{1}{1 - F(c)}\,(1 - F(c)) = 1$.
Consider a truncated sample where $y = y^*$ if $y^* > c$, but $y$ is not observed otherwise.
The model to be estimated is $y_i = x_i'\beta + u_i$.
Estimating this model on the truncated sample implies that the distribution of $u_i$ is truncated, see eqn (1).

12 Truncated Regression Model
Assume the unconditional distribution of $u$ is normal and homoscedastic: $u_i \sim N(0, \sigma^2)$.
If $y_i$ is truncated from below at 0, then we only observe $u_i$ over the limited range for which $u_i > -x_i'\beta$.
Then the conditional density of $u_i$ is, compare eqn (2) with $c = -x_i'\beta$,
$f(u_i \mid u_i > c) = \frac{\frac{1}{\sigma}\phi(u_i/\sigma)}{1 - \Phi(-z_i)} = \frac{\frac{1}{\sigma}\phi(u_i/\sigma)}{\Phi(z_i)}$,
where $z_i = x_i'\beta/\sigma$.
To estimate the truncated model, use Maximum Likelihood techniques.

13 Truncated Regression Model
For each observation in the truncated sample, the likelihood contribution is precisely the conditional density
$L_i(\beta, \sigma) = \frac{\frac{1}{\sigma}\phi\left(\frac{y_i - x_i'\beta}{\sigma}\right)}{\Phi(x_i'\beta/\sigma)}$.
The sample likelihood is therefore
$L(\beta, \sigma) = \prod_{i=1}^{N} L_i(\beta, \sigma)$.
Maximization gives the ML estimates of $\beta$ and $\sigma$.
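
The sample log-likelihood for truncation from below at zero can be sketched as follows. The function name and the list-of-lists data layout are assumptions made for this example. In practice one would hand the negative of this function to a numerical optimizer, which is not shown here.

```python
from math import log
from statistics import NormalDist

STD = NormalDist()  # standard normal: pdf is phi, cdf is Phi

def truncreg_loglik(beta, sigma, X, y):
    """Sum over i of log[(1/sigma) phi((y_i - x_i'b)/sigma)] - log Phi(x_i'b/sigma)."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        xb = sum(b * v for b, v in zip(beta, x_i))
        total += log(STD.pdf((y_i - xb) / sigma) / sigma)  # normal density part
        total -= log(STD.cdf(xb / sigma))                  # truncation correction
    return total

X = [[1.0, 0.2], [1.0, 1.5], [1.0, -0.3]]  # constant plus one regressor
y = [0.8, 2.1, 0.4]                        # all positive: truncated sample
print(truncreg_loglik([0.5, 1.0], 1.0, X, y))
```

Because $\Phi(\cdot) < 1$, the correction term raises every contribution relative to the untruncated normal log-density.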

14 Truncated Regression Model
The marginal effect of a change in $x_i$ on the conditional expectation $E(y_i \mid x_i, y_i > 0)$ is not simply $\beta$. Comparing with eqn (1), it is rather
$\frac{\partial E(y_i \mid x_i, y_i > 0)}{\partial x_i} = \beta + \alpha\,\frac{\partial \psi(z_i)}{\partial x_i}$.
Nevertheless, $\beta$ also has a clear interpretation; actually
$\frac{\partial E(y_i^* \mid x_i)}{\partial x_i} = \beta$.
Discuss!
We need to know $\psi$ (the inverse Mills ratio, commonly denoted by $\lambda$).

15 The Tobit Model (Tobit I)
Censored regression model or Tobit model (attributable to Tobin, 1958; the name combines "Tobin" and "probit").
The difference here is that data is available on the entire sample, but the dependent variable is censored at some value (say, as in C1).
As before, the censoring (or truncation) is determined by the equation / model of interest.
Consider again the latent relationship with a censored variable $y_i$:
$y_i^* = x_i'\beta + u_i, \quad u_i \sim N(0, \sigma^2)$,
$y_i = y_i^*\,\mathbf{1}(y_i^* > 0)$.
We will derive the MLE making use of what we have learned above (inverse Mills ratio).

16 The Tobit I Model
Given normality for $u_i$, the probability of observing a censored observation is
$P(y_i = 0 \mid x_i) = P(y_i^* \le 0 \mid x_i) = P(u_i \le -x_i'\beta) = \Phi(-z_i) = 1 - \Phi(z_i)$,
where $z_i = x_i'\beta/\sigma$.
For uncensored observations, we write the density of $y_i$ in the normal way as
$f(y_i) = \frac{1}{\sigma}\,\phi\left(\frac{y_i - x_i'\beta}{\sigma}\right)$.
Then the sample likelihood function can be compiled as
$L(\beta, \sigma) = \prod_{y_i = 0} \left[1 - \Phi\left(\frac{x_i'\beta}{\sigma}\right)\right] \prod_{y_i > 0} \frac{1}{\sigma}\,\phi\left(\frac{y_i - x_i'\beta}{\sigma}\right)$.
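
The Tobit I likelihood splits the sample into censored and uncensored observations. A minimal sketch of the log-likelihood, with an illustrative function name and data layout:

```python
from math import log
from statistics import NormalDist

STD = NormalDist()  # standard normal: pdf is phi, cdf is Phi

def tobit_loglik(beta, sigma, X, y):
    """Tobit I log-likelihood with censoring from below at zero."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        xb = sum(b * v for b, v in zip(beta, x_i))
        if y_i > 0:
            # uncensored: log of (1/sigma) * phi((y_i - x_i'beta) / sigma)
            total += log(STD.pdf((y_i - xb) / sigma) / sigma)
        else:
            # censored: log of 1 - Phi(x_i'beta / sigma) = log Phi(-x_i'beta / sigma)
            total += log(STD.cdf(-xb / sigma))
    return total

X = [[1.0, -1.0], [1.0, 0.5], [1.0, 2.0]]
y = [0.0, 0.7, 2.4]  # the first observation is censored at zero
print(tobit_loglik([0.2, 1.0], 1.0, X, y))
```

Unlike the truncated regression, the censored observations stay in the sample and contribute a probability rather than a density.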

17 Interpreting the Tobit I
Given a latent relationship of the form
$y_i^* = x_i'\beta + u_i, \quad u_i \sim N(0, \sigma^2)$,
with an observability criterion
$y_i = y_i^*\,\mathbf{1}(y_i^* > 0)$,
we have that
$P(y_i = 0 \mid x_i) = 1 - \Phi(x_i'\beta/\sigma)$,
$P(y_i > 0 \mid x_i) = \Phi(x_i'\beta/\sigma)$.
The expected value of the observed dependent variable $y_i$ (censored at zero) is
$E(y_i \mid x_i) = P(y_i > 0 \mid x_i)\, E(y_i \mid x_i, y_i > 0) = \Phi(x_i'\beta/\sigma)\,[x_i'\beta + \sigma\,\lambda(x_i'\beta/\sigma)]$,
where $\lambda(z) = \phi(z)/\Phi(z)$.
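
The three quantities on this slide can be computed together. The sketch below assumes a given index value `xb` for $x_i'\beta$; the function name is illustrative.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal: pdf is phi, cdf is Phi

def tobit_expectations(xb: float, sigma: float):
    """Return P(y > 0 | x), E(y | x, y > 0) and E(y | x) for a Tobit I model."""
    z = xb / sigma
    p_positive = STD.cdf(z)
    lam = STD.pdf(z) / p_positive        # inverse Mills ratio lambda(z)
    e_conditional = xb + sigma * lam     # mean among uncensored observations
    e_observed = p_positive * e_conditional
    return p_positive, e_conditional, e_observed

print(tobit_expectations(0.0, 1.0))
print(tobit_expectations(1.5, 2.0))
```

At $x_i'\beta = 0$ and $\sigma = 1$ the observed mean equals $\phi(0)$, which is the mean of $\max(0, u)$ for a standard normal $u$.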

18 Interpreting the Tobit: Marginal Effects
On the latent variable:
$\frac{\partial E(y_i^*)}{\partial x_i} = \beta$.
On the observed dependent variable, censored at zero, for the whole sample:
$\frac{\partial E(y_i \mid x_i)}{\partial x_i} = \beta\,\Phi(x_i'\beta/\sigma)$.
On the non-censored observed dependent variable:
$\frac{\partial E(y_i \mid x_i, y_i > 0)}{\partial x_i} = \beta + \sigma\,\frac{\partial \lambda}{\partial x_i}$.
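
The three marginal effects can be sketched for a single coefficient $\beta_k$. The conditional effect below uses the standard identity $\lambda'(z) = -\lambda(z)\,(z + \lambda(z))$, which gives $\beta_k\,[1 - \lambda(z)(z + \lambda(z))]$; this explicit form is an addition here and is not written out on the slide. Function names are illustrative.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal: pdf is phi, cdf is Phi

def inverse_mills(z: float) -> float:
    return STD.pdf(z) / STD.cdf(z)

def conditional_mean(xb: float, sigma: float) -> float:
    """E(y | x, y > 0) as a function of the index x'beta."""
    return xb + sigma * inverse_mills(xb / sigma)

def unconditional_mean(xb: float, sigma: float) -> float:
    """E(y | x) for the censored variable."""
    return STD.cdf(xb / sigma) * conditional_mean(xb, sigma)

def tobit_marginal_effects(beta_k: float, xb: float, sigma: float) -> dict:
    z = xb / sigma
    lam = inverse_mills(z)
    return {
        "latent": beta_k,                                  # effect on E(y*)
        "unconditional": beta_k * STD.cdf(z),              # effect on E(y | x)
        "conditional": beta_k * (1.0 - lam * (z + lam)),   # effect on E(y | x, y > 0)
    }

print(tobit_marginal_effects(beta_k=1.5, xb=1.05, sigma=2.0))
```

Both observed-variable effects are smaller in absolute value than $\beta_k$, since $\Phi(z)$ and $1 - \lambda(z)(z + \lambda(z))$ lie between 0 and 1.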

19 Limitations of the Tobit
There may be problems with the Tobit model:
- Often, censored data also come with missing explanatory data.
- What if the latent model is non-linear?
(The literature also discusses problems that arise from characterizing censored observations as corner solutions.)

20 Testing and Miscellaneous
- Specification tests can be derived easily because maximum likelihood methods are used: think of LM tests, likelihood ratio tests, etc.
- Extensions to models and estimators without the normality assumption are available though not trivial. For example, with logit, the inverse Mills ratio looks more complicated. Moreover, semiparametric methods that avoid distributional assumptions can get complicated. Such estimators can be constructed e.g. from the Method of Moments, see future chapters.
- If, as will happen on the next slides, the selection rule is based on a different model, then a two-step estimator is conceivable, too.

21 Tobit II, III, etc.: Alternative Selection Rules
Recall that so far we have assumed that censoring and truncation are driven by the model of interest. What happens if there is a simultaneous or sequential decision process behind them?
Consider a structural latent relationship
$y_i^* = x_i'\beta + u_i$
and a latent, possibly censored, observability relationship for $y_i$:
$I_i^* = z_i'\gamma + v_i$.
The disturbance terms $u_i$ and $v_i$ are (jointly) normal and homoscedastic with covariance $\mathrm{cov}(u_i, v_i) = \sigma_{uv}$.
Without loss of generality, set $\sigma_v = 1$ for identification (discuss).

22 Tobit II, III, etc.
Then we get the following observability criteria:
Truncated Regression: $y_i = y_i^*$ if $y_i^* > 0$ (not observed otherwise)
Tobit I Model: $y_i = y_i^*\,\mathbf{1}(y_i^* > 0)$
Selectivity / Tobit II Model: $y_i = y_i^*\,\mathbf{1}(I_i^* > 0)$
Selectivity / Tobit III Model: $y_i = y_i^*\,\mathbf{1}(I_i^* > 0)$ with $I_i$ censored
Double Hurdle Model: $y_i = y_i^*\,\mathbf{1}(I_i^* > 0 \text{ and } y_i^* > 0)$
(Further assumption: $y_i \ge 0$)

23 Selectivity Tobit II Model
For the Selectivity Tobit II model,
$y_i^* = x_i'\beta + u_i, \quad I_i^* = z_i'\gamma + v_i$.
Observability criterion:
$y_i = y_i^*\,\mathbf{1}(I_i^* > 0)$.
General likelihood for the Selectivity model:
$L(\beta, \gamma, \Sigma) = \prod_{y_i = 0} P(I_i = 0 \mid z_i) \prod_{y_i > 0} f(y_i \mid x_i, z_i, I_i = 1)$, (3)
with
$P(I_i = 0 \mid z_i) = P(I_i^* < 0 \mid z_i) = P(v_i < -z_i'\gamma) = \Phi(-z_i'\gamma) = 1 - \Phi(z_i'\gamma)$.

24 Selectivity Model
For the uncensored observations, we need the likelihood contribution written as $f(y_i \mid x_i, z_i, I_i = 1)$ in eqn (3), understood as the joint density of $y_i$ and the event $I_i = 1$. It decomposes into a product,
$f(y_i \mid x_i, z_i, I_i = 1) = P(I_i^* > 0 \mid y_i)\, f(y_i)$.
For bivariate normal disturbances $u$ and $v$ (with $\sigma_v = 1$),
$P(I_i^* > 0 \mid y_i) = P(v_i > -z_i'\gamma \mid u_i) = \Phi\left( \frac{z_i'\gamma + \frac{\sigma_{uv}}{\sigma_u^2}\,(y_i - x_i'\beta)}{\sqrt{1 - \left(\frac{\sigma_{uv}}{\sigma_u}\right)^2}} \right)$.
Note that if $\sigma_{uv} = 0$,
$P(I_i^* > 0 \mid y_i) = \Phi(z_i'\gamma)$.

25 Two-Step Estimation of the Selectivity Model
Full ML estimation of the Selectivity model is rather complex.
One can get consistent (but not efficient) estimates for the Selectivity model using probit and OLS.
The two-step estimation procedure derives from the fact that
$E(y_i \mid y_i > 0) = x_i'\beta + E(u_i \mid y_i > 0)$.
Note that the conditional expectation for the Selectivity model is
$E(y_i \mid y_i > 0) = E(y_i \mid I_i^* > 0)$
$\qquad = x_i'\beta + E(u_i \mid I_i^* > 0)$
$\qquad = x_i'\beta + E(u_i \mid v_i > -z_i'\gamma)$
$\qquad = x_i'\beta + \sigma_{uv}\,\lambda(z_i'\gamma)$,
where the last step uses the normalization $\sigma_v = 1$, so that $\sigma_{uv} = \rho\,\sigma_u$.
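
Under the normalization $\sigma_v = 1$, the selection term $E(u_i \mid v_i > -z_i'\gamma)$ equals $\sigma_{uv}\,\lambda(z_i'\gamma)$. The following Monte Carlo sketch checks this for one fixed index value. All parameter values, the seed, and the function names are arbitrary illustrative choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

STD = NormalDist()  # standard normal: pdf is phi, cdf is Phi

def selection_term(sigma_uv: float, index: float) -> float:
    """Theoretical E(u | v > -index) = sigma_uv * lambda(index), with sigma_v = 1."""
    return sigma_uv * STD.pdf(index) / STD.cdf(index)

def simulated_selection_term(sigma_u, sigma_uv, index, n=200_000, seed=0):
    """Average of u over draws with v > -index, for bivariate normal (u, v)."""
    rng = random.Random(seed)
    resid_sd = sqrt(sigma_u ** 2 - sigma_uv ** 2)  # sd of u after projecting on v
    kept = []
    for _ in range(n):
        v = rng.gauss(0.0, 1.0)
        u = sigma_uv * v + resid_sd * rng.gauss(0.0, 1.0)
        if v > -index:
            kept.append(u)
    return fmean(kept)

theory = selection_term(sigma_uv=1.2, index=0.3)
simulated = simulated_selection_term(sigma_u=2.0, sigma_uv=1.2, index=0.3)
print(theory, simulated)
```

With $\sigma_{uv} = 0$ the term vanishes and OLS on the selected sample is no longer biased by selection.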

26 Two-Step Estimation of the Selectivity Model
Heckman (1979) proposes a two-stage procedure:
1. Use a Probit on the entire sample to estimate $\hat\gamma$.
2. Construct $\lambda(z_i'\hat\gamma)$ for the non-censored observations in the sample.
3. Regress $y_i$ on $x_i$ and $\lambda(z_i'\hat\gamma)$ by OLS for the subsample of non-censored observations.
We must correct the standard errors of the second-stage OLS for the inclusion of what is termed a generated regressor (i.e. $\lambda(z_i'\hat\gamma)$ in the structural regression). Even having done so, the estimates are inefficient (i.e. not minimum variance).
In the two-step procedure (indeed, for the full-information ML procedure also), one needs to worry about identification of the hazard rate.
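
Steps 2 and 3 of the procedure can be sketched on simulated data. To stay short, this example assumes the selection coefficients are known and uses the true values in place of the probit estimates from step 1, so it is not the full estimator. The data-generating values, the seed, and the small `ols` helper are illustrative choices made here. Standard errors are not computed.

```python
import random
from math import sqrt
from statistics import NormalDist

STD = NormalDist()
BETA = (1.0, 2.0)        # outcome equation: y* = 1 + 2 x + u, sigma_u = 1
GAMMA = (0.0, 0.5, 1.0)  # selection index: 0 + 0.5 x + 1.0 w (w is excluded from y*)
SIGMA_UV = 0.8           # cov(u, v) with sigma_v = 1

def ols(rows, y):
    """OLS coefficients from the normal equations, solved by Gauss-Jordan elimination."""
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [p - f * q for p, q in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

def simulate(n=20_000, seed=1):
    """Return (x, selection index, y) for the selected observations only."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x, w, v = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        u = SIGMA_UV * v + sqrt(1 - SIGMA_UV ** 2) * rng.gauss(0, 1)
        index = GAMMA[0] + GAMMA[1] * x + GAMMA[2] * w
        if index + v > 0:  # y is observed only when I* > 0
            data.append((x, index, BETA[0] + BETA[1] * x + u))
    return data

sample = simulate()
y = [t[2] for t in sample]
naive = ols([[1.0, x] for x, _, _ in sample], y)
# Step 2: lambda at the (here: assumed known) index; step 3: OLS including lambda
corrected = ols([[1.0, x, STD.pdf(i) / STD.cdf(i)] for x, i, _ in sample], y)
print("naive OLS:", naive)
print("with inverse Mills ratio:", corrected)
```

In this setup the naive slope is pulled below its true value of 2, while the regression that includes $\lambda$ recovers it approximately; the coefficient on $\lambda$ estimates $\sigma_{uv}$.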

27 Selectivity Tobit III Model
Imagine the selection equation is itself a Tobit I model; it might even be nested with the main equation. For example, the selection variable is labor supply measured in hours.
Typically, this is done only if the selection model is also of interest; otherwise a reduction to a binary response would be fine.
The three-step estimation procedure would then work as follows:
1. Estimate the selection model via Tobit I on the entire sample to get $\hat\gamma$.
2. As above.
3. As above.
For nested models, some more steps are needed.

28 The Double-Hurdle Model
An alternative to the selectivity model, proposed by Cragg (1971). So called because two hurdles must be overcome before a non-censored observation is observed.
Examples:
1. Labour supply:
   1. Do you want to work?
   2. Given that you choose to look for work, can you find a job?
2. Credit constraints:
   1. Do you want to buy the good?
   2. Given that you want to buy the good, can you obtain credit?

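29 The Double-Hurdle Model
For the Double Hurdle model,
$y_i = y_i^*\,\mathbf{1}(I_i^* > 0 \text{ and } y_i^* > 0)$,
where (as before)
$y_i^* = x_i'\beta + u_i, \quad I_i^* = z_i'\gamma + v_i$.
The probability of observing a non-censored observation is
$P(y_i > 0 \mid x_i, z_i) = P(I_i^* > 0 \text{ and } y_i^* > 0)$
$\qquad = P(v_i > -z_i'\gamma \text{ and } u_i > -x_i'\beta)$
$\qquad = \Phi_2\left(z_i'\gamma, \frac{x_i'\beta}{\sigma_u}; \rho\right)$,
where $\Phi_2(\cdot, \cdot; \rho)$ denotes the standard bivariate normal distribution function with correlation $\rho$.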

30 The Double-Hurdle Model
Clearly, for censored observations we write
$P(y_i = 0 \mid x_i, z_i) = 1 - \Phi_2\left(z_i'\gamma, \frac{x_i'\beta}{\sigma_u}; \rho\right)$.
Further, for non-censored observations, the likelihood contribution is identical to the Selectivity Tobit II model.
The general likelihood looks like eqn (3):
$L(\beta, \gamma, \Sigma) = \prod_{y_i = 0} P(y_i = 0 \mid x_i, z_i) \prod_{y_i > 0} f(y_i \mid x_i, z_i, I_i^* > 0, y_i^* > 0)$
$\qquad = \prod_{y_i = 0} P(y_i = 0 \mid x_i, z_i) \prod_{y_i > 0} f(y_i \mid x_i, z_i, y_i > 0)$
$\qquad = \prod_{y_i = 0} P(y_i = 0 \mid x_i, z_i) \prod_{y_i > 0} P(I_i^* > 0 \mid y_i)\, f(y_i)$.

31 The Double-Hurdle Model
A simplification is to assume no correlation, so that
$L(\beta, \gamma, \sigma_u) = \prod_{y_i = 0} \left[1 - \Phi(z_i'\gamma)\,\Phi\left(\frac{x_i'\beta}{\sigma_u}\right)\right] \prod_{y_i > 0} \Phi(z_i'\gamma)\,\frac{1}{\sigma_u}\,\phi\left(\frac{y_i - x_i'\beta}{\sigma_u}\right)$.
Discuss: scaling and identification problems (e.g. it still works with $\sigma_v = 1$).
Discuss: heteroscedasticity?!
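
For the uncorrelated case, the log of this likelihood only needs the univariate normal density and distribution function. A minimal sketch, with an illustrative function name and data layout:

```python
from math import log
from statistics import NormalDist

STD = NormalDist()  # standard normal: pdf is phi, cdf is Phi

def double_hurdle_loglik(beta, gamma, sigma_u, X, Z, y):
    """Double-hurdle log-likelihood assuming cov(u, v) = 0 and sigma_v = 1."""
    total = 0.0
    for x_i, z_i, y_i in zip(X, Z, y):
        xb = sum(b * v for b, v in zip(beta, x_i))
        zg = sum(g * v for g, v in zip(gamma, z_i))
        if y_i > 0:
            # first hurdle passed, times the normal density of y_i
            total += log(STD.cdf(zg)) + log(STD.pdf((y_i - xb) / sigma_u) / sigma_u)
        else:
            # at least one of the two hurdles was not passed
            total += log(1.0 - STD.cdf(zg) * STD.cdf(xb / sigma_u))
    return total

X = [[1.0, 0.3], [1.0, 1.2], [1.0, -0.5]]
Z = [[1.0, 0.0], [1.0, 1.0], [1.0, -1.0]]
y = [0.0, 1.9, 0.0]
print(double_hurdle_loglik([0.5, 1.0], [0.2, 0.8], 1.0, X, Z, y))
```

With correlated disturbances the censored part would require the bivariate normal distribution function $\Phi_2$, which is not covered by this sketch.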