Econometrics. Week 10. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague


Reynard Crawford


Recommended Reading
For today: Limited Dependent Variable Models, Chapter 17 (pp ).
For next week: Advanced Time Series Topics, selected topics from Chapter 18.

Today's Talk
Limited dependent variables (LDV) are dependent variables whose range of values is substantially restricted, for example a binary dependent variable (values 0 or 1). These kinds of variables need special treatment. In econometrics, several types of models are commonly used for them. Today we discuss the logit, probit, and Tobit models, and the censored and truncated regression models.
We focus on cross-sectional applications, although these models can also be used with panel data or time series data. The main disadvantage of these models is that they are difficult to interpret.

Binary Dependent Variables
Recall the linear probability model (LPM):
$$P(y = 1 \mid x) = \beta_0 + x\beta$$
Its main problems are that fitted probabilities can be less than zero or greater than one, and that the partial effect of any explanatory variable is constant.
These limitations can be overcome by modeling the probability as a general function of the index, instead of assuming that the probability is linear in the parameters:
Binary Response Models
$$P(y = 1 \mid x) = G(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k) = G(\beta_0 + x\beta),$$
where $x$ is the full set of explanatory variables and $G$ is a function taking values strictly between zero and one: $0 < G(z) < 1$ for all real $z$.

The Logit Model
While $G$ can be any general nonlinear function, most applications use one of two cases. The first common choice is the logistic function, which is the cdf of a standard logistic random variable:
The Logit Model
$$G(z) = \frac{\exp(z)}{1 + \exp(z)} = \Lambda(z)$$
It takes values strictly between zero and one for all real $z$. This model is sometimes referred to as logistic regression.

The Probit Model
Another common choice of $G$ is the standard normal cumulative distribution function (cdf):
The Probit Model
$$G(z) = \Phi(z) = \int_{-\infty}^{z} \phi(v)\,dv,$$
where $\phi(z)$ is the standard normal density:
$$\phi(z) = (2\pi)^{-1/2} \exp(-z^2/2)$$
Both $G$ functions have very similar shapes: they are increasing in $z$, most quickly around $z = 0$.
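To make the comparison of the two link functions concrete, the following is a minimal sketch using only the Python standard library. The function names (`lpm_probability`, `logistic_cdf`, `normal_cdf`) and the grid of index values are illustrative choices, not part of the lecture.

```python
import math

def lpm_probability(z):
    """Linear probability model: the index z = b0 + x*b is used directly."""
    return z

def logistic_cdf(z):
    """Logit choice of G: exp(z) / (1 + exp(z)), written in an equivalent form."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """Probit choice of G: the standard normal cdf, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Illustrative index values; the LPM "probability" leaves [0, 1], G(z) does not.
for z in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"z={z:+.1f}  LPM={lpm_probability(z):+.3f}  "
          f"logit={logistic_cdf(z):.3f}  probit={normal_cdf(z):.3f}")
```

Both nonlinear choices stay strictly inside (0, 1) on this grid, while the linear index does not.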

The Logit and Probit Models
Figure: the logistic function (red line) and the standard normal cdf (black line), plotted against z.

The Logit and Probit Models cont.
There is no strong reason to prefer one model over the other. Traditionally, logit was used more often because the logistic function makes the model easier to compute. In economics, the assumption of a standard normal distribution is often considered more realistic, so economists tend to prefer probit. In neither model do the magnitudes of the coefficients $\beta_j$ have a direct, useful interpretation.

The Logit and Probit Models Estimation
Because the models are nonlinear, we have to use Maximum Likelihood Estimation (MLE). Assume we have a random sample of size $n$. To obtain the MLE, we first need the density of $y_i$ given $x_i$:
$$f(y \mid x_i; \beta) = [G(x_i\beta)]^{y}\,[1 - G(x_i\beta)]^{1-y}, \quad y = 0, 1.$$
MLE of β
To obtain the MLE of $\beta$, we maximize the following log-likelihood:
$$L(\beta) = \sum_{i=1}^{n} \Big\{ y_i \log[G(x_i\beta)] + (1 - y_i) \log[1 - G(x_i\beta)] \Big\}.$$
If $G(\cdot)$ is the standard logistic cdf, $\hat\beta$ is the logit estimator; if $G(\cdot)$ is the standard normal cdf, $\hat\beta$ is the probit estimator.
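The maximization of the log-likelihood above can be sketched for the logit case with one regressor and an intercept. This is only an illustration: the data are made up, and Newton-Raphson is one common way to maximize the function, not necessarily what any particular software package does.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1, xs, ys):
    """L(b) = sum of y*log G(x b) + (1 - y)*log(1 - G(x b)) with logistic G."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = logistic_cdf(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_logit(xs, ys, iterations=25):
    """Newton-Raphson for a logit with an intercept and one regressor."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # score (gradient of the log-likelihood)
        h00 = h01 = h11 = 0.0    # negative Hessian
        for x, y in zip(xs, ys):
            p = logistic_cdf(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: (negative Hessian)^(-1) times the score, 2x2 inverse by hand.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data, chosen so that y = 0 and y = 1 overlap in x (no perfect separation).
x_data = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y_data = [0, 0, 1, 0, 1, 0, 1, 1]

b0_hat, b1_hat = fit_logit(x_data, y_data)
print("logit estimates:", b0_hat, b1_hat)
print("log-likelihood at the estimates:", log_likelihood(b0_hat, b1_hat, x_data, y_data))
```

Replacing `logistic_cdf` with the standard normal cdf (and adjusting the score and Hessian accordingly) would give the probit estimator instead.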

The Logit and Probit Models Estimation cont.
The general theory of conditional MLE for random samples implies that, under very general conditions, the MLE is consistent, asymptotically normal, and asymptotically efficient. We can therefore easily derive asymptotic standard errors for the estimates and use them to test $H_0: \beta_j = 0$.
We can also test multiple restrictions in the logit and probit models. The easiest way is the Likelihood Ratio (LR) test:
$$LR = 2(L_u - L_r) \overset{a}{\sim} \chi^2_q,$$
where $L_u$ is the log-likelihood of the unrestricted model, $L_r$ is the log-likelihood of the restricted model, and $q$ is the number of restrictions.
In this way we can test the importance of variables: if we drop a variable from the model and the log-likelihood decreases significantly, the variable is significant for the model.
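As a small worked illustration of the LR test, the sketch below computes the statistic and its asymptotic p-value for a single restriction (q = 1). The two log-likelihood values are hypothetical, and the p-value formula used here is specific to one degree of freedom, where the chi-square tail can be written with the complementary error function.

```python
import math

def lr_statistic(loglik_unrestricted, loglik_restricted):
    """LR = 2 * (L_u - L_r)."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)

def chi2_sf_one_df(stat):
    """P(chi2 with 1 df > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2))."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical log-likelihoods of a model with and without one variable.
loglik_u = -100.0
loglik_r = -103.0

lr = lr_statistic(loglik_u, loglik_r)
p_value = chi2_sf_one_df(lr)
print("LR statistic:", lr)
print("p-value (q = 1):", p_value)
```

For more than one restriction, a general chi-square distribution function from a statistics library would be needed.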

Interpretation of the Logit and Probit Estimates
The most difficult aspect of these models is presenting and interpreting the results. In general, we care about the effect of $x$ on $P(y = 1 \mid x)$. The coefficients $\beta_j$ give us the sign of the partial effect of each $x_j$ on the response probability $p(x) = P(y = 1 \mid x)$:
$$\frac{\partial p(x)}{\partial x_j} = g(\beta_0 + x\beta)\,\beta_j, \quad \text{where } g(z) = \frac{dG}{dz}(z) > 0.$$
We can compare the significance ($H_0: \beta_j = 0$) and the signs of coefficients. Comparison of magnitudes is more complicated, as we need to calculate the derivatives, for example at the means of the explanatory variables. STATA will do this in the probit case.
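The partial effect formula can be evaluated at the sample means of the regressors. The sketch below does this for both models; the coefficient values and means are hypothetical, and the function name `partial_effect_at_mean` is an illustrative choice.

```python
import math

def partial_effect_at_mean(beta0, betas, x_means, j, model="logit"):
    """Return g(beta0 + xbar*beta) * beta_j, where g is the derivative of G."""
    index = beta0 + sum(b * m for b, m in zip(betas, x_means))
    if model == "logit":
        p = 1.0 / (1.0 + math.exp(-index))
        density = p * (1.0 - p)                      # derivative of the logistic cdf
    elif model == "probit":
        density = math.exp(-index ** 2 / 2.0) / math.sqrt(2.0 * math.pi)
    else:
        raise ValueError("model must be 'logit' or 'probit'")
    return density * betas[j]

# Hypothetical estimates: intercept -1.0, one slope 0.5, regressor mean 2.0,
# so the index at the mean is exactly zero.
pe_logit = partial_effect_at_mean(-1.0, [0.5], [2.0], j=0, model="logit")
pe_probit = partial_effect_at_mean(-1.0, [0.5], [2.0], j=0, model="probit")
print("logit partial effect at the mean:", pe_logit)
print("probit partial effect at the mean:", pe_probit)
```

With the same coefficient, the two models give different partial effects because the scale factor g differs, which is one reason raw logit and probit coefficients are not directly comparable.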

Interpretation of the Logit and Probit Estimates cont.
To measure goodness of fit, we cannot simply use $R^2$. One possibility is a pseudo $R^2$ based on the log-likelihood: $1 - L_u/L_r$, where $L_u$ is the log-likelihood of the estimated model and $L_r$ that of the restricted model with only an intercept.
We can also look at the percent correctly predicted: if the predicted probability is greater than 0.5 we predict $y_i = 1$, otherwise we predict $y_i = 0$. The percent correctly predicted is the percentage of times the predicted $y_i$ matches the actual $y_i$ (which is zero or one).
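Both goodness-of-fit measures are simple to compute once the log-likelihoods and fitted probabilities are available. The following sketch uses made-up numbers; the function names are illustrative.

```python
def pseudo_r_squared(loglik_model, loglik_intercept_only):
    """Pseudo R-squared: 1 - L_u / L_r, with L_r from the intercept-only model."""
    return 1.0 - loglik_model / loglik_intercept_only

def percent_correctly_predicted(fitted_probs, ys, threshold=0.5):
    """Share of observations where the prediction (prob > threshold) equals y."""
    hits = sum(1 for p, y in zip(fitted_probs, ys) if (1 if p > threshold else 0) == y)
    return 100.0 * hits / len(ys)

# Hypothetical fitted probabilities and observed outcomes.
fitted = [0.2, 0.7, 0.6, 0.4, 0.9]
observed = [0, 1, 0, 0, 1]

r2 = pseudo_r_squared(-80.0, -100.0)
pcp = percent_correctly_predicted(fitted, observed)
print("pseudo R-squared:", r2)
print("percent correctly predicted:", pcp)
```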

The Tobit Model
Another kind of limited dependent variable is one that is roughly continuous over strictly positive values but equals zero for a nontrivial part of the population.
The Tobit Model
$$y^* = \beta_0 + x\beta + u, \quad u \mid x \sim N(0, \sigma^2)$$
$$y = \max(0, y^*)$$
The latent variable $y^*$ satisfies the classical linear model assumptions (normal, homoskedastic distribution with a linear conditional mean).

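The Tobit Model cont.
To estimate $\beta$ and $\sigma$, we use MLE again.
MLE of Tobit model
To estimate the parameters of the Tobit model, we maximize the log-likelihood obtained by summing the following function across all $i$:
$$\ell_i(\beta, \sigma) = 1(y_i = 0)\,\log[1 - \Phi(x_i\beta/\sigma)] + 1(y_i > 0)\,\log\big\{(1/\sigma)\,\phi[(y_i - x_i\beta)/\sigma]\big\},$$
where $\phi$ is the standard normal density function and $\Phi$ is the standard normal cdf. Notice that the log-likelihood function has two parts: one for the observations with $y_i = 0$ and one for the observations with $y_i > 0$.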
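The two-part structure of the Tobit log-likelihood can be seen directly in code. This is a minimal sketch that only evaluates the function for given parameter values; it does not perform the maximization, and the data and parameter values are hypothetical.

```python
import math

def normal_pdf(z):
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik_i(y, xb, sigma):
    """Contribution of one observation, given its index xb = x_i * beta."""
    if y == 0:
        # Observation at the corner: probability that the latent variable is <= 0.
        return math.log(1.0 - normal_cdf(xb / sigma))
    # Positive observation: normal density of y around xb with scale sigma.
    return math.log(normal_pdf((y - xb) / sigma) / sigma)

def tobit_loglik(ys, xbs, sigma):
    """Sum of the contributions across all observations."""
    return sum(tobit_loglik_i(y, xb, sigma) for y, xb in zip(ys, xbs))

# Hypothetical outcomes and index values x_i * beta.
ys = [0.0, 1.0, 2.5, 0.0]
xbs = [0.0, 1.0, 2.0, -1.0]
print("Tobit log-likelihood:", tobit_loglik(ys, xbs, sigma=1.0))
```

An optimizer would search over beta and sigma to maximize `tobit_loglik`; that step is left out here.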

Interpreting the Tobit Model
Again, we obtain the parameter estimates from a software package, but they are not as straightforward to interpret as in the OLS case. The $\beta$ estimates measure the effects of $x$ on the latent variable $y^*$, but we are interested in explaining $y$. We can estimate two expectations:
$$E(y \mid x) = P(y > 0 \mid x)\,E(y \mid y > 0, x) = \Phi(x\beta/\sigma)\,E(y \mid y > 0, x)$$
$$E(y \mid y > 0, x) = x\beta + \sigma\lambda(x\beta/\sigma),$$
where $\lambda(c) = \phi(c)/\Phi(c)$ is the so-called inverse Mills ratio. The expected value of $y$ conditional on $y > 0$ is equal to $x\beta$ plus a strictly positive term.
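The two expectations can be computed directly from the formulas above. The sketch below is illustrative; the index value and sigma are hypothetical, and `xb` stands for the full index including the intercept.

```python
import math

def normal_pdf(z):
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills(c):
    """lambda(c) = phi(c) / Phi(c)."""
    return normal_pdf(c) / normal_cdf(c)

def expected_y_given_positive(xb, sigma):
    """E(y | y > 0, x) = xb + sigma * lambda(xb / sigma)."""
    return xb + sigma * inverse_mills(xb / sigma)

def expected_y(xb, sigma):
    """E(y | x) = Phi(xb / sigma) * E(y | y > 0, x)."""
    return normal_cdf(xb / sigma) * expected_y_given_positive(xb, sigma)

# Hypothetical index value and scale.
xb, sigma = 1.0, 2.0
print("E(y | y > 0, x):", expected_y_given_positive(xb, sigma))
print("E(y | x):       ", expected_y(xb, sigma))
```

Both expectations exceed the latent mean `xb` whenever a meaningful share of the latent distribution lies below zero, which is why the Tobit coefficients alone do not describe the effect on the observed y.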

Interpreting the Tobit Model cont.
The Tobit model relies heavily on normality and homoskedasticity of the underlying latent variable. In OLS, we can deal with heteroskedasticity, for example by computing heteroskedasticity-robust standard errors. In Tobit, however, under heteroskedasticity it is unclear what the MLE is actually estimating.

Censored and Truncated Regression Models
The censored regression model has a structure similar to the Tobit model. The two are sometimes used interchangeably, but they are different. Unlike the Tobit, the censored regression model arises from data censoring: we assume the underlying dependent variable is normally distributed, but it is censored below or above a certain value because of the way we collect the data.
A truncated regression model arises when we exclude a subset of the population in the sampling scheme. In other words, we do not have random sampling but follow some rule when sampling the data.

Censored Regression Models
In general, the censored regression model can be defined without distributional assumptions. We will study it under the assumption of a normal distribution.
The Censored Normal Regression Model
$$y_i = \beta_0 + x_i\beta + u_i, \quad u_i \mid x_i, c_i \sim N(0, \sigma^2)$$
$$w_i = \min(y_i, c_i),$$
with censoring value $c_i$. In right-censored data we only observe $w_i = \min(y_i, c_i)$, while in left-censored data we observe $w_i = \max(y_i, c_i)$.

Censored Regression Models cont.
Some types of surveys are a good example of censored data. If we ask respondents about their wealth, but they are allowed to respond with "more than CZK", then we observe the actual wealth of people below this threshold, but not above it. The censoring threshold $c_i$ in this case is constant for all $i$. In many situations, the censoring threshold changes across individuals $i$. We are able to estimate the censored regression model using MLE.

Truncated Regression Models
The truncated model is similar to the censored model, but in this case we do not observe any information about a certain part of the population, for example because it is not possible to survey that part of the population.
The Truncated Normal Regression Model
$$y = \beta_0 + x\beta + u, \quad u \mid x \sim N(0, \sigma^2)$$
This model would satisfy the CLM assumptions, but we do not have a random sample (we observe only part of the population). OLS therefore suffers from selection bias, and again we have to use MLE.
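A small simulation can illustrate why OLS is not appropriate with censored or truncated data. The sketch below is only an illustration under assumed settings: the true model is y = x + u with standard normal x and u, censoring is from below at zero, and the truncation rule keeps only observations with y > 0. None of these specifics come from the lecture.

```python
import random

def ols_slope(xs, ys):
    """Slope from a simple regression of y on x with an intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(0)          # fixed seed so the run is reproducible
n = 20000
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [x + rng.gauss(0.0, 1.0) for x in xs]       # true slope is 1

# Full random sample: OLS works as usual.
slope_full = ols_slope(xs, ys)

# Left-censored data: values below 0 are recorded as 0, all units are kept.
slope_censored = ols_slope(xs, [max(y, 0.0) for y in ys])

# Truncated data: units with y <= 0 are not in the sample at all.
kept = [(x, y) for x, y in zip(xs, ys) if y > 0.0]
slope_truncated = ols_slope([x for x, _ in kept], [y for _, y in kept])

print("OLS slope, full sample:     ", slope_full)
print("OLS slope, censored sample: ", slope_censored)
print("OLS slope, truncated sample:", slope_truncated)
print("observations kept after truncation:", len(kept), "of", n)
```

In this setup the slope from the full sample is close to the true value, while both the censored and the truncated samples give a clearly attenuated slope.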

Sample Selection Correction
Truncated regression is a special case of a general problem: nonrandom sample selection. We can also think about it as omitted variable bias, where what is omitted is how the sample was selected. One way to correct for it is to obtain consistent estimates of $\beta$ from
$$E(y \mid z, s = 1) = x\beta + \rho\lambda(z\gamma),$$
where $s$ is the selection indicator and $z$ is a set of instrumental variables: $z\gamma = \gamma_0 + \gamma_1 z_1 + \dots + \gamma_m z_m$.
Sample Selection Correction
Using all observations, estimate a probit model of the selection indicator $s_i$ on $z_i$.
Compute the inverse Mills ratio $\hat\lambda_i = \lambda(z_i\hat\gamma)$.
Using the selected sample (observations where $s_i = 1$), run the regression of $y_i$ on $x_i$ and $\hat\lambda_i$.
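The last two steps of the procedure can be sketched as follows. This is a simplified illustration, not a full implementation: the probit estimates `gamma_hat` from the first step are simply assumed, the data are made up, and y is constructed without noise so that the second-stage regression recovers the chosen coefficients exactly. The helper names (`solve_linear_system`, `ols`) are illustrative.

```python
import math

def inverse_mills(c):
    """lambda(c) = phi(c) / Phi(c)."""
    pdf = math.exp(-c * c / 2.0) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(c / math.sqrt(2.0)))
    return pdf / cdf

def solve_linear_system(a, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (m[r][n] - tail) / m[r][r]
    return solution

def ols(rows, ys):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Assumed result of step 1 (probit of s on z): intercept and one slope. Hypothetical.
gamma_hat = (0.2, 0.8)

# Hypothetical selected sample (observations with s = 1).
z1 = [0.5, -1.0, 1.5, 0.0, 2.0, -0.5]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Step 2: inverse Mills ratio evaluated at z * gamma_hat.
lam = [inverse_mills(gamma_hat[0] + gamma_hat[1] * z) for z in z1]

# Outcome built without noise from chosen coefficients (1, 2) and rho = 0.5.
y = [1.0 + 2.0 * xi + 0.5 * li for xi, li in zip(x, lam)]

# Step 3: regress y on a constant, x, and the inverse Mills ratio.
b0_hat, b1_hat, rho_hat = ols([[1.0, xi, li] for xi, li in zip(x, lam)], y)
print("intercept, slope, rho:", b0_hat, b1_hat, rho_hat)
```

With real data, the first step would be an actual probit fit on all observations, and the coefficient on the inverse Mills ratio would be estimated with noise.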

Sample Selection Correction
This procedure also gives a simple test for selection bias. We can test $H_0: \rho = 0$, under which there is no sample selection problem, using the t statistic on $\hat\lambda_i$. When $\rho \neq 0$, the usual OLS standard errors are not correct.

Thank you
Thank you very much for your attention!
