
Econometrics 2 (Summer 2009)

Overview and Introduction: Contact

Lecturer: S. Sperlich (MZG 8.128)
Lecture: Tue 8-10am, MZG 8.163
Contact: stefan.sperlich@wiwi.uni-goettingen.de (office hours by appointment)

A: M. Dickel (MZG 8.135)
Tutorial: Thu 8-10am, MZG 8.163 or WiSoRZ 7.124
Start: Thu April 16 at 8am s.t., WiSoRZ 7.124
Contact: meike.dickel@wiwi.uni-goettingen.de (office hours by appointment)

Overview and Introduction: Contents

Binary Choice Models
Multiple Discrete Variable (Polychotomous) Models
Tobit 1, Tobit 2, Tobit 3
Hypothesis Tests (with recapitulation)
GMM (including recapitulation of endogeneity and IV methods)
Duration and Survival Models
Panel Data Analysis (basics)

Overview and Introduction: Literature

Arellano, M. (2003) Panel Data Econometrics, Oxford University Press.
Baltagi, B. (2001) Econometric Analysis of Panel Data, Wiley College Textbooks.
Berndt, E. R. (1996) The Practice of Econometrics: Classic and Contemporary, Addison Wesley.
Greene, W. (2003) Econometric Analysis, Prentice Hall.
Hsiao, C. (2003) Analysis of Panel Data, Cambridge University Press.
Judge, G., Hill, R., Griffiths, W., Lütkepohl, H. (1988) Introduction to the Theory and Practice of Econometrics, New York: Wiley.
Maddala, G. S. (1986) Limited Dependent and Qualitative Variables in Econometrics.
Wooldridge, J. (2002) Econometric Analysis of Cross Section and Panel Data, MIT Press.

Overview and Introduction: Organization

Language (English / German)
Tutorials (IT room and/or seminar room)
Timetable for tutorials
Exams
Requirements: maths, statistics, profound knowledge of multivariate regression
Desired: Introduction to Econometrics or Econometrics 1

Chapter 1: Binary and Multiple Choice Models

Examples?
Previous knowledge: linear, logit, probit
Differences from standard linear regression

Discrete Choice Models: Classes of Discrete Variables

Discrete variables are either
  binary, or
  multinomial.

Further classification of multinomial discrete variables:
  categorical, e.g.
    y = 1, if income < 3000 €
    y = 2, if income is between 3000 € and 5000 €
    y = 3, if income > 5000 €
  non-categorical, e.g.
    y = number of cars in the household

Categorical variables are classified further according to whether they have a natural order or sequence.

Discrete Choice Models: Classes of Discrete Variables

Nominal / unordered categorical:
  y = 1, if the mode of transport is car
  y = 2, if the mode of transport is bus
  y = 3, if the mode of transport is train

Ordered and/or sequential:
  y = 1, if an individual chooses not to work
  y = 2, if an individual wants to work, but can't get a job
  y = 3, if the individual works

The characteristics of a discrete variable dictate the methods available for model solution.

Binary Choice Models: Theoretical Framework

Consider a binary dependent variable $y$, which has only two possible outcomes (0 and 1), and a vector of explanatory variables $x$ thought to influence the realization of $y$.

The unconditional expectation of the binary variable $y$ is by definition a probability:
$$E(y) = P(y = 1).$$

Conditioning on the explanatory variables $x$, the conditional expectation of $y$ given $x$ is
$$E(y \mid x) = P(y = 1 \mid x).$$

Binary Choice Models: Theoretical Framework

Relate this to standard regression analysis: for $y = F(x, \beta) + u$ with $E(u \mid x) = 0$, the conditional expectation is
$$E(y \mid x) = E(F(x, \beta) + u \mid x) = F(x, \beta) + E(u \mid x) = F(x, \beta).$$

Hence, the standard regression functional $F(x, \beta)$ is a representation of the conditional expectation of $y$ given $x$.

If the dependent variable in a regression relationship is binary, then the regression functional $F(x, \beta)$ equates directly to the conditional probability of observing $y = 1$.

Thus, the characteristics of binary choice models depend crucially on the way we specify the regression functional $F(x, \beta)$.

The Linear Probability Model

Consider a binary dependent variable $y$ and a ($k$-dimensional) vector of explanatory variables $x$. We may specify the conditional probability directly as
$$P(y = 1 \mid x) = F(x, \beta) = x'\beta.$$

Introducing random disturbances, we have
$$y = x'\beta + u,$$
where $u$ represents the stochastic disturbance term in the relationship, $f(u)$ represents its density and $E(u \mid x) = 0$ by definition.

For a sample of $n$ observations $\{y_i, x_i\}$ drawn at random from a population,
$$y_i = x_i'\beta + u_i.$$

OLS estimation procedures may be applied. This is known as the Linear Probability Model (LPM).


The Linear Probability Model: Problems with the LPM

The disturbance terms are non-normal:
  $u_i = 1 - x_i'\beta$ with probability $f(u_i) = x_i'\beta$,
  $u_i = -x_i'\beta$ with probability $f(u_i) = 1 - x_i'\beta$.

The disturbance terms are heteroskedastic:
$$\mathrm{Var}(u_i) = E(u_i^2) = (-x_i'\beta)^2 (1 - x_i'\beta) + (1 - x_i'\beta)^2 (x_i'\beta) = (x_i'\beta)(1 - x_i'\beta) = P(y_i = 1 \mid x_i)\, P(y_i = 0 \mid x_i).$$

The conditional expectation is not bounded between zero and one:
$$E(y_i \mid x_i) = P(y_i = 1 \mid x_i) = x_i'\beta,$$
which is defined over the entire real line.
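The variance formula for the two-point disturbance can be checked numerically. The following is a minimal sketch that is not part of the original slides; the function name and the example values of $x_i'\beta$ are illustrative assumptions.

```python
def lpm_disturbance_variance(p):
    """Variance of the two-point LPM disturbance when x'beta = p.

    u = 1 - p with probability p, and u = -p with probability 1 - p.
    """
    mean = (1 - p) * p + (-p) * (1 - p)  # equals 0
    second_moment = (1 - p) ** 2 * p + (-p) ** 2 * (1 - p)
    return second_moment - mean ** 2


# Illustrative values of x'beta (assumed, not taken from the slides).
for p in (0.1, 0.5, 0.9):
    print(p, lpm_disturbance_variance(p), p * (1 - p))
```

The two printed columns should agree, and they change with $p$, which is the heteroskedasticity described above.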

The Linear Probability Model: Possible Solutions

Weighted Least Squares accounts for the heteroskedasticity, with weights
$$w_i = \sqrt{(x_i'\hat\beta)(1 - x_i'\hat\beta)},$$
the estimated standard deviation of $u_i$, calculated from a first-stage estimation. The adjusted model then becomes
$$\frac{y_i}{w_i} = \frac{x_i'}{w_i}\,\beta + \frac{u_i}{w_i}.$$

This still does not return probabilities within the range $[0, 1]$.

A better solution is to re-specify or to transform the regression model itself to constrain the probability outcome.
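The two-step WLS procedure can be sketched for a model with a constant and one regressor. This is an illustration under stated assumptions rather than the lecturer's implementation: the simulated data, the helper names and the clipping of fitted probabilities are all hypothetical choices.

```python
import math
import random


def least_squares_2(rows):
    """Least squares for y = b0*z0 + b1*z1; rows are tuples (z0, z1, y)."""
    s00 = sum(z0 * z0 for z0, _, _ in rows)
    s01 = sum(z0 * z1 for z0, z1, _ in rows)
    s11 = sum(z1 * z1 for _, z1, _ in rows)
    t0 = sum(z0 * y for z0, _, y in rows)
    t1 = sum(z1 * y for _, z1, y in rows)
    det = s00 * s11 - s01 * s01
    return (s11 * t0 - s01 * t1) / det, (s00 * t1 - s01 * t0) / det


def lpm_wls(xs, ys, clip=0.01):
    """Two-step WLS for the LPM y = b0 + b1*x + u."""
    # Step 1: OLS on the untransformed model.
    b0, b1 = least_squares_2([(1.0, x, y) for x, y in zip(xs, ys)])
    rows = []
    for x, y in zip(xs, ys):
        # Fitted values can leave (0, 1); clipping is an ad hoc assumption.
        p_hat = min(max(b0 + b1 * x, clip), 1.0 - clip)
        w = math.sqrt(p_hat * (1.0 - p_hat))
        rows.append((1.0 / w, x / w, y / w))
    # Step 2: OLS on the model divided through by w_i.
    return least_squares_2(rows)


# Simulated data with true P(y = 1 | x) = 0.2 + 0.5 x (assumed values).
random.seed(0)
xs = [random.random() for _ in range(5000)]
ys = [1 if random.random() < 0.2 + 0.5 * x else 0 for x in xs]
b0_wls, b1_wls = lpm_wls(xs, ys)
print(b0_wls, b1_wls)
```

The need to clip fitted values before computing the weights is a practical symptom of the unbounded conditional expectation.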


Probit and Logit Models

In general,
$$E(y_i \mid x_i) = P(y_i = 1 \mid x_i) = F(x_i, \beta).$$

For the Linear Probability Model,
$$F(x_i, \beta) = x_i'\beta.$$

To solve the probability problem, constrain the outcome $F(x_i, \beta)$ to the interval $[0, 1]$.

Which alternatives do we know?

Probit and Logit Models: The Transformation Approach

For the Probit,
$$F(x_i, \beta) = \Phi(x_i'\beta),$$
where $\Phi$ represents the cumulative distribution function of the standard normal distribution.

For the Logit,
$$F(x_i, \beta) = \Lambda(x_i'\beta),$$
where
$$\Lambda(z) = \frac{\exp(z)}{1 + \exp(z)} = \frac{1}{1 + \exp(-z)}$$
represents the logistic function.

What effect does the choice of link function have? First, some characteristics:
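A minimal sketch of the two transformations, using only the Python standard library. The function names are illustrative; $\Phi$ is obtained from the error function via $\Phi(z) = \tfrac{1}{2}\,(1 + \mathrm{erf}(z/\sqrt{2}))$.

```python
import math


def probit_link(z):
    """Standard normal CDF, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logit_link(z):
    """Logistic function, Lambda(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


# Both functions map any real index into the unit interval.
for z in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(z, round(probit_link(z), 4), round(logit_link(z), 4))
```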

Probit and Logit Models: The Transformation Approach

Notice that the functions $\Phi(z)$ and $\Lambda(z)$ are both monotone increasing functions of $z$. Moreover, in both cases,
$$F(x_i, \beta) \to 0 \text{ as } x_i'\beta \to -\infty, \qquad F(x_i, \beta) \to 1 \text{ as } x_i'\beta \to +\infty.$$

So, Probit and Logit models both return well-defined probabilities. However, because the transformed regression function is non-linear in $\beta$, we can no longer use OLS and must move to ML techniques.

The LPM might therefore be considered a first-order approximation to the arbitrary non-linear probability function $F(\cdot)$. That is,
$$F(x, \beta) \approx F(x_0, \beta) + (x - x_0)'\,\frac{\partial F(x_0, \beta)}{\partial x} = x'\beta_0,$$
using a first-order Taylor series expansion around $x = x_0$, where $x$ contains a constant so that all terms evaluated at $x_0$ are absorbed into the coefficient vector $\beta_0$.

Probit and Logit Models: Latent Variable

Assume that there is some underlying (and unobserved) latent propensity variable $y^*$, where $y^* \in (-\infty, \infty)$.

Whilst we do not observe $y^*$ directly, we do observe a binary outcome $y$ such that
$$y = \mathbb{1}\{y^* > 0\},$$
where $\mathbb{1}\{\cdot\}$ is termed the indicator function, taking the value 1 if the condition within braces is satisfied, and 0 otherwise.

Define the latent equation in linear form:
$$y^* = x'\beta + u,$$
where $u$ is random with symmetric density $f$ and corresponding cumulative distribution function $F$.

Probit and Logit Models: Latent Variable

We now have that
$$E(y \mid x) = P(y = 1 \mid x) = P(y^* > 0 \mid x) = P(x'\beta + u > 0) = P(u > -x'\beta) = 1 - F(-x'\beta) = F(x'\beta).$$

By specifying an appropriate distribution function for $u$, we can derive the Probit and Logit models.

When $u$ is assumed normally distributed, parameters must be scaled to force the variance of $u$ to $\sigma^2 = \mathrm{Var}(u) = 1$. Why?
$$P(y = 1 \mid x) = P(u > -x'\beta) = P(u/\sigma > -x'(\beta/\sigma)) = P(z > -x'(\beta/\sigma)) = \Phi(x'(\beta/\sigma)),$$
where $z = u/\sigma$ is standard normal. Only the ratio $\beta/\sigma$ enters the probability, so $\beta$ and $\sigma$ cannot be identified separately.
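The scaling argument can be illustrated numerically: multiplying $\beta$ and $\sigma$ by the same factor leaves the choice probability unchanged. This is a small sketch; the index value 0.8 and the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_y1(index, sigma):
    """P(y = 1 | x) = P(u > -x'beta) for u ~ N(0, sigma^2), index = x'beta."""
    return 1.0 - normal_cdf(-index / sigma)


index = 0.8  # hypothetical value of x'beta
p_unit = prob_y1(index, sigma=1.0)
# Doubling both beta and sigma leaves the probability unchanged.
p_scaled = prob_y1(2.0 * index, sigma=2.0)
print(p_unit, p_scaled)
```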

Probit and Logit Models: Theoretical Foundations

Suppose $y = 1$ represents a person who works, and $y = 0$ one who doesn't.

Consider state-specific utilities $U_y$:
$$U_{y=1} = x'\beta_1 + u_1, \qquad U_{y=0} = x'\beta_0 + u_0.$$

Participation in the work force requires that $U_{y=1} > U_{y=0}$, such that
$$y = \mathbb{1}\{U_{y=1} > U_{y=0}\} = \mathbb{1}\{x'\beta_1 + u_1 > x'\beta_0 + u_0\} = \mathbb{1}\{u_1 - u_0 > -x'(\beta_1 - \beta_0)\}.$$

Only the difference $\beta_1 - \beta_0$ can be identified. Hence, $y = \mathbb{1}\{y^* > 0\}$, where
$$y^* = x'(\beta_1 - \beta_0) + (u_1 - u_0) = x'\beta + u.$$

ML Estimation

Consider a sample of $n$ observations $\{y_i, x_i\}$, where $y_i$ is binary. Assume $y_i = \mathbb{1}\{y_i^* > 0\}$ for $y_i^* = x_i'\beta + u_i$.

For any vector $\beta$, the probability of observing the sample $y_1, \dots, y_n$ conditional on the $x_i$ is
$$L(\beta \mid x_i) = \prod_{i=1}^{n} P(y_i \mid x_i, \beta) = \prod_{i=1}^{n} P(y_i = 0 \mid x_i, \beta)^{1 - y_i}\, P(y_i = 1 \mid x_i, \beta)^{y_i}.$$

Taking logs,
$$\ln L(\beta \mid x_i) = \sum_{i=1}^{n} \left\{ (1 - y_i) \ln P(y_i = 0 \mid x_i, \beta) + y_i \ln P(y_i = 1 \mid x_i, \beta) \right\}.$$

ML Estimation

For the Probit model,
$$P(y_i = 1 \mid x_i, \beta) = \Phi(x_i'\beta), \qquad P(y_i = 0 \mid x_i, \beta) = 1 - \Phi(x_i'\beta),$$
giving a log-likelihood of the form
$$\ln L(\beta \mid x_i) = \sum_{i=1}^{n} \left\{ (1 - y_i) \ln(1 - \Phi(x_i'\beta)) + y_i \ln \Phi(x_i'\beta) \right\}.$$

For the Logit model,
$$P(y_i = 1 \mid x_i, \beta) = \Lambda(x_i'\beta) = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)}, \qquad P(y_i = 0 \mid x_i, \beta) = 1 - \Lambda(x_i'\beta) = \frac{1}{1 + \exp(x_i'\beta)},$$
to give
$$\ln L(\beta \mid x_i) = \sum_{i=1}^{n} \left\{ (1 - y_i) \ln(1 - \Lambda(x_i'\beta)) + y_i \ln \Lambda(x_i'\beta) \right\}.$$
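Both log-likelihoods share one form and differ only in the distribution function, so a single function can evaluate either. A minimal sketch follows; the toy sample and the trial value of $\beta$ are hypothetical and serve only to show the call.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logistic(z):
    """Logistic function, Lambda(z)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(beta, xs, ys, cdf):
    """ln L = sum_i {(1 - y_i) ln(1 - F(x_i'beta)) + y_i ln F(x_i'beta)}."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = cdf(sum(b * xj for b, xj in zip(beta, x)))
        total += (1 - y) * math.log(1.0 - p) + y * math.log(p)
    return total


# Hypothetical toy sample; the first entry of each x_i is the constant.
xs = [(1.0, 0.5), (1.0, -1.2), (1.0, 2.0), (1.0, 0.1)]
ys = [1, 0, 1, 0]
beta = (0.0, 1.0)
print(log_likelihood(beta, xs, ys, normal_cdf))  # Probit
print(log_likelihood(beta, xs, ys, logistic))    # Logit
```

At $\beta = 0$ every observation has probability 0.5 under both models, so the log-likelihood equals $n \ln 0.5$, which is a convenient sanity check.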

ML Estimation: First Order Conditions

Parameters which maximize the general log-likelihood require that
$$S(\beta) = \frac{\partial \ln L(\beta \mid x_i)}{\partial \beta} = 0.$$

For the Probit,
$$S(\beta) = \sum_{i=1}^{n} \frac{y_i - \Phi(x_i'\beta)}{\Phi(x_i'\beta)\,(1 - \Phi(x_i'\beta))}\, \phi(x_i'\beta)\, x_i.$$

For the Logit,
$$S(\beta) = \sum_{i=1}^{n} \left[ y_i - \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)} \right] x_i.$$

The ML solution is obtained by finding parameters for which $S(\beta) = 0$.
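One common way to solve $S(\beta) = 0$ numerically is Newton-Raphson. The slides do not name a solver, so this is an assumed choice. The sketch covers the Logit with a constant and one regressor, uses simulated data with assumed true values $(-0.5, 1.0)$, and inverts the $2 \times 2$ negative Hessian $\sum_i \Lambda_i (1 - \Lambda_i)\, x_i x_i'$ by hand.

```python
import math
import random


def logistic(z):
    """Logistic function, Lambda(z)."""
    return 1.0 / (1.0 + math.exp(-z))


def logit_score(b0, b1, xs, ys):
    """S(beta) = sum_i [y_i - Lambda(x_i'beta)] x_i with x_i = (1, x)."""
    s0 = s1 = 0.0
    for x, y in zip(xs, ys):
        resid = y - logistic(b0 + b1 * x)
        s0 += resid
        s1 += resid * x
    return s0, s1


def logit_newton(xs, ys, steps=25):
    """Newton-Raphson iterations on S(beta) = 0, started at beta = 0."""
    b0 = b1 = 0.0
    for _ in range(steps):
        s0, s1 = logit_score(b0, b1, xs, ys)
        # Negative Hessian: sum_i Lambda_i (1 - Lambda_i) x_i x_i'.
        h00 = h01 = h11 = 0.0
        for x in xs:
            p = logistic(b0 + b1 * x)
            v = p * (1.0 - p)
            h00 += v
            h01 += v * x
            h11 += v * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * s0 - h01 * s1) / det
        b1 += (h00 * s1 - h01 * s0) / det
    return b0, b1


random.seed(1)
xs = [random.gauss(0.0, 1.0) for _ in range(2000)]
ys = [1 if random.random() < logistic(-0.5 + 1.0 * x) else 0 for x in xs]
b0_hat, b1_hat = logit_newton(xs, ys)
print(b0_hat, b1_hat, logit_score(b0_hat, b1_hat, xs, ys))
```

At the returned estimates both components of the score should be numerically zero.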

Interpretation: Binary Choice Models

Let's keep concentrating on the following binary choice models:

LPM:    $P(y_i = 1 \mid x_i, \beta) = x_i'\beta$
Probit: $P(y_i = 1 \mid x_i, \beta) = \Phi(x_i'\beta)$
Logit:  $P(y_i = 1 \mid x_i, \beta) = \Lambda(x_i'\beta)$

If $\beta_j$ is positive (negative), then $P(y_i = 1 \mid x_i, \beta) = F(x_i'\beta)$ will increase (decrease) with an increase in $x_j$.

Interpretation: Marginal Effects

LPM:    $\dfrac{\partial P(y_i = 1 \mid x_i, \beta)}{\partial x_{ij}} = \beta_j$

Probit: $\dfrac{\partial P(y_i = 1 \mid x_i, \beta)}{\partial x_{ij}} = \phi(x_i'\beta)\, \beta_j$

Logit:  $\dfrac{\partial P(y_i = 1 \mid x_i, \beta)}{\partial x_{ij}} = \dfrac{\exp(x_i'\beta)}{(1 + \exp(x_i'\beta))^2}\, \beta_j$

Implications: slope estimates are not directly comparable across models. For example, the variance of the disturbances differs between the Logit and the Probit model, hence the parameters are also scaled differently.
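The three marginal-effect formulas can be evaluated side by side. In this sketch the index values and the common coefficient $\beta_j = 0.5$ are hypothetical; in practice each model has its own estimate of $\beta_j$.

```python
import math


def normal_pdf(z):
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def marginal_effects(index, beta_j):
    """Marginal effect of x_j at x'beta = index under each model."""
    return {
        "LPM": beta_j,
        "Probit": normal_pdf(index) * beta_j,
        "Logit": math.exp(index) / (1.0 + math.exp(index)) ** 2 * beta_j,
    }


# The LPM effect is constant; the other two vary with the index.
for index in (-2.0, 0.0, 2.0):
    print(index, marginal_effects(index, beta_j=0.5))
```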


Interpretation: Marginal Effects

Notice also:

The marginal effects in the LPM are constant (i.e. independent of the data).
The marginal effects in the Probit and Logit models depend on $x_i$.

A popular transformation:
$\beta_{LPM} \approx 0.25\, \beta_L$ for the slopes, and $\beta_{LPM} \approx 0.25\, \beta_L + 0.5$ for the intercept;
$\beta_P \approx 0.625\, \beta_L$.

Interpretation: An Empirical Example (childcare take-up estimates)

Parameter estimates

Variable                      LPM     Probit    Logit
single woman               -0.059    -0.184    -0.310
other children aged 5+     -0.101    -0.318    -0.540
woman works                 0.152     0.430     0.713
left school at 18           0.109     0.310     0.520
attended college/uni        0.160     0.458     0.757
youngest child aged 2       0.186     0.556     0.928
youngest child aged 3-4     0.309     0.882     1.458
receives maintenance        0.089     0.264     0.432
constant                    0.153    -0.995    -1.645

Data source: 1991/92 General Household Survey, from which a random sample of n = 1288 women responsible for at least one child of pre-school age was taken.
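The rules of thumb $\beta_{LPM} \approx 0.25\,\beta_L$ (plus 0.5 for the intercept) and $\beta_P \approx 0.625\,\beta_L$ can be compared with the estimates in the table. The sketch below copies the table values; the variable names are illustrative.

```python
# Estimates copied from the table above: (LPM, Probit, Logit).
estimates = {
    "single woman": (-0.059, -0.184, -0.310),
    "other children aged 5+": (-0.101, -0.318, -0.540),
    "woman works": (0.152, 0.430, 0.713),
    "left school at 18": (0.109, 0.310, 0.520),
    "attended college/uni": (0.160, 0.458, 0.757),
    "youngest child aged 2": (0.186, 0.556, 0.928),
    "youngest child aged 3-4": (0.309, 0.882, 1.458),
    "receives maintenance": (0.089, 0.264, 0.432),
    "constant": (0.153, -0.995, -1.645),
}

for name, (lpm, probit, logit) in estimates.items():
    shift = 0.5 if name == "constant" else 0.0
    lpm_approx = 0.25 * logit + shift   # rule of thumb for the LPM
    probit_approx = 0.625 * logit       # rule of thumb for the Probit
    print(f"{name:24s} LPM {lpm:7.3f} vs {lpm_approx:7.3f}   "
          f"Probit {probit:7.3f} vs {probit_approx:7.3f}")
```

The approximations are rough: they track the Probit estimates fairly closely here, while the LPM values, especially the intercept, deviate more.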

Interpretation: An Empirical Example (childcare take-up estimates)

The dependent variable is 1 if the woman pays for childcare, and 0 otherwise.

The reference in all cases is a married woman who doesn't work, left school at 16, has one child aged less than two and receives no maintenance.

For the reference household, all explanatory variables take a value of 0, which leads to probability estimates in each model of:

LPM:    $P(y_i = 1 \mid x_i) = x_i'\beta = 0.153$
Probit: $P(y_i = 1 \mid x_i) = \Phi(x_i'\beta) = \Phi(-0.995) = 0.161$
Logit:  $P(y_i = 1 \mid x_i) = \Lambda(x_i'\beta) = \dfrac{\exp(-1.645)}{1 + \exp(-1.645)} = 0.162$

Interpretation: An Empirical Example (childcare take-up estimates)

How, for example, does the probability change for women who attended college or university?

LPM:    $P(y_i = 1 \mid x_i) = x_i'\beta = 0.153 + 0.160 = 0.313$
Probit: $P(y_i = 1 \mid x_i) = \Phi(x_i'\beta) = \Phi(-0.995 + 0.458) = \Phi(-0.537) = 0.296$
Logit:  $P(y_i = 1 \mid x_i) = \Lambda(x_i'\beta) = \dfrac{\exp(-1.645 + 0.757)}{1 + \exp(-1.645 + 0.757)} = 0.291$
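The predicted probabilities for the reference household and for a woman who attended college or university can be recomputed from the published coefficients. This is a sketch with illustrative names; because the coefficients are rounded to three decimals, the results may differ from the slide values in the last digit.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logistic(z):
    """Logistic function, Lambda(z)."""
    return 1.0 / (1.0 + math.exp(-z))


# Coefficients from the childcare table: constant and "attended college/uni".
const = {"LPM": 0.153, "Probit": -0.995, "Logit": -1.645}
college = {"LPM": 0.160, "Probit": 0.458, "Logit": 0.757}
link = {"LPM": lambda z: z, "Probit": normal_cdf, "Logit": logistic}

for model in ("LPM", "Probit", "Logit"):
    p_ref = link[model](const[model])
    p_college = link[model](const[model] + college[model])
    print(f"{model:6s} reference {p_ref:.3f}   college/uni {p_college:.3f}")
```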

Statistical Inference: Binary Choice Models

For the LPM, estimated standard errors are easily derived and evaluated. But remember that the LPM is heteroskedastic.

For the Probit and Logit models,
$$\sqrt{n}\,(\hat\beta - \beta) \overset{\text{asym.}}{\sim} N\big(0, I(\beta)^{-1}\big),$$
where $I$ is the Fisher information.

Computer software for ML estimation evaluates the variance-covariance matrix $V(\hat\beta)$ directly. Hence, statistical inference and hypothesis testing can be carried out using standard inferential techniques.

Statistical Inference: Goodness of Fit

Let $L_{UR}$ denote the maximized likelihood of the full (unrestricted) model, and let $L_R$ denote the maximized likelihood of a restricted model estimated on an intercept alone. Two proposed measures are:

Cragg-Uhler pseudo $R^2 = \dfrac{L_{UR}^{2/n} - L_R^{2/n}}{1 - L_R^{2/n}}$

McFadden pseudo $R^2 = 1 - \dfrac{\ln L_{UR}}{\ln L_R}$

Remember: the classical and adjusted $R^2$ can't be used. (Why?)
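Both measures need only the two maximized log-likelihoods and the sample size. A minimal sketch follows; the log-likelihood values are hypothetical, and $L^{2/n}$ is computed as $\exp(2 \ln L / n)$ because $L$ itself is typically too small to represent in floating point.

```python
import math


def mcfadden_r2(lnl_ur, lnl_r):
    """McFadden pseudo R^2 = 1 - ln L_UR / ln L_R."""
    return 1.0 - lnl_ur / lnl_r


def cragg_uhler_r2(lnl_ur, lnl_r, n):
    """Cragg-Uhler pseudo R^2 as defined above."""
    l_ur = math.exp(2.0 * lnl_ur / n)  # L_UR^(2/n)
    l_r = math.exp(2.0 * lnl_r / n)    # L_R^(2/n)
    return (l_ur - l_r) / (1.0 - l_r)


# Hypothetical log-likelihood values, for illustration only.
lnl_ur, lnl_r, n = -80.0, -100.0, 200
print(mcfadden_r2(lnl_ur, lnl_r), cragg_uhler_r2(lnl_ur, lnl_r, n))
```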

Statistical Inference: Goodness of Fit

An alternative, outcome-based measure is the proportion of correct predictions.

For $\hat P_i = \hat P(y_i = 1 \mid x_i)$, e.g. $\Phi(x_i'\hat\beta)$ (Probit), let $\tilde y_i = \mathbb{1}\{\hat P_i > 0.5\}$.

Define the proportion of correct predictions as
$$P = \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}\{y_i = \tilde y_i\}.$$

Many statistical software packages report tables of predicted and observed binary values:

                predicted
observed        0          1
0               n_00       n_01
1               n_10       n_11

This measure should be avoided. It doesn't make sense if one of the two outcomes is hardly represented in the sample.
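The warning about unbalanced samples can be made concrete. In this sketch (hypothetical data, illustrative function name) a model whose fitted probabilities never exceed 0.5 predicts 0 for everyone, yet it scores 95% correct because only 5% of the observations are ones.

```python
def classification_table(ys, p_hats, cutoff=0.5):
    """Return counts[observed][predicted] and the share of correct predictions."""
    counts = [[0, 0], [0, 0]]
    for y, p in zip(ys, p_hats):
        y_tilde = 1 if p > cutoff else 0
        counts[y][y_tilde] += 1
    correct = counts[0][0] + counts[1][1]
    return counts, correct / len(ys)


# Hypothetical unbalanced sample: 95 zeros and 5 ones, with every fitted
# probability below the cutoff, so the prediction is always 0.
ys = [0] * 95 + [1] * 5
p_hats = [0.05] * 100
counts, share_correct = classification_table(ys, p_hats)
print(counts, share_correct)
```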

Statistical Inference: Testing the Overall Significance of the Regression

Let $L_{UR}$ denote the maximized likelihood of the full model, let $L_R$ denote the maximized likelihood of a restricted model, and let $r$ be the number of restrictions imposed. Then, asymptotically under $H_0$,
$$-2 \ln(L_R / L_{UR}) = 2\,(\ln L_{UR} - \ln L_R) \sim \chi^2_r.$$

For example:
$H_0$: $\beta_2 = \beta_3 = \dots = \beta_k = 0$
$H_A$: at least one $\beta_j \neq 0$, $j = 2, \dots, k$.
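The likelihood ratio statistic and its p-value can be sketched with the standard library. The log-likelihood values are hypothetical. To avoid external dependencies, the $\chi^2$ tail probability uses the closed form that holds for even degrees of freedom only, $P(\chi^2_{2m} > x) = e^{-x/2} \sum_{j=0}^{m-1} (x/2)^j / j!$; a general implementation would need the incomplete gamma function.

```python
import math


def lr_statistic(lnl_ur, lnl_r):
    """LR = 2 (ln L_UR - ln L_R)."""
    return 2.0 * (lnl_ur - lnl_r)


def chi2_sf_even(x, dof):
    """P(chi2_dof > x) for positive even dof, via the closed-form sum."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this sketch only covers positive even dof")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, dof // 2):
        term *= half / j  # (x/2)^j / j!
        total += term
    return math.exp(-half) * total


# Hypothetical values: 8 slope restrictions, as in a model with 8 regressors.
lnl_ur, lnl_r, r = -80.0, -100.0, 8
lr = lr_statistic(lnl_ur, lnl_r)
print(lr, chi2_sf_even(lr, r))
```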

Transitions in Binary Choice Models

Let $y_i^* = x_i'\beta + u_i$. Imagine a change from $x_i$ to $x_i^R \neq x_i$. This clearly alters the latent variable from $y_i^*$ to
$$y_i^{R*} = (x_i^R)'\beta + u_i.$$

A frequently used measure is the odds ratio
$$R(x_i) = P(y_i = 1 \mid x_i) / P(y_i = 0 \mid x_i).$$

But how does this exogenous shock affect the probability $P_{i(j \to k)}$ of transition from any state $j$ to $k$ ($j, k = 0, 1$)?

Hence, we need probabilities such as
$$P_{i(0 \to 1)} = P(y_i^{R*} > 0 \mid y_i^* < 0).$$
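One way to evaluate $P_{i(0 \to 1)}$ is sketched below under explicit assumptions that go beyond the slide: $u_i \sim N(0, 1)$ (Probit) and the same $u_i$ enters both latent equations, as written above. Then $y_i^* < 0$ means $u_i < -x_i'\beta$ and $y_i^{R*} > 0$ means $u_i > -(x_i^R)'\beta$, so the transition probability is the normal mass between these two bounds divided by $P(y_i^* < 0)$. The index values and function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def odds(p):
    """R = P(y = 1 | x) / P(y = 0 | x)."""
    return p / (1.0 - p)


def transition_0_to_1(index_old, index_new):
    """P(y^R* > 0 | y* < 0) for u ~ N(0, 1) shared by both latent equations."""
    # y* < 0 means u < -index_old; y^R* > 0 means u > -index_new.
    if index_new <= index_old:
        return 0.0
    joint = normal_cdf(-index_old) - normal_cdf(-index_new)
    return joint / normal_cdf(-index_old)


index_old, index_new = -1.0, -0.5  # hypothetical x'beta and (x^R)'beta
p_old, p_new = normal_cdf(index_old), normal_cdf(index_new)
print(odds(p_old), odds(p_new), transition_0_to_1(index_old, index_new))
```

Under these assumptions the result equals $(P^R - P)/(1 - P)$, the increase in the probability of $y = 1$ relative to the initial probability of $y = 0$.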