Problem set 2, Part 2: Generalized Roy Model 2 Factor, no normality




After doing this problem set you should be able to figure out how to include more factors (so you make the model more flexible) and get rid of the normality assumptions. (Actually there is a paper by Ferguson (1983) that shows that a mixture of normals can approximate almost any distribution arbitrarily well, so we are really being very flexible here. That the model does not depend on functional form or distributional assumptions can be seen in Carneiro, Hansen and Heckman (2003).)

OK, so much for simple 1 factor models and normality assumptions. Let us now go to a model of the form

$$I = Z\gamma + V \tag{1}$$
$$Y_{t,1} = X\beta_{t,1} + \varepsilon_{t,1} \tag{2}$$
$$Y_{t,0} = X\beta_{t,0} + \varepsilon_{t,0} \tag{3}$$
$$D = 1(I > 0) \tag{4}$$
$$Y_t = D\,Y_{t,1} + (1 - D)\,Y_{t,0}$$

Assume that

$$V = f_1\alpha_{V1} + f_2\alpha_{V2} + U_V \tag{5}$$
$$\varepsilon_{t,1} = f_1\alpha_{t,11} + f_2\alpha_{t,12} + U_{t,1} \tag{6}$$
$$\varepsilon_{t,0} = f_1\alpha_{t,01} + f_2\alpha_{t,02} + U_{t,0} \tag{7}$$

with $(U_{t,1}, U_{t,0}, U_V)$ mutually independent:

$$U_{t,1} \perp U_{t,0} \perp U_V \ \text{for all } t, \qquad f_1 \perp f_2, \qquad (f_1, f_2) \perp (U_{t,0}, U_{t,1}, U_V),$$

$$U_{t,1} \sim N\left(0, \sigma^2_{U_{t,1}}\right), \qquad U_{t,0} \sim N\left(0, \sigma^2_{U_{t,0}}\right), \qquad U_V \sim N\left(0, \sigma^2_{U_V}\right).$$

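Suppose that, additionally, you have two external test equations which we observe regardless of $D$ and which only depend on $f_1$. The tests take the form

$$T_1 = Q\theta_1 + f_1\delta_{11} + U_{T_1}, \qquad T_2 = Q\theta_2 + f_1\delta_{21} + U_{T_2},$$
$$U_{T_1} \sim N\left(0, \sigma^2_{T_1}\right), \qquad U_{T_2} \sim N\left(0, \sigma^2_{T_2}\right).$$

1. Write down the likelihood function for this problem assuming that $f_1$ and $f_2$ have some distribution, say $\Pr(f_1, f_2) = \Pr(f_1)\Pr(f_2)$. Notice that conditional on $f$ everything is independent, so take advantage of this when writing the likelihood.

Now assume that

$$f_1 \sim \sum_{k=1}^{K_1} p_{f_1,k}\, N\left(\mu_{f_1,k}, \sigma^2_{f_1,k}\right), \qquad \sum_{k=1}^{K_1} p_{f_1,k}\,\mu_{f_1,k} = 0, \qquad \sum_{k=1}^{K_1} p_{f_1,k} = 1,$$
$$f_2 \sim \sum_{k=1}^{K_2} p_{f_2,k}\, N\left(\mu_{f_2,k}, \sigma^2_{f_2,k}\right), \qquad \sum_{k=1}^{K_2} p_{f_2,k}\,\mu_{f_2,k} = 0, \qquad \sum_{k=1}^{K_2} p_{f_2,k} = 1,$$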

where I am abusing notation to let $X \sim \sum_{k=1}^{K} p_k N\left(\mu_k, \sigma_k^2\right)$ stand for

$$\Pr(X) = \sum_{k=1}^{K} p_k \frac{1}{\sqrt{2\pi\sigma_k^2}} \exp\left(-\frac{1}{2}\left(\frac{X - \mu_k}{\sigma_k}\right)^2\right).$$

Also impose the following normalizations: $\sigma^2_{U_V} = 1$, $\delta_{11} = 1$ and $\alpha_{1,02} = 1$. To keep it simple assume that $K_1 = 2$ and $K_2 = 2$, but you can write it as a general program that allows for more mixture components, more time periods, more test equations and more factors (later you will write a program that allows for more choices too!).

2. Just to start your engines, what is the formula for the variance of a mixture of normals random variable like $X$ above?

3. Program either a Maximum Likelihood or MCMC version of this model. Notice that if you do an MLE version you are now going to have to integrate over 2 continuous distributions, which is going to take a long time. I strongly recommend the MCMC version, since extending it to more factors is natural and it is still very fast, whereas extending the MLE version is not. You'll still have a chance to practice MLE on a dynamic program on PS8. If using an MCMC method put non-informative priors on $\gamma$, $\theta_j$, $\beta_{t,1}$ and $\beta_{t,0}$. Put Normal(0, 10.0) (proper but with little information) priors on $\alpha_{t,01}$, $\alpha_{t,11}$, $\alpha_{V1}$ and $\delta_{21}$; Gamma(2, 1) priors on $1/\sigma^2_{U_{t,1}}$, $1/\sigma^2_{U_{t,0}}$ and $1/\sigma^2_{T_j}$.

We are going to use dataset 2b for this part of the problem set and for future problem sets. The way this dataset was generated, again abusing notation for the mixtures, is the following:

$$f_1 \sim 0.5\,N(1, 2) + 0.5\,N(-1, 2)$$
$$f_2 \sim 0.3\,N(0.5, 0.5) + 0.7\,N(-0.2143, 0.1)$$
$$U_{t,1} \sim N(0, 1), \qquad U_{t,0} \sim N(0, 1), \qquad U_V \sim N(0, 1), \qquad t = 1, 2, 3.$$
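The mixture notation used above can be made concrete with a short Python sketch. It evaluates the mixture density, checks numerically that it integrates to one, and checks the mean-zero restriction for the dataset 2b factor distributions. The function names `mixture_pdf` and `mixture_draw` are hypothetical, and the second argument of each $N(\cdot,\cdot)$ is read as a variance, which is an assumption based on the $N(\mu, \sigma^2)$ notation used elsewhere in the problem set.

```python
import numpy as np

def mixture_pdf(x, p, mu, var):
    """Density of sum_k p_k N(mu_k, var_k), evaluated pointwise at x."""
    x = np.asarray(x, dtype=float)[..., None]   # trailing axis indexes components
    p, mu, var = (np.asarray(a, dtype=float) for a in (p, mu, var))
    comp = np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)
    return comp @ p                             # weighted sum over components

def mixture_draw(rng, n, p, mu, var):
    """Pick component k with probability p_k, then draw from N(mu_k, var_k)."""
    mu, var = np.asarray(mu, dtype=float), np.asarray(var, dtype=float)
    k = rng.choice(len(p), size=n, p=p)
    return rng.normal(mu[k], np.sqrt(var[k]))

# Dataset 2b factor distributions; second argument read as a variance (assumption).
F1 = dict(p=[0.5, 0.5], mu=[1.0, -1.0], var=[2.0, 2.0])
F2 = dict(p=[0.3, 0.7], mu=[0.5, -0.2143], var=[0.5, 0.1])

grid = np.linspace(-15.0, 15.0, 30001)
dx = grid[1] - grid[0]
area_f1 = mixture_pdf(grid, **F1).sum() * dx    # Riemann sum of the density
area_f2 = mixture_pdf(grid, **F2).sum() * dx
mean_f2 = float(np.dot(F2["p"], F2["mu"]))      # mean-zero restriction, up to rounding

rng = np.random.default_rng(0)
f1_draws = mixture_draw(rng, 200_000, **F1)
print(area_f1, area_f2, mean_f2, f1_draws.mean())
```

The mean of $f_2$ is zero only up to the rounding of $-0.2143$, which is why the check uses a tolerance rather than exact equality.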

$Z_0 = X_0$ are just a constant equal to 1. Next we generate

$$X_1 \sim N(0, 2), \qquad Z_1 \sim N(0, 2),$$

so we are in the case where $Z = (Z_0, Z_1)$ and $X = (X_0, X_1)$ are exogenous. We finally form

$$Y_{t,1} = 2X_0 + X_1 + 2f_1 + f_2 + U_{t,1}$$
$$Y_{t,0} = X_0 + X_1 + f_1 + f_2 + U_{t,0}$$
$$I = 0.5Z_0 + Z_1 + f_1 + f_2 + U_V$$

for $t = 1, 2, 3$, and let $D = 1(I > 0)$, so that the observed $Y_t$ is

$$Y_t = D\,Y_{t,1} + (1 - D)\,Y_{t,0}.$$

Finally, the test equations were generated as

$$U_{T_1} \sim N(0, 1), \qquad U_{T_2} \sim N(0, 1), \qquad Q_0 = 1, \qquad Q_1 \sim N(0, 1),$$
$$T_1 = Q_0 + Q_1 + f_1 + U_{T_1}, \qquad T_2 = Q_0 + 2Q_1 + 0.5f_1 + U_{T_2}.$$

4. Run your program on this data. If you did it correctly your estimates should be close to the values we assigned when we built the dataset.

Suppose we are interested in estimating mean treatment parameters for present values (but see Carneiro, Hansen and Heckman for the use of these methods in estimation of distributions), and assume there is no discounting. That is, define

$$Y_1 = \sum_{t=1}^{3} Y_{t,1}, \qquad Y_0 = \sum_{t=1}^{3} Y_{t,0}.$$
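The data-generating process described above can be sketched in Python as follows. This is an illustrative simulation, not the code that produced dataset 2b: the names `draw_mixture` and `simulate_2b`, the seed, and the sample size are all hypothetical, and the second argument of each $N(\cdot,\cdot)$ is again read as a variance (an assumption). The sketch also returns the latent potential outcomes so that the undiscounted present values $Y_1$ and $Y_0$ can be formed.

```python
import numpy as np

def draw_mixture(rng, n, p, mu, var):
    """Draw from sum_k p_k N(mu_k, var_k); var is read as a variance (assumption)."""
    k = rng.choice(len(p), size=n, p=p)
    return rng.normal(np.asarray(mu)[k], np.sqrt(np.asarray(var)[k]))

def simulate_2b(n, seed=0, periods=3):
    """Simulate one sample following the description of dataset 2b."""
    rng = np.random.default_rng(seed)
    f1 = draw_mixture(rng, n, [0.5, 0.5], [1.0, -1.0], [2.0, 2.0])
    f2 = draw_mixture(rng, n, [0.3, 0.7], [0.5, -0.2143], [0.5, 0.1])
    x1 = rng.normal(0.0, np.sqrt(2.0), n)          # X_1 ~ N(0, 2)
    z1 = rng.normal(0.0, np.sqrt(2.0), n)          # Z_1 ~ N(0, 2)
    q1 = rng.normal(0.0, 1.0, n)                   # Q_1 ~ N(0, 1)
    u1 = rng.normal(size=(n, periods))             # U_{t,1}
    u0 = rng.normal(size=(n, periods))             # U_{t,0}
    col = lambda v: v[:, None]                     # broadcast over time periods
    # Potential outcomes with X_0 = 1.
    y1 = 2.0 + col(x1) + 2.0 * col(f1) + col(f2) + u1
    y0 = 1.0 + col(x1) + col(f1) + col(f2) + u0
    # Choice index with Z_0 = 1, and the implied decision.
    index = 0.5 + z1 + f1 + f2 + rng.normal(size=n)
    d = (index > 0).astype(int)
    y = col(d) * y1 + (1 - col(d)) * y0            # observed outcome
    # Test equations with Q_0 = 1; they load on f1 only.
    t1 = 1.0 + q1 + f1 + rng.normal(size=n)
    t2 = 1.0 + 2.0 * q1 + 0.5 * f1 + rng.normal(size=n)
    return dict(y=y, d=d, x1=x1, z1=z1, q1=q1, t1=t1, t2=t2,
                y1=y1, y0=y0, f1=f1, f2=f2)

data = simulate_2b(100_000)
pv1 = data["y1"].sum(axis=1)    # Y_1: present value in the treated state
pv0 = data["y0"].sum(axis=1)    # Y_0: present value in the untreated state
print(data["d"].mean(), (pv1 - pv0).mean())
```

In real data only `y`, `d`, the covariates and the tests would be observed; `y1`, `y0`, `f1` and `f2` are kept here only so that a simulated sample can be checked against the known parameter values.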

5. (Derive analytically if you want; it is a very nice exercise to do and you will not regret it, since you will use something similar in PS8.) Estimate from your results in the previous section the following:

a) Average Treatment on the Treated.
b) Average Treatment Effect.
c) Average effect of treatment for people at the margin of indifference between $D = 1$ and $D = 0$ (a nice way to do this numerically is to change the intercept of $I$ by very little and take the average treatment on the treated effect for those persons who actually change choice).
d) How would you estimate the Marginal Treatment Effect? (Can you derive it?)

6. Now let's look at the robustness of the method to changes in available information. This is going to be very important when making comparisons across methods. Suppose now that $f_2$ becomes available somehow, so that it is observed by you (the econometrician). This means we are back in a one factor model, since $f_2$ is now like an $X$ and a $Z$. Reestimate the model. Do your results change? (Hint: they shouldn't change much; to see why, check Heckman and Navarro-Lozano (2004).)

In this final stage we are going to give names to things to make it easier to understand. In the previous model, suppose that the choice being made is schooling but that now there are 3 levels of schooling, so now we have

$$I_1 = U_{V_1}$$
$$I_2 = Z\gamma_2 + f_1\alpha_{2,V1} + f_2\alpha_{2,V2} + U_{V_2}$$
$$I_3 = Z\gamma_3 + f_1\alpha_{3,V1} + f_2\alpha_{3,V2} + U_{V_3}$$
$$U_{V_j} \sim N(0, 1).$$

You should recognize the assumptions from problem set 1b. Now suppose that the outcomes you observe (i.e., the $Y_{t,j}$) in each schooling level are wages (now of course you only observe wages for the schooling level chosen, not for all 3). However, we now add employment to the decisions being made. That is, not only do I observe wages only for the schooling level chosen, but at any given time period I only observe wages if

$$E_t = W_t\rho_t + f_1\pi_{t,1} + f_2\pi_{t,2} + \eta_t > 0, \qquad \eta_t \sim N(0, 1).$$

That is, I only observe wages for those who choose to be employed (we could easily allow the employment decision to depend on the schooling level chosen too). Assume as before that everything is independent conditional on the factors.

7. Can you write the likelihood for this model?

8. What about an algorithm? (You do not need to program it, just write how you would do it.)

By now, you should be able to see that extending the model (say for more time periods, other choices, more than 2 choices, etc.) is pretty straightforward under the factor structure assumption. This, however, is only one way to do it. With what you have learned you should be able to figure out other methods and program them, since the principles are always the same.
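As an illustration of the numerical approach suggested in question 5(c) for the two-choice model, the following Python sketch simulates from a set of parameter values, shifts the intercept of $I$ by a small amount, and averages the present-value gain over the people who change choice. It is only a sketch under stated assumptions: the dataset 2b generating values stand in for your estimates, mixture second arguments are read as variances, and the names `draw_mixture`, `choice`, `eps` and `margin` are hypothetical. With those values the per-period gain is $Y_{t,1} - Y_{t,0} = X_0 + f_1 + U_{t,1} - U_{t,0}$, so the present-value gain is $3 + 3f_1$ plus noise.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400_000

def draw_mixture(p, mu, var):
    """Draw n values from sum_k p_k N(mu_k, var_k); var read as variance (assumption)."""
    k = rng.choice(len(p), size=n, p=p)
    return rng.normal(np.asarray(mu)[k], np.sqrt(np.asarray(var)[k]))

# Stand-ins for estimated parameters: the dataset 2b generating values.
f1 = draw_mixture([0.5, 0.5], [1.0, -1.0], [2.0, 2.0])
f2 = draw_mixture([0.3, 0.7], [0.5, -0.2143], [0.5, 0.1])
z1 = rng.normal(0.0, np.sqrt(2.0), n)
u_v = rng.normal(size=n)

# Present-value gain Y_1 - Y_0 summed over t = 1, 2, 3 (no discounting).
noise = (rng.normal(size=(n, 3)) - rng.normal(size=(n, 3))).sum(axis=1)
gain = 3.0 + 3.0 * f1 + noise

def choice(intercept):
    """D = 1(I > 0) for a given intercept of the choice index."""
    return (intercept + z1 + f1 + f2 + u_v) > 0

d = choice(0.5)
ate = gain.mean()                    # average treatment effect
att = gain[d].mean()                 # average treatment on the treated

eps = 0.05                           # small intercept shift (tuning choice)
switchers = choice(0.5 + eps) & ~d   # people induced into treatment by the shift
margin = gain[switchers].mean()      # average effect for people at the margin

print(ate, att, margin, switchers.sum())
```

A smaller `eps` gets closer to the margin of indifference but leaves fewer switchers to average over, so in practice the shift has to be balanced against simulation noise.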