Econ 371 Problem Set #4 Answer Sheet. P rice = (0.485)BDR + (23.4)Bath + (0.156)Hsize + (0.002)LSize + (0.090)Age (48.

Similar documents
Discussion Section 4 ECON 139/ Summer Term II

IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results

August 2012 EXAMINATIONS Solution Part I

ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2

2. Linear regression with multiple regressors

Rockefeller College University at Albany

MULTIPLE REGRESSION EXAMPLE

Quick Stata Guide by Liz Foster

Nonlinear Regression Functions. SW Ch 8 1/54/

Stepwise Regression. Chapter 311. Introduction. Variable Selection Procedures. Forward (Step-Up) Selection

Logs Transformation in a Regression Equation

Econometrics I: Econometric Methods

Multiple Linear Regression

Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software

Department of Economics Session 2012/2013. EC352 Econometric Methods. Solutions to Exercises from Week (0.052)

Module 5: Multiple Regression Analysis

Multicollinearity Richard Williams, University of Notre Dame, Last revised January 13, 2015

MODEL I: DRINK REGRESSED ON GPA & MALE, WITHOUT CENTERING

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Forecasting in STATA: Tools and Tricks

From this it is not clear what sort of variable that insure is so list the first 10 observations.

Lecture 15. Endogeneity & Instrumental Variable Estimation

Failure to take the sampling scheme into account can lead to inaccurate point estimates and/or flawed estimates of the standard errors.

SPSS Guide: Regression Analysis

25 Working with categorical data and factor variables

Interaction effects between continuous variables (Optional)

Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, Last revised February 21, 2015

Chapter 7: Simple linear regression Learning Objectives

Regression step-by-step using Microsoft Excel

DETERMINANTS OF CAPITAL ADEQUACY RATIO IN SELECTED BOSNIAN BANKS

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

Economics of Strategy (ECON 4550) Maymester 2015 Applications of Regression Analysis

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Standard errors of marginal effects in the heteroskedastic probit model

Using R for Linear Regression

From the help desk: Bootstrapped standard errors

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Regression Analysis: A Complete Example

ESTIMATING AVERAGE TREATMENT EFFECTS: IV AND CONTROL FUNCTIONS, II Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

Multiple Linear Regression in Data Mining

Chapter 13 Introduction to Linear Regression and Correlation Analysis

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

HURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009

2. What is the general linear model to be used to model linear trend? (Write out the model) = or

Handling missing data in Stata a whirlwind tour

Premaster Statistics Tutorial 4 Full solutions

Statistics 104 Final Project A Culture of Debt: A Study of Credit Card Spending in America TF: Kevin Rader Anonymous Students: LD, MH, IW, MY

Correlation and Regression

MODELING AUTO INSURANCE PREMIUMS

Simple linear regression

Correlated Random Effects Panel Data Models

Coefficient of Determination

Multinomial and Ordinal Logistic Regression

The average hotel manager recognizes the criticality of forecasting. However, most

Introduction to Regression and Data Analysis

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.

Hypothesis testing - Steps

Market Prices of Houses in Atlanta. Eric Slone, Haitian Sun, Po-Hsiang Wang. ECON 3161 Prof. Dhongde

Lab 5 Linear Regression with Within-subject Correlation. Goals: Data: Use the pig data which is in wide format:

Factors affecting online sales

Econometrics Problem Set #3

Interaction effects and group comparisons Richard Williams, University of Notre Dame, Last revised February 20, 2015

Stata Walkthrough 4: Regression, Prediction, and Forecasting

From the help desk: Swamy s random-coefficients model

Part 2: Analysis of Relationship Between Two Variables

Week TSX Index

ESTIMATING AN ECONOMIC MODEL OF CRIME USING PANEL DATA FROM NORTH CAROLINA BADI H. BALTAGI*

Introduction to Time Series Regression and Forecasting

I n d i a n a U n i v e r s i t y U n i v e r s i t y I n f o r m a t i o n T e c h n o l o g y S e r v i c e s

xtmixed & denominator degrees of freedom: myth or magic

Review of Bivariate Regression

Simple Regression Theory II 2010 Samuel L. Baker

Forecasting the US Dollar / Euro Exchange rate Using ARMA Models

Copyright 2007 by Laura Schultz. All rights reserved. Page 1 of 5

3.1. Solving linear equations. Introduction. Prerequisites. Learning Outcomes. Learning Style

Chapter 9 Assessing Studies Based on Multiple Regression

Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217

1 Simple Linear Regression I Least Squares Estimation

Solución del Examen Tipo: 1

outreg help pages Write formatted regression output to a text file After any estimation command: (Text-related options)

Lets suppose we rolled a six-sided die 150 times and recorded the number of times each outcome (1-6) occured. The data is

How To Run Statistical Tests in Excel

Creating plots and tables of estimation results Frame 1. Creating plots and tables of estimation results using parmest and friends

Data Analysis Methodology 1

Basic Statistical and Modeling Procedures Using SAS

Survey Data Analysis in Stata

Generalized Linear Models

5. Linear Regression

1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

Notes on Applied Linear Regression

We extended the additive model in two variables to the interaction model by adding a third term to the equation.

Using Stata 9 & Higher for OLS Regression Richard Williams, University of Notre Dame, Last revised January 8, 2015

A Panel Data Analysis of Corporate Attributes and Stock Prices for Indian Manufacturing Sector

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

Multivariate Logistic Regression

Linear Models in STATA and ANOVA

Transcription:

Econ 371 Problem Set #4 Answer Sheet 6.5 This question focuses on what s called a hedonic regression model; i.e., where the sales price of the home is regressed on the various attributes of the home. The idea of this model is to try to explain what attributes increase (or decrease) the value of the home. These types of models are often used for property tax assessments. In this case, the estimated model is: P rice = 119.2 + (0.485)BDR + (23.4)Bath + (0.156)Hsize + (0.002)LSize + (0.090)Age (48.8)P oor, (1) with R 2 = 0.72 and SER = 41.5. a. The first part of this question asks you to predict the increase in home value associated with adding a bathroom. The question notes that this done by taking up part of the existing bathroom (so that the overall square footage for the house is not changed). From the regression model, the expected change is given by the coefficient on Bath, since if we increase Bath by 1, then the expected P rice increases by 23.4 (or $23,400, since price is measured in thousands of dollars). b. The second part of question asks a similar question, only in this case the additional bedroom is not taken out of the family room, but instead represents a net increase in the size of the house (i.e., Hsize = 100 and Bath = 1). In this case, the increase in the price of the house would be: or $39,000. P rice = (23.4) Bath + (0.156) Hsize = 23.4 + 15.6 = 39 (2) c. The third part of the question asks you to imagine that the owner of a home allows the home to deteriorate to a P oor rating (i.e., so that the change is P oor = 1). In this case, the increase in the price of the house would be: P rice = (48.8) P oor = 48.8 (3) or $48,800. d. Finally, you are asked to compute R 2 for the regression. This is relatively easy to do, as you are told that the adjusted R 2, R 2, is 0.72. What we want to know is: What we know is Rearranging the above equation, we have or equivalently Substituting this expression into (4) yields: R 2 = 1 SSR T SS. (4) R 2 = 1 n 1 SSR n k 1 T SS. (5) n 1 SSR n k 1 T SS = 1 R 2, (6) SSR T SS = n k 1 [ ] 1 R2. (7) n 1 R 2 = 1 n k 1 n 1 [ 1 R2 ]. (8) Plugging in our values for n, k and R 2, we get ( ) 220 6 1 R 2 = 1 [1 0.72] = 1 213 [1 0.72] = 0.728. (9) 220 1 219 1

6.6 This question asks you To consider the causal effect of police on crime using data from a random sample of U.S. counties. You are told that a researcher plans to regress county crime rate on the size of the county s police force. a. The first question asks you to explain why this is likely to suffer from omitted variables bias and to suggest what variables you might want to add to control for such biases. There is no right answer here, but there are likely to be many factors influencing the county s crime rate, such as population density income level in the area the fraction of young males in the area, etc. Each of these factors may also be correlated with the regressor (the size of the county s police force). b. In the second part of the question, you are as to determine the likely direction of the bias. I ll use just one of the possible omitted variables above. Suppose that the crime rate is positively affected by the fraction of young males in the population, and that counties with high crime rates tend to hire more police. In this case, the size of the police force is likely to be positively correlated with the fraction of young males in the population leading to a positive value for the omitted variable bias so that ˆβ 1 > β 1. 7.7 This question continues question 6.5 above, providing standard errors for the estimated hedonic regression model. Specifically, we now have P rice = 119.2 +(0.485)BDR +(23.4)Bath +(0.156)Hsize +(0.002)LSize +(0.090)Age (48.8)P oor (23.90) (2.61) (8.94) (0.11) (0.00048) (0.311) (10.5) a. The first part of the question asks if the coefficient on BDR is statistically significantly different from zero. The corresponding p-value for this test is given by: p value = 2Φ ( t act ) ( ) ˆβ BDR 0 = 2Φ SE( ˆβ BDR ) ( ) = 2Φ 0.485 2.61 = 2Φ ( 0.186) < 2(0.4247) = 0.8494. We would not reject the null hypothesis that the bedroom coefficient is zero. b. The next part of the question notes that a typical five-bedroom house sells for much more that a twobedroom house. It is then asked whether the result in part (a) is consistent with this result. In this case, the results are consistent, because the coefficient on BDR measures the partial effect of the number of bedrooms holding house size (Hsize) constant. Yet, the typical five-bedroom house is much larger than the typical two-bedroom house. Thus, the results in (a) says little about the conventional wisdom. c. In this part of the question, you are asked to imagine that the homeowner has purchased 2000 square feet from an adjacent lot and to calculate the change in the value of her house. In this case, the only thing that has changed is the lot size. The 99% confidence interval for effect of lot size on price is 2000 [.002 ± 2.58(.00048)] or 1.52 to 6.48 (in thousands of dollars). d. Part (d) of the question asks if changes in how the lot size will affect the results; i.e., would another scale be more appropriate here? The fundamental results would not change with a change of scale. Choosing the scale of the variables should be done to make the regression results easy to read and to interpret. If the lot size were measured in thousands of square feet, the estimate coefficient would be 2 instead of 0.002. e. Finally, you are told that the F -statistic for omitting BDR and Age from the regression is F = 0.08. Using this information, you are asked to evaluate whether the coefficients on BDR and Age are jointly statitically different from zero at the 10% level. The 10% critical value from the F 2, distribution is 2.30. Because 0.08 < 2.30, the coefficients are not jointly significant at the 10% level. 2

7.9 This question asks you to consider the regression model: Y i = β 0 + β 1 X 1i + β 2 X 2i + u i (10) and to use Approach 2 to transform the regression model so as to allow a t-statistic test of the hypothesis of interest. a. The first hypothesis is in fact the same as the one we went through in class, since H 0 : β 1 β 2 = 0 is the same restriction as H 0 : β 1 = β 2 One can rewrite the model as: Y i = β 0 + β 1 X 1i + [ β 2 X 1i + β 2 X 1i ] + β 2 X 2i + u i = β 0 + (β 1 β 2 )X 1i + β 2 (X 1i + X 2i ) + u i = β 0 + γ 1 X 1i + β 2 W i + u i where γ 1 β 1 β 2 and W i X 1i + X 2i. Our hypothesis can be rewritten as involving a simple t-test. H 0 : γ 1 = 0 versus H 1 : γ 1 0 (11) b. In this second question, you are asked to test the hypothesis H 0 : β 1 + aβ 2 = 0. The trick is to add and subtract aβ 2 X 1i to our original equation, so that the coefficient of X 1i is our term of interest. Specifically, we get: Y i = β 0 + β 1 X 1i + [aβ 2 X 1i aβ 2 X 1i ] + β 2 X 2i + u i = β 0 + [β 1 X 1i + aβ 2 X 1i ] + [β 2 X 2i aβ 2 X 1i ] + u i = β 0 + (β 1 + aβ 2 )X 1i + β 2 (X 2i ax 1i ) + u i = β 0 + δ 1 X 1i + β 2 Wi + u i where δ 1 β 1 + aβ 2 and W i X 2i ax 1i. Our hypothesis can be rewritten as involving a simple t-test. H 0 : δ 1 = 0 versus H 1 : δ 1 0 (12) c. Finally, your are asked to test the hypothesis H 0 : β 1 + β 2 = 1, or equivalent H 0 : β 1 + β 2 1 = 0. This question is a bit trickier, but the idea is the same. We want to get one of the coefficients in the model to match the parameter combination of interest (i.e., β 1 + β 2 1). Suppose we add and subtract (β 2 1)X 1i to our original equation. We then get Y i = β 0 + β 1 X 1i + [(β 2 1)X 1i (β 2 1)X 1i ] + β 2 X 2i + u i = β 0 + [β 1 X 1i + (β 2 1)X 1i ] + [β 2 X 2i (β 2 1)X 1i ] + u i = β 0 + (β 1 + β 2 1)X 1i + β 2 (X 2i X 1i ) + X 1i + u i = β 0 + τ 1 X 1i + β 2 Z i + X 1i + u i where τ 1 β 1 + β 2 1 and Z i X 2i X 1i. We are almost done, except that we have the extra X 1i on the right hand side of our equation. If we simply subtract it from both sides, we get: Y i X 1i = β 0 + τ 1 X 1i + β 2 Z i + u i Ỹ i = β 0 + τ 1 X 1i + β 2 Z i + u i where Ỹi Y i X 1i. Now our hypothesis can be rewritten as H 0 : τ 1 = 0 versus H 1 : τ 1 0 (13) involving a simple t-test. 3

The two empirical exercises in this homework use the same dataset: TeachingRatings. The data can be downloaded from the Web site listed in the assignment (which you can also reach from the class website). A program that carries all of the tasks for problems E6.1 and E7.2 is appended to this answer sheet. E4.1 a. The first task you are asked to do is to regress the course evaluations (Course E val) on age (Beauty) and to report the estimated slope. The results are as follows: Course Eval = 4.00 +0.133Age, R 2 = 0.04 The slope, then, for this regression is 0.133. (0.025) (0.032) b. Next, you are asked to run an additional regression including some of the other variables in the data set. In my case, I added all of the other variables except Age. The resulting parameter estimates are: Variable Parameter Est. Standard Error p value beauty 0.166 0.032 < 0.001 intro 0.011 0.056 0.840 onecredit 0.635 0.108 < 0.001 female -0.173 0.049 0.001 minority -0.167 0.0677 0.014 nnenglish -0.2447 0.094 0.009 Intercept 4.068 0.037 < 0.001 The fact that the coefficient on beauty does not change substantially with the addition of these other variables suggests that the original regression does not suffer from important omitted variables bias. c. Finally, you are asked to predict the evaluations of Professor Smith, who is a black male with average beauty and is a native English speaker. This would correspond to: Course Eval = (0.166 0)+(0.011 0)+(0.634 0)+(0.173 0)+(0.167 1)+(0.244 0)+4.068 = 3.901 (14) E7.1 This question continues the analysis from E6.1, reported above. a. Your first task is to construct a 95% confidence interval around the slope coefficient. This can be read directly from the Stata output from the first regression. Specifically, the 95% confidence interval is given by: (0.069,0.197). b. The second question asks which variables should be included in the regression. The results seem to support the inclusion of all of the regressors except Intro, which has a large p-value (0.840). One could rerun the model excluding this variable, but the results (in terms of the confidence interval are unchanged out to three significant digits: (0.104,0.228). 4

; Problem Set #4 ; # delimit ; clear; cap log close; ; Specify the output file ; log using Problemset4.log,replace; set more off; ; Read in and summarize the data ; use TeachingRatings.dta; describe; summarize ; ; Estimate the model for question E6.1a ; reg course_eval beauty,r; ; Estimate the model for question E6.1b

; reg course_eval beauty intro onecredit female minority nnenglish,r; scalar Smith = _b[_cons] + 1_b[minority]; scalar list; reg course_eval beauty onecredit female minority nnenglish,r; log close; clear; exit;

Problemset4.log ------ log: C:\Documents and Settings\jaherrig\My Documents\Classes\Economics 371\Stata\Problemset4.log log type: text opened on: 19 Oct 2008, 13:09:13. set more off;. ;. > Read in and summarize the data > > ;. use TeachingRatings.dta;. describe; Contains data from TeachingRatings.dta obs: 463 vars: 8 10 Dec 2005 14:29 size: 15,279 (98.5% of memory free) ------ storage display value variable name type format label variable label ------ minority float %9.0g Minority age float %9.0g Professor's age female float %9.0g female = 1 onecredit byte %8.0g Equal 1 if a one-credit course beauty float %9.0g course_eval float %9.0g intro float %9.0g nnenglish float %9.0g ------ Sorted by:. summarize ; Variable Obs Mean Std. Dev. Min Max -------------+------------------- minority 463.1382289.3455134 0 1 age 463 48.36501 9.802742 29 73 female 463.4211663.4942802 0 1 onecredit 463.0583153.2345922 0 1 beauty 463 4.75e-08.7886477-1.450494 1.970023 -------------+------------------- course_eval 463 3.998272.5548656 2.1 5 intro 463.3390929.4739135 0 1 nnenglish 463.0604752.2386229 0 1. ;. > Estimate the model for question E6.1a > > ;. reg course_eval beauty,r; Linear regression Number of obs = 463 F( 1, 461) = 16.94 Page 1

Problemset4.log Prob > F = 0.0000 R-squared = 0.0357 Root MSE =.54545 Robust course_eval Coef. Std. Err. t P> t [95% Conf. Interval] -------------+--------------------------- beauty.1330014.0323189 4.12 0.000.0694908.1965121 _cons 3.998272.0253493 157.73 0.000 3.948458 4.048087. ;. > Estimate the model for question E6.1b > > ;. reg course_eval beauty intro onecredit female minority nnenglish,r; Linear regression Number of obs = 463 F( 6, 456) = 17.03 Prob > F = 0.0000 R-squared = 0.1546 Root MSE =.51351 Robust course_eval Coef. Std. Err. t P> t [95% Conf. Interval] -------------+--------------------------- beauty.16561.0315686 5.25 0.000.1035721.2276478 intro.011325.0561741 0.20 0.840 -.0990673.1217173 onecredit.6345271.1080864 5.87 0.000.4221178.8469364 female -.1734774.0494898-3.51 0.001 -.2707337 -.0762212 minority -.1666154.0674115-2.47 0.014 -.2990912 -.0341397 nnenglish -.2441613.0936345-2.61 0.009 -.42817 -.0601526 _cons 4.068289.0370092 109.93 0.000 3.995559 4.141019. scalar Smith = _b[_cons] + 1_b[minority];. scalar list; Smith = 3.9016734. reg course_eval beauty onecredit female minority nnenglish,r; Linear regression Number of obs = 463 F( 5, 457) = 20.17 Prob > F = 0.0000 R-squared = 0.1546 Root MSE =.51297 Robust course_eval Coef. Std. Err. t P> t [95% Conf. Interval] -------------+--------------------------- beauty.1660434.0314939 5.27 0.000.1041526.2279342 onecredit.6413254.0966112 6.64 0.000.4514682.8311826 female -.1741755.0495102-3.52 0.000 -.2714713 -.0768797 minority -.1647853.0681097-2.42 0.016 -.2986324 -.0309382 nnenglish -.2480077.0928702-2.67 0.008 -.4305133 -.0655021 _cons 4.072006.0323399 125.91 0.000 4.008453 4.13556 Page 2

Problemset4.log. log close; log: C:\Documents and Settings\jaherrig\My Documents\Classes\Economics 371\Stata\Problemset4.log log type: text closed on: 19 Oct 2008, 13:09:13 ------ Page 3