Department of Economics Session 2012/2013. EC352 Econometric Methods. Solutions to Exercises from Week 10


University of Essex, Spring Term. Dr Gordon Kemp.

1 Problem 13.7

This exercise refers back to Equation (13.12) from Wooldridge:

$$\widehat{\log(\mathit{durat})} = \underset{(0.031)}{\hat\beta_0} + \underset{(0.0447)}{\hat\delta_0}\,\mathit{afchange} + \underset{(0.047)}{\hat\beta_1}\,\mathit{highearn} + \underset{(0.069)}{\hat\delta_1}\,\mathit{afchange} \cdot \mathit{highearn}, \qquad n = 5626 \tag{1}$$

with standard errors in parentheses beneath the corresponding estimates. The equation is part of Example 13.4, Effect of Worker Compensation Laws on Weeks out of Work. This refers to a study by Meyer, Viscusi and Durbin (1995), which examined the impact of a change in the cap on weekly earnings covered by workers' compensation (for injuries etc.).

Here the control group is low income workers while the treatment group is high income workers. Low income workers had earnings below the original cap, so raising the cap did not raise the amount of benefit they could get. In contrast, high income workers had earnings at or above the cap, so raising the cap did increase the amount of benefit they could get. Raising the amount of benefit a worker can get makes it more attractive for the worker to remain on benefit for longer (in the event of an injury). The data used in the example are given in INJURY.DTA and the estimates were obtained using the observations for Kentucky.

1. Using the data in injury.dta for Kentucky, the estimated equation when afchange is dropped from Equation (13.12) is:

$$\widehat{\log(\mathit{durat})} = \underset{(0.022)}{\tilde\beta_0} + \underset{(0.042)}{\tilde\beta_1}\,\mathit{highearn} + \underset{(0.052)}{\tilde\delta_1}\,\mathit{afchange} \cdot \mathit{highearn}, \qquad n = 5626 \tag{2}$$

Is it surprising that the estimate on the interaction term is fairly close to that in Equation (13.12)? Explain.

The equation of interest is:

$$\log(\mathit{durat}) = \beta_0 + \delta_0\,\mathit{afchange} + \beta_1\,\mathit{highearn} + \delta_1\,\mathit{afchange} \cdot \mathit{highearn}$$

see Equation (13.10).

                       Before                 After                                          After - Before
Control                $\beta_0$              $\beta_0 + \delta_0$                           $\delta_0$
Treatment              $\beta_0 + \beta_1$    $\beta_0 + \delta_0 + \beta_1 + \delta_1$      $\delta_0 + \delta_1$
Treatment - Control    $\beta_1$              $\beta_1 + \delta_1$                           $\delta_1$

Table 1: Illustration of the Difference-in-Differences Estimator

The estimated coefficient on afchange in Equation (13.12) is very small in magnitude (both in absolute terms and compared with the estimated coefficients on highearn and on the interaction term afchange·highearn) and is statistically very insignificant. Consequently it is no surprise that the results change very little when afchange is dropped from Equation (13.12) while highearn and the interaction term are kept: the change is easily explained by sampling variability.

2. When afchange is included but highearn is dropped, the result is:

$$\widehat{\log(\mathit{durat})} = \underset{(0.023)}{\check\beta_0} + \underset{(0.040)}{\check\delta_0}\,\mathit{afchange} + \underset{(0.050)}{\check\delta_1}\,\mathit{afchange} \cdot \mathit{highearn}, \qquad n = 5626 \tag{3}$$

Why is the coefficient on the interaction term now so much larger than in Equation (13.12)? Explain. [Hint: In Equation (13.10), what is the assumption being made about the treatment and control groups if $\beta_1 = 0$?]

The coefficient on afchange is the increase in log(durat) for low earners. Raising the cap should have no effect on these workers, since their income was below the cap anyway, and hence we would assume that $\delta_0 = 0$. This is consistent with the estimates in Equation (13.12) and the results in part 1.

In contrast, dropping highearn from Equation (13.12) means that we are assuming that $\beta_1 = 0$, i.e., that before the policy change there was no difference in average log durations between the high and low income groups. This is not very plausible since, for example, the two groups may tend to do different jobs for which the effects of injuries may be quite different. In the results from Equation (13.12) we see that the coefficient on highearn, namely $\beta_1$, is strongly significantly different from zero (the t-statistic exceeds 5), so we would expect dropping this variable to have quite a strong impact.
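The cell-mean logic of Table 1, and the reason the interaction coefficient grows in part 2, can be illustrated with a small Python sketch (Python is used here purely for illustration; the solutions themselves use Stata). The observations are hypothetical toy values, not the Kentucky sample, and `rows` and `cell_mean` are illustrative names. The sketch relies on a standard fact: in a regression on a full set of group indicators, the OLS coefficients are differences of group means. When highearn is dropped, the regressors (constant, afchange, afchange·highearn) distinguish only three groups, so the interaction coefficient becomes the after-period gap between high and low earners, which is $\beta_1 + \delta_1$ rather than $\delta_1$.

```python
from statistics import mean

# Hypothetical toy observations: (afchange, highearn, log_durat).
rows = [
    (0, 0, 1.0), (0, 0, 1.2),   # control, before
    (0, 1, 1.3), (0, 1, 1.5),   # treatment, before
    (1, 0, 1.1), (1, 0, 1.1),   # control, after
    (1, 1, 1.6), (1, 1, 1.6),   # treatment, after
]

def cell_mean(after, high):
    """Average log duration in one (afchange, highearn) cell."""
    return mean([y for a, h, y in rows if a == after and h == high])

# Coefficients of the full model, read off the four cell means as in Table 1.
beta0 = cell_mean(0, 0)
delta0 = cell_mean(1, 0) - cell_mean(0, 0)
beta1 = cell_mean(0, 1) - cell_mean(0, 0)
delta1 = (cell_mean(1, 1) - cell_mean(0, 1)) - (cell_mean(1, 0) - cell_mean(0, 0))

# With highearn dropped, the interaction coefficient is the after-period
# difference between high and low earners.
interaction_without_highearn = cell_mean(1, 1) - cell_mean(1, 0)

print(round(delta1, 3), round(interaction_without_highearn, 3))
```

In this toy example the difference-in-differences estimate is 0.2, while the interaction coefficient without highearn is 0.5, which equals the pre-existing group gap (0.3) plus the true policy effect (0.2).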
2 Exercise C14.7

This question is about analyzing the impact of execution rates and unemployment on murder rates at the US state level, using panel data to control for unobserved state-specific factors. The data are contained in the file murder.dta and consist of a panel of all 50 states plus the District of Columbia with three waves: 1987, 1990, and 1993. This gives 153 observations.

1. Consider the unobserved effects model:

$$\mathit{mrdrte}_{it} = \theta_t + \beta_1\,\mathit{exec}_{it} + \beta_2\,\mathit{unem}_{it} + a_i + u_{it}, \qquad i = 1, \ldots, 51, \quad t = 1987, 1990, 1993$$

where:
$\theta_t$ denotes year-specific intercepts;
$a_i$ denotes the unobserved state-specific effects;
$\mathit{mrdrte}_{it}$ is the murder rate (per 100,000 population) for state i in year t;
$\mathit{exec}_{it}$ is total executions over the past 3 years for state i in year t; and
$\mathit{unem}_{it}$ is the unemployment rate for state i in year t.

If past executions have a deterrent effect, what should be the sign of $\beta_1$? What sign do you think $\beta_2$ should have? Explain.

$\beta_1$ should be negative if, ceteris paribus, past executions have a deterrent effect. $\beta_2$ is likely to be positive, since unemployment tends to reflect relative socio-economic deprivation, which is widely thought to be positively related to crime rates, including rates of violent crimes such as murder.

2. Using just the years 1990 and 1993, estimate the equation from part (1) by pooled OLS. Ignore the serial correlation problem in the composite errors. Do you find evidence for a deterrent effect?

Running OLS on the data for just the years 1990 and 1993 gives:

    . regress mrdrte exec unem if year > 87

(Stata output: F(2, 99) = 5.08; coefficient table for exec, unem and _cons.)

Ignoring any problems in calculating standard errors that result from the unobserved effects, we do not find a deterrent effect: the coefficient on exec is positive (not negative) but is insignificant (p-value of 0.663).

3. Now, using 1990 and 1993, estimate the equation by fixed effects. You may use first differencing since you are only using two years of data. Now, is there any evidence of a deterrent effect?
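To see why first differencing handles the unobserved state effect, it may help to write out the step explicitly. The following is a short derivation from the model in part 1, using $\Delta$ to denote the change from 1990 to 1993 (the $\Delta$ notation is introduced here for illustration):

```latex
\begin{align*}
\mathit{mrdrte}_{i,93} &= \theta_{93} + \beta_1\,\mathit{exec}_{i,93} + \beta_2\,\mathit{unem}_{i,93} + a_i + u_{i,93} \\
\mathit{mrdrte}_{i,90} &= \theta_{90} + \beta_1\,\mathit{exec}_{i,90} + \beta_2\,\mathit{unem}_{i,90} + a_i + u_{i,90} \\
\Delta \mathit{mrdrte}_i &= (\theta_{93} - \theta_{90}) + \beta_1\,\Delta \mathit{exec}_i + \beta_2\,\Delta \mathit{unem}_i + \Delta u_i
\end{align*}
```

Subtracting the 1990 equation from the 1993 equation removes $a_i$, so any correlation between the state effect and the regressors no longer biases the estimates. The constant in the differenced regression estimates the change in the year intercept, $\theta_{93} - \theta_{90}$, and the slope on the change in executions is still $\beta_1$.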

The first differencing approach is easy to implement here as the variables have already been created in the data set:
cmrdrte is mrdrte minus its first lag (giving the change in the murder rate);
cexec is exec minus its first lag (giving the change in the total number of executions); and
cunem is unem minus its first lag (giving the change in the unemployment rate).

Running the first difference regression using just 1990 and 1993 gives:

    . regress cmrdrte cexec cunem if year==93

(Stata output: F(2, 48) = 2.96; coefficient table for cexec, cunem and _cons.)

There is now evidence of a deterrent effect, since the coefficient on cexec is negative: every additional execution in a state reduces the murder rate in that state in the next three-year period by about 1 per million. Furthermore, this effect is fairly statistically significant (p-value of 0.021).

4. Compute the heteroskedasticity-robust standard error for the estimation in part (3). It will be easiest to use first differencing.

States differ in many ways, so we may worry about heteroskedasticity. Re-running the regression from part (3) but generating heteroskedasticity-robust standard errors gives:

    . regress cmrdrte cexec cunem if year==93, vce(robust)

(Stata output: linear regression with robust standard errors, Number of obs = 51; coefficient table for cexec, cunem and _cons.)

The robust standard errors indicate that the deterrent effect is now highly statistically significant (p-value is 0 to 3dp).

5. Find the state that has the largest number for the execution variable in 1993. How much bigger is this value than the next highest value?

Browsing through the data in Stata reveals that the highest observed value for the execution variable is 34 for Texas (id 44) in 1993. The next largest value in 1993 was 11 for Virginia (id 47), so the value for Texas is 23 higher (about three times as large), which is very high indeed.

6. Estimate the equation using first differences, dropping Texas from the analysis. Compute the usual and heteroskedasticity-robust standard errors. Now what do you find? What is going on?

Dropping Texas and rerunning the first-difference regression for 1993 with the usual OLS standard errors gives:

    . regress cmrdrte cexec cunem if year==93 & id!=44

(Stata output: F(2, 47) = 0.32; coefficient table for cexec, cunem and _cons.)

so the deterrent effect is rather smaller and is no longer statistically significant (p-value of 0.523). Using heteroskedasticity-robust standard errors gives:

    . regress cmrdrte cexec cunem if year==93 & id!=44, vce(robust)

(Stata output: linear regression with robust standard errors, Number of obs = 50, F(2, 47) = 0.54; coefficient table for cexec, cunem and _cons.)

With heteroskedasticity-robust standard errors and Texas excluded, the deterrent effect is still insignificant (p-value of 0.398). In fact, dropping Texas both reduces the magnitude of the estimated deterrent effect quite substantially and increases the standard errors: the usual OLS standard error rises, and so does the heteroskedasticity-robust one (to 0.079). Clearly, including or excluding Texas has very dramatic effects on the results, which suggests that it is an outlier.

7. Use all three years of data and estimate by fixed effects. Include Texas in the analysis. Discuss the size and statistical significance of the deterrent effect compared with only using 1990 and 1993.

To run this regression in Stata we use the xtreg command with the fe option. Doing so gives:

    . xtreg mrdrte exec unem, fe

(Stata output: fixed-effects (within) regression, group variable id, Number of obs = 153, Number of groups = 51, 3 observations per group, F(2, 100) = 0.24; coefficient table for exec, unem and _cons.)

The deterrent effect is similar to that which we estimated using the first differencing approach with just the data from 1990 and 1993; however, it is no longer statistically significant. We could use standard errors clustered by id:

    . xtreg mrdrte exec unem, fe vce(cluster id)

(Stata output: fixed-effects (within) regression, group variable id, Number of obs = 153, Number of groups = 51, 3 observations per group, F(2, 50) = 1.23, standard errors adjusted for 51 clusters in id; coefficient table for exec, unem and _cons.)

Even though the standard errors change quite a lot, the deterrent effect is still insignificant.
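Parts 3 and 7 rest on the fact that, with exactly two periods, first differencing and fixed effects (the within estimator with a period effect) give the same slope; the estimates differ only once the third year is added. A minimal Python sketch of that equivalence follows (Python is used for illustration; it is not the Stata used in these solutions). The panel is a hypothetical four-unit example with a single regressor, not the murder data, and `panel`, `slope` and `two_way_demean` are illustrative names.

```python
from statistics import mean

# Hypothetical balanced two-period panel: unit -> ((x_1, y_1), (x_2, y_2)).
panel = {
    "A": ((0.0, 5.0), (1.0, 4.6)),
    "B": ((2.0, 9.1), (2.0, 9.4)),
    "C": ((1.0, 7.0), (4.0, 6.2)),
    "D": ((0.0, 3.0), (0.0, 3.5)),
}
units = list(panel)
periods = (0, 1)

def slope(xs, ys):
    """OLS slope from a regression of ys on a constant and xs."""
    xbar, ybar = mean(xs), mean(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx

# First differences: regress the change in y on a constant and the change in x.
dx = [panel[i][1][0] - panel[i][0][0] for i in units]
dy = [panel[i][1][1] - panel[i][0][1] for i in units]
fd_slope = slope(dx, dy)

def two_way_demean(col):
    """Remove unit means and period means (adding back the grand mean)."""
    v = {(i, t): panel[i][t][col] for i in units for t in periods}
    grand = mean(list(v.values()))
    unit_mean = {i: mean([v[i, t] for t in periods]) for i in units}
    time_mean = {t: mean([v[i, t] for i in units]) for t in periods}
    return [v[i, t] - unit_mean[i] - time_mean[t] + grand
            for i in units for t in periods]

# Within estimator: regress demeaned y on demeaned x (no constant needed).
xd, yd = two_way_demean(0), two_way_demean(1)
fe_slope = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

print(round(fd_slope, 6), round(fe_slope, 6))
```

Both estimators give a slope of about -0.4 on this toy panel. The standard errors reported by the two approaches need not agree in general, which is one reason the solutions compute them separately.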


More information

Using Stata 9 & Higher for OLS Regression Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 8, 2015

Using Stata 9 & Higher for OLS Regression Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 8, 2015 Using Stata 9 & Higher for OLS Regression Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 8, 2015 Introduction. This handout shows you how Stata can be used

More information

I n d i a n a U n i v e r s i t y U n i v e r s i t y I n f o r m a t i o n T e c h n o l o g y S e r v i c e s

I n d i a n a U n i v e r s i t y U n i v e r s i t y I n f o r m a t i o n T e c h n o l o g y S e r v i c e s I n d i a n a U n i v e r s i t y U n i v e r s i t y I n f o r m a t i o n T e c h n o l o g y S e r v i c e s Linear Regression Models for Panel Data Using SAS, Stata, LIMDEP, and SPSS * Hun Myoung Park,

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

is paramount in advancing any economy. For developed countries such as

is paramount in advancing any economy. For developed countries such as Introduction The provision of appropriate incentives to attract workers to the health industry is paramount in advancing any economy. For developed countries such as Australia, the increasing demand for

More information
