# Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, Last revised February 21, 2015

 To view this video please enable JavaScript, and consider upgrading to a web browser that supports HTML5 video
Save this PDF as:

Size: px
Start display at page:

Download "Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 21, 2015"

## Transcription

1 Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, Last revised February 21, 2015 References: Long 1997, Long and Freese 2003 & 2006 & 2014, Cameron & Trivedi s Microeconomics Using Stata Revised Edition, 2010 Overview. Marginal effects are computed differently for discrete (i.e. categorical) and continuous variables. Thus handout will explain the difference between the two. With binary independent variables, marginal effects measure discrete change, i.e. how do predicted probabilities change as the binary independent variable changes from 0 to 1? Marginal effects for continuous variables measure the instantaneous rate of change (defined shortly). They are popular in some disciplines (e.g. Economics) because they often provide a good approximation to the amount of change in Y that will be produced by a 1-unit change in X k. But then again, they often do not. Example. We will show Marginal Effects at the Means (MEMS) for both the discrete and continuous independent variables in the following example.. use clear. logit grade gpa tuce i.psi, nolog Logistic regression Number of obs = 32 LR chi2(3) = Prob > chi2 = Log likelihood = Pseudo R2 = grade Coef. Std. Err. z P> z [95% Conf. Interval] gpa tuce psi _cons margins, dydx(*) atmeans Conditional marginal effects Number of obs = 32 Expression : Pr(grade), predict() dy/dx w.r.t. : gpa tuce 1.psi at : gpa = (mean) tuce = (mean) 0.psi =.5625 (mean) 1.psi =.4375 (mean) dy/dx Std. Err. z P> z [95% Conf. Interval] gpa tuce psi Note: dy/dx for factor levels is the discrete change from the base level Marginal Effects for Continuous Variables Page 1

2 Discrete Change for Categorical Variables. Categorical variables, such as psi, can only take on two values, 0 and 1. It wouldn t make much sense to compute how P(Y=1) would change if, say, psi changed from 0 to.6, because that cannot happen. The MEM for categorical variables therefore shows how P(Y=1) changes as the categorical variable changes from 0 to 1, holding all other variables at their means. That is, for a categorical variable X k Marginal Effect X k = Pr(Y = 1 X, X k = 1) Pr(y=1 X, X k = 0) In the current case, the MEM for psi of.456 tells us that, for two hypothetical individuals with average values on gpa (3.12) and tuce (21.94), the predicted probability of success is.456 greater for the individual in psi than for one who is in a traditional classroom. To confirm, we can easily compute the predicted probabilities for those hypothetical individuals, and then compute the difference between the two.. margins psi, atmeans Adjusted predictions Number of obs = 32 Expression : Pr(grade), predict() at : gpa = (mean) tuce = (mean) 0.psi =.5625 (mean) 1.psi =.4375 (mean) Margin Std. Err. z P> z [95% Conf. Interval] psi display For categorical variables with more than two possible values, e.g. religion, the marginal effects show you the difference in the predicted probabilities for cases in one category relative to the reference category. So, for example, if relig was coded 1 = Catholic, 2 = Protestant, 3 = Jewish, 4 = other, the marginal effect for Protestant would show you how much more (or less) likely Protestants were to succeed than were Catholics, the marginal effect for Jewish would show you how much more (or less) likely Jews were to succeed than were Catholics, etc. Keep in mind that these are the marginal effects when all other variables equal their means (hence the term MEMs); the marginal effects will differ at other values of the Xs. Instantaneous rates of change for continuous variables. What does the MEM for gpa of.534 mean? It would be nice if we could say that a one unit increase in gpa will produce a.534 increase in the probability of success for an otherwise average individual. Sometimes statements like that will be (almost) true, but other times they won t. For example, if an Marginal Effects for Continuous Variables Page 2

3 average individual (average meaning gpa = 3.12, tuce = 21.94, psi =.4375) saw a one point increase in their gpa, here is how their predicted probability of success would change:. margins, at(gpa = ( )) atmeans Adjusted predictions Number of obs = 32 Expression : Pr(grade), predict() 1._at : gpa = tuce = (mean) 0.psi =.5625 (mean) 1.psi =.4375 (mean) 2._at : gpa = tuce = (mean) 0.psi =.5625 (mean) 1.psi =.4375 (mean) Margin Std. Err. z P> z [95% Conf. Interval] _at display Note that (a) the predicted increase of.598 is actually more than the MEM for gpa of.534, and (b) in reality, gpa couldn t go up 1 point for a person with an average gpa of MEMs for continuous variables measure the instantaneous rate of change, which may or may not be close to the effect on P(Y=1) of a one unit increase in X k. The appendices explain the concept in detail. What the MEM more or less tells you is that, if, say, X k increased by some very small amount (e.g..001), then P(Y=1) would increase by about.001*.534 = , e.g.. margins, at(gpa = ( )) atmeans noatlegend Adjusted predictions Number of obs = 32 Expression : Pr(grade), predict() Margin Std. Err. z P> z [95% Conf. Interval] _at display Marginal Effects for Continuous Variables Page 3

4 Put another way, for a continuous variable X k, Marginal Effect of X k = limit [Pr(Y = 1 X, X k +Δ) Pr(y=1 X, X k )] / Δ ] as Δ gets closer and closer to 0 The appendices show how to get an exact solution for this. There is no guarantee that a bigger increase in X k, e.g. 1, would produce an increase of 1*.534=.534. This is because the relationship between X k and P(Y = 1) is nonlinear. When X k is measured in small units, e.g. income in dollars, the effect of a 1 unit increase in X k may match up well with the MEM for X k. But, when X k is measured in larger units (e.g. income in millions of dollars) the MEM may or may not provide a very good approximation of the effect of a one unit increase in X k. That is probably one reason why instantaneous rates of change for continuous variables receive relatively little attention, at least in Sociology. More common are approaches which focus on discrete changes, which we will discuss later. Conclusion. Marginal effects can be an informative means for summarizing how change in a response is related to change in a covariate. For categorical variables, the effects of discrete changes are computed, i.e., the marginal effects for categorical variables show how P(Y = 1) is predicted to change as X k changes from 0 to 1 holding all other Xs equal. This can be quite useful, informative, and easy to understand. For continuous independent variables, the marginal effect measures the instantaneous rate of change. If the instantaneous rate of change is similar to the change in P(Y=1) as X k increases by one, this too can be quite useful and intuitive. However, there is no guarantee that this will be the case; it will depend, in part, on how X k is scaled. Subsequent handouts will show how the analysis of discrete changes in continuous variables can make their effects more intelligible. Marginal Effects for Continuous Variables Page 4

5 Appendix A: AMEs for continuous variables, computed manually Calculus can be used to compute marginal effects, but Cameron and Trivedi (Microeconometrics using Stata, Revised Edition, 2010, section 10,6.10, pp ) show that they can also be computed manually. The procedure is as follows: Compute the predicted values using the observed values of the variables. We will call this prediction1. Change one of the continuous independent variables by a very small amount. Cameron and Trivedi suggest using the standard deviation of the variable divided by We will refer to this as delta (Δ). Compute the new predicted values for each case. Call this prediction2. For each case, compute xxxxxx = pppppppppppppppppppp2 pppppppppppppppppppp1 ΔΔ Compute the mean value of xme. This is the AME for the variable in question. Here is an example:. * Appendix A: Compute AMEs manually. webuse nhanes2f, clear. * For convenience, keep only nonmissing cases. keep if!missing(diabetes, female, age) (2 observations deleted). clonevar xage = age. sum xage Variable Obs Mean Std. Dev. Min Max xage gen xdelta = r(sd)/1000. logit diabetes i.female xage, nolog Logistic regression Number of obs = LR chi2(2) = Prob > chi2 = Log likelihood = Pseudo R2 = diabetes Coef. Std. Err. z P> z [95% Conf. Interval] 1.female xage _cons Marginal Effects for Continuous Variables Page 5

6 . margins, dydx(xage) Average marginal effects Number of obs = Expression : Pr(diabetes), predict() dy/dx w.r.t. : xage dy/dx Std. Err. z P> z [95% Conf. Interval] xage predict xage1 (option pr assumed; Pr(diabetes)). replace xage = xage + xdelta xage was byte now float (10335 real changes made). predict xage2 (option pr assumed; Pr(diabetes)). gen xme = (xage2 - xage1) / xdelta. sum xme Variable Obs Mean Std. Dev. Min Max xme end of do-file Marginal Effects for Continuous Variables Page 6

7 Appendix B: Technical Discussion of Marginal Effects (Optional) In binary regression models, the marginal effect is the slope of the probability curve relating X k to Pr(Y=1 X), holding all other variables constant. But what is the slope of a curve??? A little calculus review will help make this clearer. Simple Explanation. Draw a graph of F(X k ) against X k, holding all the other X s constant (e.g. at their means). Chose 2 points, [X k, F(X k )] and [X k + Δ, F(X k + Δ)]. When Δ is very very small, the slope of the line connecting the two points will equal or almost equal the marginal effect of X k. More Detailed Explanation. Again, what is the slope of a curve? Intuitively, think of it this way. Draw a graph of F(X) against X, e.g. F(X) = X 2. Chose specific values of X and F(X), e.g. [2, F(2)]. Choose another point, e.g. [8, F(8)]. Draw a line connecting the points. This line has a slope. The slope is the average rate of change. Now, choose another point that is closer to [2, F(2)], e.g. [7, F(7)]. Draw a line connecting these points. This too will have a slope. Keep on choosing points that are closer to [2, F(2)]. The instantaneous rate of change is the limit of the slopes for the lines connecting [X, F(X)] and [X+Δ, F(X + Δ)] as Δ gets closer and closer to 0. Source: See the above link for more information Calculus is used to compute slopes (& marginal effects). For example, if Y = X 2, then the slope is 2X. Hence, if X = 2, the slope is 4. The following table illustrates this. Note that, as Δ gets smaller and smaller, the slope gets closer and closer to 4. Marginal Effects for Continuous Variables Page 7

8 F(X) = X^2 X = 2 F(2) = 4 δ X+δ F(X+δ) Change in F(X) Change in X Slope Marginal effects are also called instantaneous rates of change; you compute them for a variable while all other variables are held constant. The magnitude of the marginal effect depends on the values of the other variables and their coefficients. The Marginal Effect at the Mean (MEM) is popular (i.e. compute the marginal effects when all x s are at their mean) but many think that Average Marginal Effects (AMEs) are superior. Logistic Regression. Again, calculus is used to compute the marginal effects. In the case of logistic regression, F(X) = P(Y=1 X), and Returning to our earlier example, Marginal Effect for X k = P(Y=1 X) * P(Y = 0 X) * b k.. use clear. logit grade gpa tuce psi, nolog Logistic regression Number of obs = 32 LR chi2(3) = Prob > chi2 = Log likelihood = Pseudo R2 = grade Coef. Std. Err. z P> z [95% Conf. Interval] gpa tuce psi _cons adjust gpa tuce psi, pr Dependent variable: grade Equation: grade Command: logit Covariates set to mean: gpa = , tuce = , psi = All pr Key: pr = Probability Marginal Effects for Continuous Variables Page 8

9 . mfx Marginal effects after logit y = Pr(grade) (predict) = variable dy/dx Std. Err. z P> z [ 95% C.I. ] X gpa tuce psi* (*) dy/dx is for discrete change of dummy variable from 0 to 1 Looking specifically at GPA when all variables are at their means, pr(y=1 X) =.2528, Pr(Y=0 X) =.7472, and b GPA = The marginal effect at the mean for GPA is therefore Marginal Effect of GPA = P(Y=1 X) * P(Y = 0 X) * b GPA =.2528 *.7472 * =.5339 The following table again shows you that, in logistic regression, as the distance between two points gets smaller and smaller, i.e. as Δ gets closer and closer to 0, the slope of the line connecting the points gets closer and closer to the marginal effect. Logistic Regression F(X,GPA) = P(Y=1 X, GPA) GPA= Other X's at Mean F(X, ) = δ GPA+δ F(X,GPA+δ) Change in F(X,GPA) Change in GPA Slope E Probit. In probit, the marginal effect is Marginal Effect for X k = Φ(XB) * b k where Φ is the probability density function for a standardized normal variable. For example, as this diagram shows, Φ(0) =.399: Marginal Effects for Continuous Variables Page 9

10 Example:. use clear. probit grade gpa tuce psi, nolog Probit estimates Number of obs = 32 LR chi2(3) = Prob > chi2 = Log likelihood = Pseudo R2 = grade Coef. Std. Err. z P> z [95% Conf. Interval] gpa tuce psi _cons mfx Marginal effects after probit y = Pr(grade) (predict) = variable dy/dx Std. Err. z P> z [ 95% C.I. ] X gpa tuce psi* (*) dy/dx is for discrete change of dummy variable from 0 to 1. * marginal change for GPA. The invnorm function gives us the Z-score for the stated. * prob of success. The normalden function gives us the pdf value for that Z-score.. display invnorm(.2658) display normalden(invnorm(.2658)) display normalden(invnorm(.2658)) * Marginal Effect for GPA = Φ(XB) * b k = * =.5333 The following table again shows you that, in a probit model, as the distance between two points gets smaller and smaller, i.e. as Δ gets closer and closer to 0, the slope of the line connecting the points gets closer and closer to the marginal effect. Marginal Effects for Continuous Variables Page 10

11 Probit F(X,GPA) = P(Y=1 X, GPA) GPA= Other X's at Mean F(X, ) = δ GPA+δ F(X,GPA+δ) Change in F(X,GPA) Change in GPA Slope E Using the margins command for MEMs & AMEs,. quietly probit grade gpa tuce i.psi, nolog. margins, dydx(*) atmeans Conditional marginal effects Number of obs = 32 Expression : Pr(grade), predict() dy/dx w.r.t. : gpa tuce 1.psi at : gpa = (mean) tuce = (mean) 0.psi =.5625 (mean) 1.psi =.4375 (mean) dy/dx Std. Err. z P> z [95% Conf. Interval] gpa tuce psi Note: dy/dx for factor levels is the discrete change from the base level.. margins, dydx(*) Average marginal effects Number of obs = 32 Expression : Pr(grade), predict() dy/dx w.r.t. : gpa tuce 1.psi dy/dx Std. Err. z P> z [95% Conf. Interval] gpa tuce psi Note: dy/dx for factor levels is the discrete change from the base level. As a sidelight, note that the marginal effects (both MEMs and AMEs) for probit are very similar to the marginal effects for logit. This is usually the case. Marginal Effects for Continuous Variables Page 11

12 OLS. Here is what you get when you compute the marginal effects for OLS:. use clear. reg income educ jobexp black Source SS df MS Number of obs = F( 3, 496) = Model Prob > F = Residual R-squared = Adj R-squared = Total Root MSE = income Coef. Std. Err. t P> t [95% Conf. Interval] educ jobexp black _cons margins, dydx(*) atmeans Conditional marginal effects Number of obs = 500 Model VCE : OLS Expression : Linear prediction, predict() dy/dx w.r.t. : educ jobexp black at : educ = (mean) jobexp = (mean) black =.2 (mean) dy/dx Std. Err. z P> z [95% Conf. Interval] educ jobexp black margins, dydx(*) Average marginal effects Number of obs = 500 Model VCE : OLS Expression : Linear prediction, predict() dy/dx w.r.t. : educ jobexp black dy/dx Std. Err. z P> z [95% Conf. Interval] educ jobexp black The marginal effects are the same as the slope coefficients. This is because relationships are linear in OLS regression and do not vary depending on the values of the other variables. Marginal Effects for Continuous Variables Page 12

### Failure to take the sampling scheme into account can lead to inaccurate point estimates and/or flawed estimates of the standard errors.

Analyzing Complex Survey Data: Some key issues to be aware of Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 24, 2015 Rather than repeat material that is

### MODEL I: DRINK REGRESSED ON GPA & MALE, WITHOUT CENTERING

Interpreting Interaction Effects; Interaction Effects and Centering Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 20, 2015 Models with interaction effects

### Interaction effects between continuous variables (Optional)

Interaction effects between continuous variables (Optional) Richard Williams, University of Notre Dame, http://www.nd.edu/~rwilliam/ Last revised February 0, 05 This is a very brief overview of this somewhat

### How Do We Test Multiple Regression Coefficients?

How Do We Test Multiple Regression Coefficients? Suppose you have constructed a multiple linear regression model and you have a specific hypothesis to test which involves more than one regression coefficient.

### Interaction effects and group comparisons Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 20, 2015

Interaction effects and group comparisons Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 20, 2015 Note: This handout assumes you understand factor variables,

### Using Stata 11 & higher for Logistic Regression Richard Williams, University of Notre Dame, Last revised March 28, 2015

Using Stata 11 & higher for Logistic Regression Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised March 28, 2015 NOTE: The routines spost13, lrdrop1, and extremes are

### ESTIMATING AVERAGE TREATMENT EFFECTS: IV AND CONTROL FUNCTIONS, II Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

ESTIMATING AVERAGE TREATMENT EFFECTS: IV AND CONTROL FUNCTIONS, II Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Quantile Treatment Effects 2. Control Functions

### From this it is not clear what sort of variable that insure is so list the first 10 observations.

MNL in Stata We have data on the type of health insurance available to 616 psychologically depressed subjects in the United States (Tarlov et al. 1989, JAMA; Wells et al. 1989, JAMA). The insurance is

### MULTIPLE REGRESSION EXAMPLE

MULTIPLE REGRESSION EXAMPLE For a sample of n = 166 college students, the following variables were measured: Y = height X 1 = mother s height ( momheight ) X 2 = father s height ( dadheight ) X 3 = 1 if

### Econ 371 Problem Set #3 Answer Sheet

Econ 371 Problem Set #3 Answer Sheet 4.1 In this question, you are told that a OLS regression analysis of third grade test scores as a function of class size yields the following estimated model. T estscore

### Statistics 104 Final Project A Culture of Debt: A Study of Credit Card Spending in America TF: Kevin Rader Anonymous Students: LD, MH, IW, MY

Statistics 104 Final Project A Culture of Debt: A Study of Credit Card Spending in America TF: Kevin Rader Anonymous Students: LD, MH, IW, MY ABSTRACT: This project attempted to determine the relationship

### REGRESSION LINES IN STATA

REGRESSION LINES IN STATA THOMAS ELLIOTT 1. Introduction to Regression Regression analysis is about eploring linear relationships between a dependent variable and one or more independent variables. Regression

### Discussion Section 4 ECON 139/239 2010 Summer Term II

Discussion Section 4 ECON 139/239 2010 Summer Term II 1. Let s use the CollegeDistance.csv data again. (a) An education advocacy group argues that, on average, a person s educational attainment would increase

### Multicollinearity Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 13, 2015

Multicollinearity Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 13, 2015 Stata Example (See appendices for full example).. use http://www.nd.edu/~rwilliam/stats2/statafiles/multicoll.dta,

### Using Stata 9 & Higher for OLS Regression Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 8, 2015

Using Stata 9 & Higher for OLS Regression Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 8, 2015 Introduction. This handout shows you how Stata can be used

### Introduction to Stata

Introduction to Stata September 23, 2014 Stata is one of a few statistical analysis programs that social scientists use. Stata is in the mid-range of how easy it is to use. Other options include SPSS,

### GETTING STARTED: STATA & R BASIC COMMANDS ECONOMETRICS II. Stata Output Regression of wages on education

GETTING STARTED: STATA & R BASIC COMMANDS ECONOMETRICS II Stata Output Regression of wages on education. sum wage educ Variable Obs Mean Std. Dev. Min Max -------------+--------------------------------------------------------

### Regression in Stata. Alicia Doyle Lynch Harvard-MIT Data Center (HMDC)

Regression in Stata Alicia Doyle Lynch Harvard-MIT Data Center (HMDC) Documents for Today Find class materials at: http://libraries.mit.edu/guides/subjects/data/ training/workshops.html Several formats

### August 2012 EXAMINATIONS Solution Part I

August 01 EXAMINATIONS Solution Part I (1) In a random sample of 600 eligible voters, the probability that less than 38% will be in favour of this policy is closest to (B) () In a large random sample,

### HURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009

HURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Introduction 2. A General Formulation 3. Truncated Normal Hurdle Model 4. Lognormal

### Lecture 16: Logistic regression diagnostics, splines and interactions. Sandy Eckel 19 May 2007

Lecture 16: Logistic regression diagnostics, splines and interactions Sandy Eckel seckel@jhsph.edu 19 May 2007 1 Logistic Regression Diagnostics Graphs to check assumptions Recall: Graphing was used to

### Correlation and Regression

Correlation and Regression Scatterplots Correlation Explanatory and response variables Simple linear regression General Principles of Data Analysis First plot the data, then add numerical summaries Look

### Institut für Soziologie Eberhard Karls Universität Tübingen www.maartenbuis.nl

from Indirect Extracting from Institut für Soziologie Eberhard Karls Universität Tübingen www.maartenbuis.nl from Indirect What is the effect of x on y? Which effect do I choose: average marginal or marginal

### Nonlinear relationships Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 20, 2015

Nonlinear relationships Richard Williams, University of Notre Dame, http://www.nd.edu/~rwilliam/ Last revised February, 5 Sources: Berry & Feldman s Multiple Regression in Practice 985; Pindyck and Rubinfeld

### Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software

STATA Tutorial Professor Erdinç Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software 1.Wald Test Wald Test is used

### Handling missing data in Stata a whirlwind tour

Handling missing data in Stata a whirlwind tour 2012 Italian Stata Users Group Meeting Jonathan Bartlett www.missingdata.org.uk 20th September 2012 1/55 Outline The problem of missing data and a principled

### From the help desk: hurdle models

The Stata Journal (2003) 3, Number 2, pp. 178 184 From the help desk: hurdle models Allen McDowell Stata Corporation Abstract. This article demonstrates that, although there is no command in Stata for

### Econ 371 Problem Set #3 Answer Sheet

Econ 371 Problem Set #3 Answer Sheet 4.3 In this question, you are told that a OLS regression analysis of average weekly earnings yields the following estimated model. AW E = 696.7 + 9.6 Age, R 2 = 0.023,

### Heteroskedasticity Richard Williams, University of Notre Dame, Last revised January 30, 2015

Heteroskedasticity Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 30, 2015 These notes draw heavily from Berry and Feldman, and, to a lesser extent, Allison,

### STATA FUNDAMENTALS FOR MIDDLEBURY COLLEGE ECONOMICS STUDENTS

STATA FUNDAMENTALS FOR MIDDLEBURY COLLEGE ECONOMICS STUDENTS BY EMILY FORREST AUGUST 2008 CONTENTS INTRODUCTION STATA SYNTAX DATASET FILES OPENING A DATASET FROM EXCEL TO STATA WORKING WITH LARGE DATASETS

### Regression Analysis. Data Calculations Output

Regression Analysis In an attempt to find answers to questions such as those posed above, empirical labour economists use a useful tool called regression analysis. Regression analysis is essentially a

### IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results

IAPRI Quantitative Analysis Capacity Building Series Multiple regression analysis & interpreting results How important is R-squared? R-squared Published in Agricultural Economics 0.45 Best article of the

### Module 14: Missing Data Stata Practical

Module 14: Missing Data Stata Practical Jonathan Bartlett & James Carpenter London School of Hygiene & Tropical Medicine www.missingdata.org.uk Supported by ESRC grant RES 189-25-0103 and MRC grant G0900724

### Generalized Linear Models

Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the

### Lecture 10: Logistical Regression II Multinomial Data. Prof. Sharyn O Halloran Sustainable Development U9611 Econometrics II

Lecture 10: Logistical Regression II Multinomial Data Prof. Sharyn O Halloran Sustainable Development U9611 Econometrics II Logit vs. Probit Review Use with a dichotomous dependent variable Need a link

### Lecture 15. Endogeneity & Instrumental Variable Estimation

Lecture 15. Endogeneity & Instrumental Variable Estimation Saw that measurement error (on right hand side) means that OLS will be biased (biased toward zero) Potential solution to endogeneity instrumental

### Statistical Modelling in Stata 5: Linear Models

Statistical Modelling in Stata 5: Linear Models Mark Lunt Arthritis Research UK Centre for Excellence in Epidemiology University of Manchester 08/11/2016 Structure This Week What is a linear model? How

### Standard errors of marginal effects in the heteroskedastic probit model

Standard errors of marginal effects in the heteroskedastic probit model Thomas Cornelißen Discussion Paper No. 320 August 2005 ISSN: 0949 9962 Abstract In non-linear regression models, such as the heteroskedastic

### Lecture 13. Use and Interpretation of Dummy Variables. Stop worrying for 1 lecture and learn to appreciate the uses that dummy variables can be put to

Lecture 13. Use and Interpretation of Dummy Variables Stop worrying for 1 lecture and learn to appreciate the uses that dummy variables can be put to Using dummy variables to measure average differences

### Department of Economics Session 2012/2013. EC352 Econometric Methods. Solutions to Exercises from Week 10 + 0.0077 (0.052)

Department of Economics Session 2012/2013 University of Essex Spring Term Dr Gordon Kemp EC352 Econometric Methods Solutions to Exercises from Week 10 1 Problem 13.7 This exercise refers back to Equation

### is paramount in advancing any economy. For developed countries such as

Introduction The provision of appropriate incentives to attract workers to the health industry is paramount in advancing any economy. For developed countries such as Australia, the increasing demand for

### Quick Stata Guide by Liz Foster

by Liz Foster Table of Contents Part 1: 1 describe 1 generate 1 regress 3 scatter 4 sort 5 summarize 5 table 6 tabulate 8 test 10 ttest 11 Part 2: Prefixes and Notes 14 by var: 14 capture 14 use of the

### Department of Economics, Session 2012/2013. EC352 Econometric Methods. Exercises from Week 03

Department of Economics, Session 01/013 University of Essex, Autumn Term Dr Gordon Kemp EC35 Econometric Methods Exercises from Week 03 1 Problem P3.11 The following equation describes the median housing

### Nonlinear Regression Functions. SW Ch 8 1/54/

Nonlinear Regression Functions SW Ch 8 1/54/ The TestScore STR relation looks linear (maybe) SW Ch 8 2/54/ But the TestScore Income relation looks nonlinear... SW Ch 8 3/54/ Nonlinear Regression General

### C2.1. (i) (5 marks) The average participation rate is , the average match rate is summ prate

BOSTON COLLEGE Department of Economics EC 228 01 Econometric Methods Fall 2008, Prof. Baum, Ms. Phillips (tutor), Mr. Dmitriev (grader) Problem Set 2 Due at classtime, Thursday 2 Oct 2008 2.4 (i)(5 marks)

### 25 Working with categorical data and factor variables

25 Working with categorical data and factor variables Contents 25.1 Continuous, categorical, and indicator variables 25.1.1 Converting continuous variables to indicator variables 25.1.2 Converting continuous

### Title. Syntax. stata.com. fp Fractional polynomial regression. Estimation

Title stata.com fp Fractional polynomial regression Syntax Menu Description Options for fp Options for fp generate Remarks and examples Stored results Methods and formulas Acknowledgment References Also

### Lectures 8, 9 & 10. Multiple Regression Analysis

Lectures 8, 9 & 0. Multiple Regression Analysis In which you learn how to apply the principles and tests outlined in earlier lectures to more realistic models involving more than explanatory variable and

### 11. Analysis of Case-control Studies Logistic Regression

Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

### The Chi-Square Diagnostic Test for Count Data Models

The Chi-Square Diagnostic Test for Count Data Models M. Manjón-Antoĺın and O. Martínez-Ibañez QURE-CREIP Department of Economics, Rovira i Virgili University. 2012 Spanish Stata Users Group Meeting (Universitat

### Data Analysis Methodology 1

Data Analysis Methodology 1 Suppose you inherited the database in Table 1.1 and needed to find out what could be learned from it fast. Say your boss entered your office and said, Here s some software project

### Econ 371 Problem Set #4 Answer Sheet. P rice = (0.485)BDR + (23.4)Bath + (0.156)Hsize + (0.002)LSize + (0.090)Age (48.

Econ 371 Problem Set #4 Answer Sheet 6.5 This question focuses on what s called a hedonic regression model; i.e., where the sales price of the home is regressed on the various attributes of the home. The

### ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2

University of California, Berkeley Prof. Ken Chay Department of Economics Fall Semester, 005 ECON 14 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE # Question 1: a. Below are the scatter plots of hourly wages

### ECON Introductory Econometrics Seminar 9

ECON4150 - Introductory Econometrics Seminar 9 Stock and Watson EE13.1 April 28, 2015 Stock and Watson EE13.1 ECON4150 - Introductory Econometrics Seminar 9 April 28, 2015 1 / 15 Empirical exercise E13.1:

### 1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

### Linear Regression with One Regressor

Linear Regression with One Regressor Michael Ash Lecture 10 Analogy to the Mean True parameter µ Y β 0 and β 1 Meaning Central tendency Intercept and slope E(Y ) E(Y X ) = β 0 + β 1 X Data Y i (X i, Y

### Stata Walkthrough 4: Regression, Prediction, and Forecasting

Stata Walkthrough 4: Regression, Prediction, and Forecasting Over drinks the other evening, my neighbor told me about his 25-year-old nephew, who is dating a 35-year-old woman. God, I can t see them getting

### How to set the main menu of STATA to default factory settings standards

University of Pretoria Data analysis for evaluation studies Examples in STATA version 11 List of data sets b1.dta (To be created by students in class) fp1.xls (To be provided to students) fp1.txt (To be

### Inference for Regression

Simple Linear Regression Inference for Regression The simple linear regression model Estimating regression parameters; Confidence intervals and significance tests for regression parameters Inference about

### Addressing Alternative. Multiple Regression. 17.871 Spring 2012

Addressing Alternative Explanations: Multiple Regression 17.871 Spring 2012 1 Did Clinton hurt Gore example Did Clinton hurt Gore in the 2000 election? Treatment is not liking Bill Clinton 2 Bivariate

### Quantitative Methods for Economics Tutorial 9. Katherine Eyal

Quantitative Methods for Economics Tutorial 9 Katherine Eyal TUTORIAL 9 4 October 2010 ECO3021S Part A: Problems 1. In Problem 2 of Tutorial 7, we estimated the equation ŝleep = 3, 638.25 0.148 totwrk

### especially with continuous

Handling interactions in Stata, especially with continuous predictors Patrick Royston & Willi Sauerbrei German Stata Users meeting, Berlin, 1 June 2012 Interactions general concepts General idea of a (two-way)

### In Chapter 2, we used linear regression to describe linear relationships. The setting for this is a

Math 143 Inference on Regression 1 Review of Linear Regression In Chapter 2, we used linear regression to describe linear relationships. The setting for this is a bivariate data set (i.e., a list of cases/subjects

### Chapter 18. Effect modification and interactions. 18.1 Modeling effect modification

Chapter 18 Effect modification and interactions 18.1 Modeling effect modification weight 40 50 60 70 80 90 100 male female 40 50 60 70 80 90 100 male female 30 40 50 70 dose 30 40 50 70 dose Figure 18.1:

### This section focuses on Chow Test and leaves general discussion on dummy variable models to other section.

Jeeshim and KUCC65 (3//008) Statistical Inferences in Linear Regression: 7 4. Tests of Structural Changes This section focuses on Chow Test and leaves general discussion on dummy variable models to other

### College Education Matters for Happier Marriages and Higher Salaries ----Evidence from State Level Data in the US

College Education Matters for Happier Marriages and Higher Salaries ----Evidence from State Level Data in the US Anonymous Authors: SH, AL, YM Contact TF: Kevin Rader Abstract It is a general consensus

### Regression Analysis: A Complete Example

Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

### xtmixed & denominator degrees of freedom: myth or magic

xtmixed & denominator degrees of freedom: myth or magic 2011 Chicago Stata Conference Phil Ender UCLA Statistical Consulting Group July 2011 Phil Ender xtmixed & denominator degrees of freedom: myth or

### Proportions as dependent variable

Proportions as dependent variable Maarten L. Buis Vrije Universiteit Amsterdam Department of Social Research Methodology http://home.fsw.vu.nl/m.buis Proportions as dependent variable p. 1/42 Outline Problems

### Multinomial and Ordinal Logistic Regression

Multinomial and Ordinal Logistic Regression ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,

### 10 Dichotomous or binary responses

10 Dichotomous or binary responses 10.1 Introduction Dichotomous or binary responses are widespread. Examples include being dead or alive, agreeing or disagreeing with a statement, and succeeding or failing

### Forecasting in STATA: Tools and Tricks

Forecasting in STATA: Tools and Tricks Introduction This manual is intended to be a reference guide for time series forecasting in STATA. It will be updated periodically during the semester, and will be

### Lab 5 Linear Regression with Within-subject Correlation. Goals: Data: Use the pig data which is in wide format:

Lab 5 Linear Regression with Within-subject Correlation Goals: Data: Fit linear regression models that account for within-subject correlation using Stata. Compare weighted least square, GEE, and random

### A Simple Feasible Alternative Procedure to Estimate Models with High-Dimensional Fixed Effects

DISCUSSION PAPER SERIES IZA DP No. 3935 A Simple Feasible Alternative Procedure to Estimate Models with High-Dimensional Fixed Effects Paulo Guimarães Pedro Portugal January 2009 Forschungsinstitut zur

### Using Stata for Categorical Data Analysis

Using Stata for Categorical Data Analysis NOTE: These problems make extensive use of Nick Cox s tab_chi, which is actually a collection of routines, and Adrian Mander s ipf command. From within Stata,

### Using Minitab for Regression Analysis: An extended example

Using Minitab for Regression Analysis: An extended example The following example uses data from another text on fertilizer application and crop yield, and is intended to show how Minitab can be used to

### Econometrics II. Lecture 9: Sample Selection Bias

Econometrics II Lecture 9: Sample Selection Bias Måns Söderbom 5 May 2011 Department of Economics, University of Gothenburg. Email: mans.soderbom@economics.gu.se. Web: www.economics.gu.se/soderbom, www.soderbom.net.

### DETERMINANTS OF CAPITAL ADEQUACY RATIO IN SELECTED BOSNIAN BANKS

DETERMINANTS OF CAPITAL ADEQUACY RATIO IN SELECTED BOSNIAN BANKS Nađa DRECA International University of Sarajevo nadja.dreca@students.ius.edu.ba Abstract The analysis of a data set of observation for 10

### Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

### Missing Data Part 1: Overview, Traditional Methods Page 1

Missing Data Part 1: Overview, Traditional Methods Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 17, 2015 This discussion borrows heavily from: Applied

### Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and

### ECON Introductory Econometrics. Lecture 17: Experiments

ECON4150 - Introductory Econometrics Lecture 17: Experiments Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 13 Lecture outline 2 Why study experiments? The potential outcome framework.

### Survey Data Analysis in Stata

Survey Data Analysis in Stata Jeff Pitblado Associate Director, Statistical Software StataCorp LP 2009 Canadian Stata Users Group Meeting Outline 1 Types of data 2 2 Survey data characteristics 4 2.1 Single

### Basic Statistical and Modeling Procedures Using SAS

Basic Statistical and Modeling Procedures Using SAS One-Sample Tests The statistical procedures illustrated in this handout use two datasets. The first, Pulse, has information collected in a classroom

### III. INTRODUCTION TO LOGISTIC REGRESSION. a) Example: APACHE II Score and Mortality in Sepsis

III. INTRODUCTION TO LOGISTIC REGRESSION 1. Simple Logistic Regression a) Example: APACHE II Score and Mortality in Sepsis The following figure shows 30 day mortality in a sample of septic patients as

### BRIEF OVERVIEW ON INTERPRETING COUNT MODEL RISK RATIOS

BRIEF OVERVIEW ON INTERPRETING COUNT MODEL RISK RATIOS An Addendum to Negative Binomial Regression Cambridge University Press (2007) Joseph M. Hilbe 2008, All Rights Reserved This short monograph is intended

### We extended the additive model in two variables to the interaction model by adding a third term to the equation.

Quadratic Models We extended the additive model in two variables to the interaction model by adding a third term to the equation. Similarly, we can extend the linear model in one variable to the quadratic

### Regression III: Dummy Variable Regression

Regression III: Dummy Variable Regression Tom Ilvento FREC 408 Linear Regression Assumptions about the error term Mean of Probability Distribution of the Error term is zero Probability Distribution of

### XPost: Excel Workbooks for the Post-estimation Interpretation of Regression Models for Categorical Dependent Variables

XPost: Excel Workbooks for the Post-estimation Interpretation of Regression Models for Categorical Dependent Variables Contents Simon Cheng hscheng@indiana.edu php.indiana.edu/~hscheng/ J. Scott Long jslong@indiana.edu

### Title. Syntax. nlcom Nonlinear combinations of estimators. Nonlinear combination of estimators one expression. nlcom [ name: ] exp [, options ]

Title nlcom Nonlinear combinations of estimators Syntax Nonlinear combination of estimators one expression nlcom [ name: ] exp [, options ] Nonlinear combinations of estimators more than one expression

### Lecture 16. Endogeneity & Instrumental Variable Estimation (continued)

Lecture 16. Endogeneity & Instrumental Variable Estimation (continued) Seen how endogeneity, Cov(x,u) 0, can be caused by Omitting (relevant) variables from the model Measurement Error in a right hand

### Milk Data Analysis. 1. Objective Introduction to SAS PROC MIXED Analyzing protein milk data using STATA Refit protein milk data using PROC MIXED

1. Objective Introduction to SAS PROC MIXED Analyzing protein milk data using STATA Refit protein milk data using PROC MIXED 2. Introduction to SAS PROC MIXED The MIXED procedure provides you with flexibility

### Analyses on Hurricane Archival Data June 17, 2014

Analyses on Hurricane Archival Data June 17, 2014 This report provides detailed information about analyses of archival data in our PNAS article http://www.pnas.org/content/early/2014/05/29/1402786111.abstract

### Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

### Examples of Multiple Linear Regression Models

ECON *: Examples of Multple Regresson Models Examples of Multple Lnear Regresson Models Data: Stata tutoral data set n text fle autoraw or autotxt Sample data: A cross-sectonal sample of 7 cars sold n

### I n d i a n a U n i v e r s i t y U n i v e r s i t y I n f o r m a t i o n T e c h n o l o g y S e r v i c e s

I n d i a n a U n i v e r s i t y U n i v e r s i t y I n f o r m a t i o n T e c h n o l o g y S e r v i c e s Linear Regression Models for Panel Data Using SAS, Stata, LIMDEP, and SPSS * Hun Myoung Park,

### Measurement Error 2: Scale Construction (Very Brief Overview) Page 1

Measurement Error 2: Scale Construction (Very Brief Overview) Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 22, 2015 This handout draws heavily from Marija