Interpreting Interaction Effects; Interaction Effects and Centering
Richard Williams, University of Notre Dame
Last revised February 20, 2015

Models with interaction effects can be a little confusing to understand. This handout provides further discussion of how interaction terms should be interpreted and how centering continuous IVs (i.e. subtracting the mean from each case so the new mean is zero) doesn't actually change what a model means but can make results more interpretable.

Interaction Effects Without Centering. This problem is modified from Hamilton's Statistics with Stata 5 and uses data from a survey of undergraduate students collected by Ward and Ault (1990).

* DRINK is measured on a 33 point scale, where higher values indicate higher levels of drinking. In the sample the mean of Drink is about 19 and the observed scores range between 4 and 33.
* GPA is the student's Grade Point Average (higher values indicate better grades). The average gpa is about 2.81. The range of gpa theoretically goes from 0 to 4, but no one in the sample actually has a gpa of 0.
* MALE is coded 1 if the student is male, 0 if female.
* MALEGPA = MALE * GPA.

Here are the descriptive statistics:

. use clear
(Student survey (Ward 1990))

. sum male drink gpa malegpa

[Summary table (Variable, Obs, Mean, Std. Dev., Min, Max) for male, drink, gpa, malegpa; values not preserved.]

First, we regress drink on gpa and male.

MODEL I: DRINK REGRESSED ON GPA & MALE, WITHOUT CENTERING

. regress drink gpa i.male

[Regression table for drink with rows gpa, male, _cons; F(2, 215); values not preserved.]

The model does not allow for the effects of GPA to differ by gender, but it does allow for a difference in the intercepts. Interpreting each of the regression coefficients,

* The constant term of 26.9 is the predicted drinking score for a female with a 0 gpa. No woman in the sample actually has a gpa this low. So, you can interpret this as the depths to which a woman would plunge if she was doing that badly.
* For both men and women, each one unit increase in gpa results, on average, in a decrease in the drinking score. That is, those with higher gpas tend to drink less.
* On average, men score 3.54 points higher on the drinking scale than do women with the same GPAs.

As the following graph shows, the lines for men and women are parallel but the intercepts are different. Hence, with Model I, regardless of GPA, the predicted difference between a man and a woman with the same gpa is 3.54.

Here is a visual presentation of the results. [NOTE: The scheme(sj) option creates graphs that are formatted for publication in The Stata Journal and that are good for black and white printing.]

. quietly margins male, at(gpa=(0(.5)4))
. marginsplot, scheme(sj) noci ytitle() name(intonly)

  Variables that uniquely identify margins: gpa male

[Figure: Adjusted Predictions of male. x-axis: Grade Point Average; lines: Female, Male.]

Now see what happens once we add the interaction term.
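Before moving on, the parallel-lines property of Model I can be written out as an equation. The symbols b0, b1, and b2 (constant, gpa slope, male coefficient) are notation introduced here for illustration, not the handout's own; only the values 26.9 and 3.54 come from the text above.

```latex
\begin{align*}
\widehat{\text{drink}} &= b_0 + b_1\,\text{gpa} + b_2\,\text{male},
  \qquad b_0 = 26.9,\; b_2 = 3.54 \\
\text{women } (\text{male}=0):\quad \widehat{\text{drink}} &= b_0 + b_1\,\text{gpa} \\
\text{men } (\text{male}=1):\quad \widehat{\text{drink}} &= (b_0 + b_2) + b_1\,\text{gpa} \\
\text{gap at any gpa}:\quad
  (b_0 + b_2 + b_1\,\text{gpa}) - (b_0 + b_1\,\text{gpa}) &= b_2 = 3.54
\end{align*}
```

Because gpa cancels out of the last line, the male/female gap cannot depend on gpa in this model; that is exactly the restriction the interaction term relaxes.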

MODEL II: DRINK REGRESSED ON GPA, MALE, MALEGPA, WITHOUT CENTERING

. regress drink gpa male i.male#c.gpa

[Regression table for drink with rows gpa, male, male#c.gpa, _cons; F(3, 214); values not preserved.]

. quietly margins male, at(gpa=(0(.5)4))
. marginsplot, scheme(sj) noci ytitle() name(intgpa)

  Variables that uniquely identify margins: gpa male

[Figure: Predictive Margins of male. x-axis: Grade Point Average; lines: Female, Male.]

For convenience, we'll ignore the fact that the effect of malegpa is insignificant (otherwise I'd have to scrounge around for another example). Note that

* The effects of gpa and malegpa show you that the effect of gpa is greater in magnitude for women than for men, i.e. higher gpas reduce the drinking of women more than they reduce the drinking of men. Hence, the male/female lines are no longer parallel. As a result, the difference between a man and a woman with the same gpa depends on what the gpa is. The higher the gpa, the greater the expected difference between a man and a woman is.
* The intercept is still the predicted drinking score for the non-existent lazy or idiotic woman with a gpa of 0. This number is actually slightly higher than it was in Model I, which reflects the fact that the estimated effect of gpa on women is now greater since it is no longer being diluted by the weaker effect that gpa has on men.
* The coefficient for male in Model II is much smaller than it was in Model I (3.54). But, this is because it now has a different meaning. Before we added interaction effects, the male/female lines were parallel, and the predicted difference between a man and a woman with the same gpa was always 3.54 regardless of what the gpa actually was. Now, however, the coefficient for male is the predicted difference between a man and a woman who both have a 0 gpa. Since no such people exist, this isn't particularly interesting. I guess you could say that a man and a woman who were doing so poorly would both hit the bottle about as much. For a man and a woman who both have average gpas of about 2.81, the predicted difference is still about 3.5. (You can compute this from the Model II coefficients.) For a man and a woman with perfect gpas, the guy is predicted to score about 5 points higher on the drinking scale.
* Also, note that the coefficient for male in Model II is not significant, whereas it was in Model I. But again, this reflects the fact that the coefficient has a different meaning now. In Model II, the coefficient for Male tests whether a man and woman who both have 0 gpas significantly differ in their drinking. The results show that they don't. But, at higher levels of gpa, the difference between men and women may be significant. In fact, we'll show below that it is.
* The implication is that, once you add interaction effects, the main effects may or may not be particularly interesting, at least as they stand, and you should be careful in how you interpret them. For example, it would be wrong in this case to attach some profound meaning to the change in the effect of Male; the change just reflects the fact that the Male coefficient has different meanings in the two models. Likewise, the fact that Male becomes insignificant is not particularly interesting, because it is only testing the difference between men and women at a specific point, when gpa = 0. Once interaction terms are added, you are primarily interested in their significance, rather than the significance of the terms used to compute them.

Interaction Effects with Centering. If you want results that are a little more meaningful and easy to interpret, one approach is to center continuous IVs first (i.e. subtract the mean from each case), and then compute the interaction term and estimate the model. (Only center continuous variables though, i.e. you don't want to center categorical dummy variables like gender. Also, you only center IVs, not DVs.) Once we center GPA, a score of 0 on gpacentered means the person has average grades, i.e. a gpa of about 2.81. In SPSS, you would run descriptive statistics to determine the means of variables. In Stata, centering is more easily accomplished.

. sum gpa, meanonly
. gen gpacentered = gpa - r(mean)
(25 missing values generated)
. label variable gpacentered "Grade Point Average Centered"

First, we'll estimate the model without the interaction term.
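One practical wrinkle with the centering commands above, offered as a sketch and not as part of the original handout: sum gpa, meanonly uses every case with a non-missing gpa, which is not necessarily the same set of cases the regression uses after listwise deletion on drink, gpa, and male. In this dataset the two samples may well coincide; the pattern below matters when they do not. It assumes the handout's data are already in memory, and gpac_est is a hypothetical variable name.

```stata
* Sketch only: assumes the handout's dataset (drink, gpa, male) is in memory.
* gpac_est is a hypothetical variable name used for illustration.

* Compute the mean of gpa over cases with no missing values on any
* variable that enters the regression
summarize gpa if !missing(drink, gpa, male), meanonly

* Center gpa on that mean, for the same cases only
generate gpac_est = gpa - r(mean) if !missing(drink, gpa, male)
label variable gpac_est "GPA centered on estimation-sample mean"

* Check: the mean of the centered variable should be 0 up to rounding
summarize gpac_est
```

Centering on the estimation sample makes "0 on the centered variable" mean "average among the students actually analyzed", which keeps the interpretation of the intercept exact instead of approximate.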

MODEL III: DRINK REGRESSED ON GPA & MALE, WITH CENTERING

. regress drink gpacentered i.male

[Regression table for drink with rows gpacentered, male, _cons; F(2, 215); values not preserved.]

. quietly margins male, at(gpacentered=(-3(.5)1.5))
. marginsplot, scheme(sj) noci ytitle() name(intonlycntr)

  Variables that uniquely identify margins: gpacentered male

[Figure: Adjusted Predictions of male. x-axis: Grade Point Average Centered; lines: Female, Male.]

Note that everything is pretty much the same as before we centered (in Model I), except the intercept has changed. In Model I, the intercept of 26.9 was the predicted score of the nonexistent destitute woman who was failing everything (no wonder she drinks so much). In Model III with gpa centered, the intercept (17.215) is the predicted drinking score of a woman with average grades. A score of 0 on gpa corresponds to a score of about -2.81 on gpacentered, so it is still the case that a woman with 0 gpa would have a predicted drinking score of 26.9. Hence, centering doesn't change what the model predicts, but it changes the interpretation of the intercept.

Now, we'll see what happens when we add the interaction.
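Before doing so, the link between the Model I and Model III intercepts can be checked by hand. The notation (b0, b1, b2) is introduced here for illustration; the value of the gpa slope b1 is not shown above, so it is left symbolic, and 2.81 is the approximate sample mean quoted in the text.

```latex
\begin{align*}
\text{gpa} &= \text{gpacentered} + \overline{\text{gpa}},
  \qquad \overline{\text{gpa}} \approx 2.81 \\
\widehat{\text{drink}} &= b_0 + b_1\,\text{gpa} + b_2\,\text{male} \\
  &= \underbrace{(b_0 + b_1\,\overline{\text{gpa}})}_{\text{Model III intercept}}
     + b_1\,\text{gpacentered} + b_2\,\text{male} \\
17.215 &\approx 26.9 + 2.81\,b_1
\end{align*}
```

Solving the last line gives a slope of roughly -3.4, negative as the text describes. Treat that as a back-of-envelope figure implied by the rounded numbers quoted here, not as a value read from the regression output.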

MODEL IV: DRINK REGRESSED ON GPA, MALE, MALEGPA, WITH CENTERING

. regress drink gpacentered i.male i.male#c.gpacentered

[Regression table for drink with rows gpacentered, male, male#c.gpacentered, _cons; F(3, 214); values not preserved.]

. quietly margins male, at(gpacentered=(-3(.5)1.5))
. marginsplot, scheme(sj) noci ytitle() name(intgpacntr)

  Variables that uniquely identify margins: gpacentered male

[Figure: Adjusted Predictions of male. x-axis: Grade Point Average Centered; lines: Female, Male.]

Note that

* Except for Male and the Constant, the various model terms in Model IV are the same as before we centered in Model II. Likewise, the plot is the same, except everything has been shifted to the left because we centered gpa. (If we used the uncentered GPA, the plots would be identical.) Centering does not change the substantive meaning of the model or the predictions that are made; but it may make the results more easily interpretable.
* The intercept in Model IV, 17.26, now reflects the average drinking score for a woman with an average gpa, rather than the predicted score for the non-existent drunkard who has failed everything. Since such a person (or somebody close to her) actually does exist, the intercept is more meaningful than it was in Model II.

* The coefficient for male (3.55) is now the average difference between a male with an average gpa and a female with an average gpa. This is probably more meaningful than looking at the difference between the nonexistent man and woman who are flunking everything.
* With gpa centered, adding the interaction term produces much less change in the estimated effect of male between Models III and IV than it did when gpa was not centered (Model I versus Model II). At least in this case, this is because the predicted difference between the average man and woman is about the same regardless of whether the model includes interaction terms or not, whereas the predicted difference between a man and a woman who are failing everything changes quite a bit once you add the interaction term. Further, the Model IV difference in drinking between the average man and the average woman is statistically significant, even though the Model II difference between the 0 gpa man and woman is not.

Other Issues and Options to Be Aware of

* If you do center, be consistent throughout, i.e. different sample selections could produce different means, so comparing results produced by different centerings could be deceptive.
* You don't have to use the mean when centering; you could use any value that was of substantive interest.
  o For example, if you were particularly interested in comparing male and female C students to each other, you could subtract 2.0 from each gpa. Then, a score of 0 on gpacentered would correspond to a C gpa. The intercept would be the predicted drinking score for a C woman, and the male coefficient would be the predicted difference between C men and C women.
  o Or, in our Income/Education example, you could subtract 12 from education so that a score of 0 on centered education corresponded to a high school degree. You would then modify your interpretations accordingly, i.e. the main effect of Income would be the effect of Income for people who had a high school degree, the main effect of Education would still be the effect of education for a person with average income, and the intercept would be the predicted Y score for a person with average income and a high school degree.
  o Basically, the key is to have a score of 0 on the IV correspond to something that is substantively interesting, rather than have it be a value that could not (or at least does not) actually occur in the data.

Conclusions

* You don't have to center continuous IVs in a model with interaction terms. It won't actually change what the model means or what it predicts. But, centering continuous IVs and/or presenting plots may make your coefficients more interpretable.
* If you don't center, don't get hung up looking at changes in the main effects of the variables used to compute the interactions. These are to be expected, because the meaning of these terms changes once you add the interaction terms.
* Also, don't be concerned if the main effect of the dummy is insignificant once you've added the interaction; this just means that, when the IV = 0, the difference between groups is insignificant, but it may be significant when the IV does not = 0.
* Once interaction effects are added, the more critical thing is the significance of the interaction terms, not the terms that were used to compute the interactions. Whether you center or not, the interaction terms will stay the same.
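The claim that centering changes only the male coefficient and the constant, while leaving the gpa and interaction terms alone, follows from one line of algebra. The notation is introduced here for illustration: b0 through b3 are the uncentered (Model II) coefficients and m is whatever value is subtracted from gpa.

```latex
\begin{align*}
\widehat{\text{drink}} &= b_0 + b_1\,\text{gpa} + b_2\,\text{male}
   + b_3\,(\text{male}\times\text{gpa}),
   \qquad \text{gpa} = \text{gpacentered} + m \\
 &= (b_0 + b_1 m) + b_1\,\text{gpacentered}
   + (b_2 + b_3 m)\,\text{male}
   + b_3\,(\text{male}\times\text{gpacentered}) \\[4pt]
\text{male--female gap at a given gpa}
 &= b_2 + b_3\,\text{gpa} = (b_2 + b_3 m) + b_3\,\text{gpacentered}
\end{align*}
```

Read off the second line: the centered constant is b0 + b1 m, the centered male coefficient is b2 + b3 m (the gap at gpa = m), and b1 and b3 are untouched. Setting m to the mean gives the average-student interpretation; setting m = 2.0 gives the C-student version described above.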

Appendix: Marginal Effects and Confidence Intervals

Here is another approach that may be useful. Rather than plot separate lines for men and women, we can plot a single line that shows the difference between the predicted values for each gender. This is known as the marginal effect of gender. In an OLS regression analysis the Marginal Effect for a categorical variable shows how E(Y) changes as the categorical variable changes from 0 to 1, after controlling in some way for any other variables in the model. With a dichotomous independent variable, the ME is the difference in the adjusted predictions for the two groups, in this case men and women.

Also, I have been excluding confidence intervals in my graphs; but if you include them, you can easily see whether and when the differences in adjusted predictions are statistically significant, i.e. if the confidence interval for the marginal effect includes 0 the difference is not statistically significant, otherwise it is. Here is an example:

. quietly regress drink gpa male i.male#c.gpa
. quietly margins r.male, at(gpa=(0(.2)4))
. marginsplot, scheme(sj) ytitle() ///
>     yline(0) ylabel(#10) xlabel(#20) name(margeffect)

  Variables that uniquely identify margins: gpa

[Figure: Contrasts of Predictive Margins of male with 95% CIs. x-axis: Grade Point Average.]

As before, this shows us that the predicted difference between a man and a woman who both have a gpa of 0 is almost zero. For a man and a woman with average gpa (2.81) the predicted difference is about 3.5; and for a 4.0 gpa the predicted difference is around 5. The confidence intervals, however, reveal that the predicted differences are not statistically significant until gpa is about 2.2 or greater.
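If you want the numbers behind particular points on this plot, one option is lincom, sketched here on the assumption that the handout's data are already in memory. The model is written with c.gpa##i.male so that the coefficient names are 1.male and 1.male#c.gpa; this is intended as the same model as above in different notation, not as the author's own code.

```stata
* Sketch only: assumes the handout's dataset (drink, gpa, male) is in memory.
* Fit the interaction model using full factor-variable notation
quietly regress drink c.gpa##i.male

* Male-female difference = _b[1.male] + gpa * _b[1.male#c.gpa]

* Difference at gpa = 0 (this is just the male coefficient)
lincom 1.male

* Difference at about the mean gpa
lincom 1.male + 2.81*1.male#c.gpa

* Difference at a 4.0 gpa
lincom 1.male + 4*1.male#c.gpa

* The same contrasts via margins: effect of male at chosen gpa values
margins, dydx(male) at(gpa=(0 2.81 4))
```

Each lincom line reports the estimated difference with its standard error and confidence interval, so you can see directly whether the interval at a given gpa includes 0.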

Interaction effects between continuous variables (Optional)

Interaction effects between continuous variables (Optional) Richard Williams, University of Notre Dame, http://www.nd.edu/~rwilliam/ Last revised February 0, 05 This is a very brief overview of this somewhat

Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 21, 2015

Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 21, 2015 References: Long 1997, Long and Freese 2003 & 2006 & 2014,

Interaction effects and group comparisons Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 20, 2015

Interaction effects and group comparisons Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 20, 2015 Note: This handout assumes you understand factor variables,

REGRESSION LINES IN STATA

REGRESSION LINES IN STATA THOMAS ELLIOTT 1. Introduction to Regression Regression analysis is about eploring linear relationships between a dependent variable and one or more independent variables. Regression

Stata Walkthrough 4: Regression, Prediction, and Forecasting

Stata Walkthrough 4: Regression, Prediction, and Forecasting Over drinks the other evening, my neighbor told me about his 25-year-old nephew, who is dating a 35-year-old woman. God, I can t see them getting

MULTIPLE REGRESSION EXAMPLE

MULTIPLE REGRESSION EXAMPLE For a sample of n = 166 college students, the following variables were measured: Y = height X 1 = mother s height ( momheight ) X 2 = father s height ( dadheight ) X 3 = 1 if

Introduction to Stata

Introduction to Stata September 23, 2014 Stata is one of a few statistical analysis programs that social scientists use. Stata is in the mid-range of how easy it is to use. Other options include SPSS,

Failure to take the sampling scheme into account can lead to inaccurate point estimates and/or flawed estimates of the standard errors.

Analyzing Complex Survey Data: Some key issues to be aware of Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 24, 2015 Rather than repeat material that is

August 2012 EXAMINATIONS Solution Part I

August 01 EXAMINATIONS Solution Part I (1) In a random sample of 600 eligible voters, the probability that less than 38% will be in favour of this policy is closest to (B) () In a large random sample,

Correlation and Regression

Correlation and Regression Scatterplots Correlation Explanatory and response variables Simple linear regression General Principles of Data Analysis First plot the data, then add numerical summaries Look

Nonlinear relationships Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 20, 2015

Nonlinear relationships Richard Williams, University of Notre Dame, http://www.nd.edu/~rwilliam/ Last revised February, 5 Sources: Berry & Feldman s Multiple Regression in Practice 985; Pindyck and Rubinfeld

Multicollinearity Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 13, 2015

Multicollinearity Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 13, 2015 Stata Example (See appendices for full example).. use http://www.nd.edu/~rwilliam/stats2/statafiles/multicoll.dta,

Regression in Stata. Alicia Doyle Lynch Harvard-MIT Data Center (HMDC)

Regression in Stata Alicia Doyle Lynch Harvard-MIT Data Center (HMDC) Documents for Today Find class materials at: http://libraries.mit.edu/guides/subjects/data/ training/workshops.html Several formats

Regression Analysis. Data Calculations Output

Regression Analysis In an attempt to find answers to questions such as those posed above, empirical labour economists use a useful tool called regression analysis. Regression analysis is essentially a

Using Stata 9 & Higher for OLS Regression Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 8, 2015

Using Stata 9 & Higher for OLS Regression Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 8, 2015 Introduction. This handout shows you how Stata can be used

Statistics 104 Final Project A Culture of Debt: A Study of Credit Card Spending in America TF: Kevin Rader Anonymous Students: LD, MH, IW, MY

Statistics 104 Final Project A Culture of Debt: A Study of Credit Card Spending in America TF: Kevin Rader Anonymous Students: LD, MH, IW, MY ABSTRACT: This project attempted to determine the relationship

Using Stata 11 & higher for Logistic Regression Richard Williams, University of Notre Dame, Last revised March 28, 2015

Using Stata 11 & higher for Logistic Regression Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised March 28, 2015 NOTE: The routines spost13, lrdrop1, and extremes are

Department of Economics Session 2012/2013. EC352 Econometric Methods. Solutions to Exercises from Week 10 + 0.0077 (0.052)

Department of Economics Session 2012/2013 University of Essex Spring Term Dr Gordon Kemp EC352 Econometric Methods Solutions to Exercises from Week 10 1 Problem 13.7 This exercise refers back to Equation

IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results

IAPRI Quantitative Analysis Capacity Building Series Multiple regression analysis & interpreting results How important is R-squared? R-squared Published in Agricultural Economics 0.45 Best article of the

Main Effects and Interactions

Main Effects & Interactions page 1 Main Effects and Interactions So far, we ve talked about studies in which there is just one independent variable, such as violence of television program. You might randomly

STATA Tutorial Professor Erdinç Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software 1.Wald Test Wald Test is used

ESTIMATING AVERAGE TREATMENT EFFECTS: IV AND CONTROL FUNCTIONS, II Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

ESTIMATING AVERAGE TREATMENT EFFECTS: IV AND CONTROL FUNCTIONS, II Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Quantile Treatment Effects 2. Control Functions

Nonlinear Regression Functions. SW Ch 8 1/54/

Nonlinear Regression Functions SW Ch 8 1/54/ The TestScore STR relation looks linear (maybe) SW Ch 8 2/54/ But the TestScore Income relation looks nonlinear... SW Ch 8 3/54/ Nonlinear Regression General

Addressing Alternative. Multiple Regression. 17.871 Spring 2012

Addressing Alternative Explanations: Multiple Regression 17.871 Spring 2012 1 Did Clinton hurt Gore example Did Clinton hurt Gore in the 2000 election? Treatment is not liking Bill Clinton 2 Bivariate

Discussion Section 4 ECON 139/239 2010 Summer Term II

Discussion Section 4 ECON 139/239 2010 Summer Term II 1. Let s use the CollegeDistance.csv data again. (a) An education advocacy group argues that, on average, a person s educational attainment would increase

MODELING AUTO INSURANCE PREMIUMS Brittany Parahus, Siena College INTRODUCTION The findings in this paper will provide the reader with a basic knowledge and understanding of how Auto Insurance Companies

College Education Matters for Happier Marriages and Higher Salaries ----Evidence from State Level Data in the US

College Education Matters for Happier Marriages and Higher Salaries ----Evidence from State Level Data in the US Anonymous Authors: SH, AL, YM Contact TF: Kevin Rader Abstract It is a general consensus

Lecture 15. Endogeneity & Instrumental Variable Estimation

Lecture 15. Endogeneity & Instrumental Variable Estimation Saw that measurement error (on right hand side) means that OLS will be biased (biased toward zero) Potential solution to endogeneity instrumental

Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

Rockefeller College University at Albany

Rockefeller College University at Albany PAD 705 Handout: Hypothesis Testing on Multiple Parameters In many cases we may wish to know whether two or more variables are jointly significant in a regression.

Measurement Error 2: Scale Construction (Very Brief Overview) Page 1

Measurement Error 2: Scale Construction (Very Brief Overview) Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 22, 2015 This handout draws heavily from Marija

Forecasting in STATA: Tools and Tricks

Forecasting in STATA: Tools and Tricks Introduction This manual is intended to be a reference guide for time series forecasting in STATA. It will be updated periodically during the semester, and will be

Econ 371 Problem Set #3 Answer Sheet

Econ 371 Problem Set #3 Answer Sheet 4.3 In this question, you are told that a OLS regression analysis of average weekly earnings yields the following estimated model. AW E = 696.7 + 9.6 Age, R 2 = 0.023,

Testing and Interpreting Interactions in Regression In a Nutshell

Testing and Interpreting Interactions in Regression In a Nutshell The principles given here always apply when interpreting the coefficients in a multiple regression analysis containing interactions. However,

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Statistics Quiz Correlation and Regression -- ANSWERS 1. Temperature and air pollution are known to be correlated. We collect data from two laboratories, in Boston and Montreal. Boston makes their measurements

From this it is not clear what sort of variable that insure is so list the first 10 observations.

MNL in Stata We have data on the type of health insurance available to 616 psychologically depressed subjects in the United States (Tarlov et al. 1989, JAMA; Wells et al. 1989, JAMA). The insurance is

Lectures 8, 9 & 10. Multiple Regression Analysis

Lectures 8, 9 & 0. Multiple Regression Analysis In which you learn how to apply the principles and tests outlined in earlier lectures to more realistic models involving more than explanatory variable and

Multinomial and Ordinal Logistic Regression

Multinomial and Ordinal Logistic Regression ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2

University of California, Berkeley Prof. Ken Chay Department of Economics Fall Semester, 005 ECON 14 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE # Question 1: a. Below are the scatter plots of hourly wages

Data Analysis Methodology 1

Data Analysis Methodology 1 Suppose you inherited the database in Table 1.1 and needed to find out what could be learned from it fast. Say your boss entered your office and said, Here s some software project

International Statistical Institute, 56th Session, 2007: Phil Everson

Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: peverso1@swarthmore.edu 1. Introduction

We extended the additive model in two variables to the interaction model by adding a third term to the equation.

Quadratic Models We extended the additive model in two variables to the interaction model by adding a third term to the equation. Similarly, we can extend the linear model in one variable to the quadratic

When to use Excel. When NOT to use Excel 9/24/2014

Analyzing Quantitative Assessment Data with Excel October 2, 2014 Jeremy Penn, Ph.D. Director When to use Excel You want to quickly summarize or analyze your assessment data You want to create basic visual

Factor B: Curriculum New Math Control Curriculum (B (B 1 ) Overall Mean (marginal) Females (A 1 ) Factor A: Gender Males (A 2) X 21

1 Factorial ANOVA The ANOVA designs we have dealt with up to this point, known as simple ANOVA or oneway ANOVA, had only one independent grouping variable or factor. However, oftentimes a researcher has

is paramount in advancing any economy. For developed countries such as

Introduction The provision of appropriate incentives to attract workers to the health industry is paramount in advancing any economy. For developed countries such as Australia, the increasing demand for

Examples of Multiple Linear Regression Models

ECON *: Examples of Multple Regresson Models Examples of Multple Lnear Regresson Models Data: Stata tutoral data set n text fle autoraw or autotxt Sample data: A cross-sectonal sample of 7 cars sold n

Lab 5 Linear Regression with Within-subject Correlation. Goals: Data: Use the pig data which is in wide format:

Lab 5 Linear Regression with Within-subject Correlation Goals: Data: Fit linear regression models that account for within-subject correlation using Stata. Compare weighted least square, GEE, and random

Linear Regression Models with Logarithmic Transformations

Linear Regression Models with Logarithmic Transformations Kenneth Benoit Methodology Institute London School of Economics kbenoit@lse.ac.uk March 17, 2011 1 Logarithmic transformations of variables Considering

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

MULTIPLE REGRESSION WITH CATEGORICAL DATA

DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 86 MULTIPLE REGRESSION WITH CATEGORICAL DATA I. AGENDA: A. Multiple regression with categorical variables. Coding schemes. Interpreting

False. Model 2 is not a special case of Model 1, because Model 2 includes X5, which is not part of Model 1. What she ought to do is estimate

Sociology 59 - Research Statistics I Final Exam Answer Key December 6, 00 Where appropriate, show your work - partial credit may be given. (On the other hand, don't waste a lot of time on excess verbiage.)

Regression Analysis: A Complete Example

Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

Using Stata for Categorical Data Analysis

Using Stata for Categorical Data Analysis NOTE: These problems make extensive use of Nick Cox s tab_chi, which is actually a collection of routines, and Adrian Mander s ipf command. From within Stata,

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

Using Minitab for Regression Analysis: An extended example

Using Minitab for Regression Analysis: An extended example The following example uses data from another text on fertilizer application and crop yield, and is intended to show how Minitab can be used to

11. Analysis of Case-control Studies Logistic Regression

Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

25 Working with categorical data and factor variables

25 Working with categorical data and factor variables Contents 25.1 Continuous, categorical, and indicator variables 25.1.1 Converting continuous variables to indicator variables 25.1.2 Converting continuous

Chapter 4 and 5 solutions

Chapter 4 and 5 solutions 4.4. Three different washing solutions are being compared to study their effectiveness in retarding bacteria growth in five gallon milk containers. The analysis is done in a laboratory,

SPSS Resources. 1. See website (readings) for SPSS tutorial & Stats handout

Analyzing Data SPSS Resources 1. See website (readings) for SPSS tutorial & Stats handout Don't have your own copy of SPSS? 1. Use the libraries to analyze your data 2. Download a trial version of SPSS

3: Graphing Data. Objectives

3: Graphing Data Objectives Create histograms, box plots, stem-and-leaf plots, pie charts, bar graphs, scatterplots, and line graphs Edit graphs using the Chart Editor Use chart templates SPSS has the

Handling missing data in Stata: a whirlwind tour

Handling missing data in Stata: a whirlwind tour 2012 Italian Stata Users Group Meeting Jonathan Bartlett www.missingdata.org.uk 20th September 2012 1/55 Outline The problem of missing data and a principled

Title. Syntax. stata.com. fp Fractional polynomial regression. Estimation

Title stata.com fp Fractional polynomial regression Syntax Menu Description Options for fp Options for fp generate Remarks and examples Stored results Methods and formulas Acknowledgment References Also

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

A Picture Really Is Worth a Thousand Words

4 A Picture Really Is Worth a Thousand Words Difficulty Scale (pretty easy, but not a cinch) What you'll learn about in this chapter Why a picture is really worth a thousand words How to create a histogram

Moderation. Moderation

Stats - Moderation Moderation A moderator is a variable that specifies conditions under which a given predictor is related to an outcome. The moderator explains when a DV and IV are related. Moderation

Chapter 7 Section 7.1: Inference for the Mean of a Population

Chapter 7 Section 7.1: Inference for the Mean of a Population Now let's look at a similar situation Take an SRS of size n Normal Population: N(μ, σ). Both μ and σ are unknown parameters. Unlike what we used

Econometrics Problem Set #3

Econometrics Problem Set #3 Nathaniel Higgins nhiggins@jhu.edu Assignment The assignment was to read chapter 3 and hand in answers to the following problems at the end of the chapter: C3.1 C3.8. C3.1 A

How to set the main menu of STATA to default factory settings standards

University of Pretoria Data analysis for evaluation studies Examples in STATA version 11 List of data sets b1.dta (To be created by students in class) fp1.xls (To be provided to students) fp1.txt (To be

Quick Stata Guide by Liz Foster

by Liz Foster Table of Contents Part 1: 1 describe 1 generate 1 regress 3 scatter 4 sort 5 summarize 5 table 6 tabulate 8 test 10 ttest 11 Part 2: Prefixes and Notes 14 by var: 14 capture 14 use of the

GETTING STARTED: STATA & R BASIC COMMANDS ECONOMETRICS II. Stata Output Regression of wages on education

GETTING STARTED: STATA & R BASIC COMMANDS ECONOMETRICS II Stata Output Regression of wages on education. sum wage educ Variable Obs Mean Std. Dev. Min Max -------------+--------------------------------------------------------

4. Descriptive Statistics: Measures of Variability and Central Tendency

4. Descriptive Statistics: Measures of Variability and Central Tendency Objectives Calculate descriptive for continuous and categorical data Edit output tables Although measures of central tendency and

Implementation Committee for Gender Based Salary Adjustments (as identified in the Pay Equity Report, 2005)

Implementation Committee for Gender Based Salary Adjustments (as identified in the Pay Equity Report, 2005) Final Report March 2006 Implementation Committee for Gender Based Salary Adjustments (as identified

Development of the nomolog program and its evolution

Development of the nomolog program and its evolution Towards the implementation of a nomogram generator for the Cox regression Alexander Zlotnik, Telecom.Eng. Víctor Abraira Santos, PhD Ramón y Cajal University

DETERMINANTS OF CAPITAL ADEQUACY RATIO IN SELECTED BOSNIAN BANKS

DETERMINANTS OF CAPITAL ADEQUACY RATIO IN SELECTED BOSNIAN BANKS Nađa DRECA International University of Sarajevo nadja.dreca@students.ius.edu.ba Abstract The analysis of a data set of observation for 10

How To Run Statistical Tests in Excel

How To Run Statistical Tests in Excel Microsoft Excel is your best tool for storing and manipulating data, calculating basic descriptive statistics such as means and standard deviations, and conducting

The Dummy's Guide to Data Analysis Using SPSS

The Dummy's Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Reserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests

Regression Analysis: Basic Concepts

The simple linear model Regression Analysis: Basic Concepts Allin Cottrell Represents the dependent variable, y i, as a linear function of one independent variable, x i, subject to a random disturbance

The importance of graphing the data: Anscombe's regression examples

The importance of graphing the data: Anscombe's regression examples Bruce Weaver Northern Health Research Conference Nipissing University, North Bay May 30-31, 2008 B. Weaver, NHRC 2008 1 The Objective

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One-Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only

Regression of Systolic Blood Pressure on Age, Weight & Cholesterol

Regression of Systolic Blood Pressure on Age, Weight & Cholesterol 1 * bp.sas; 2 options ls=120 ps=75 nocenter nodate; 3 title Regression of Systolic Blood Pressure on Age, Weight & Cholesterol ; 4 * BP

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

Analyses on Hurricane Archival Data June 17, 2014

Analyses on Hurricane Archival Data June 17, 2014 This report provides detailed information about analyses of archival data in our PNAS article http://www.pnas.org/content/early/2014/05/29/1402786111.abstract

The Numbers Behind the MLB Anonymous Students: AD, CD, BM; (TF: Kevin Rader)

The Numbers Behind the MLB Anonymous Students: AD, CD, BM; (TF: Kevin Rader) Abstract This project measures the effects of various baseball statistics on the win percentage of all the teams in MLB. Data

xtmixed & denominator degrees of freedom: myth or magic

xtmixed & denominator degrees of freedom: myth or magic 2011 Chicago Stata Conference Phil Ender UCLA Statistical Consulting Group July 2011 Phil Ender xtmixed & denominator degrees of freedom: myth or

The average hotel manager recognizes the criticality of forecasting. However, most

Introduction The average hotel manager recognizes the criticality of forecasting. However, most managers are either frustrated by complex models researchers constructed or appalled by the amount of time

HURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009

HURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Introduction 2. A General Formulation 3. Truncated Normal Hurdle Model 4. Lognormal

ANOVA. February 12, 2015

ANOVA February 12, 2015 1 ANOVA models Last time, we discussed the use of categorical variables in multivariate regression. Often, these are encoded as indicator columns in the design matrix. In [1]: %%R

1.1. Simple Regression in Excel (Excel 2010).

1.1. Simple Regression in Excel (Excel 2010). To get the Data Analysis tool, first click on File > Options > Add-Ins > Go > Select Data Analysis Toolpack & Toolpack VBA. Data Analysis is now available under

STATA FUNDAMENTALS FOR MIDDLEBURY COLLEGE ECONOMICS STUDENTS

STATA FUNDAMENTALS FOR MIDDLEBURY COLLEGE ECONOMICS STUDENTS BY EMILY FORREST AUGUST 2008 CONTENTS INTRODUCTION STATA SYNTAX DATASET FILES OPENING A DATASET FROM EXCEL TO STATA WORKING WITH LARGE DATASETS

One-Way Analysis of Variance

One-Way Analysis of Variance Note: Much of the math here is tedious but straightforward. We'll skim over it in class but you should be sure to ask questions if you don't understand it. I. Overview A. We

1/27/2013. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 Introduce moderated multiple regression Continuous predictor continuous predictor Continuous predictor categorical predictor Understand

High School Graduation Rates in Maryland Technical Appendix

High School Graduation Rates in Maryland Technical Appendix Data All data for the brief were obtained from the National Center for Education Statistics Common Core of Data (CCD). This data represents the

Quantitative Methods for Economics Tutorial 9. Katherine Eyal

Quantitative Methods for Economics Tutorial 9 Katherine Eyal TUTORIAL 9 4 October 2010 ECO3021S Part A: Problems 1. In Problem 2 of Tutorial 7, we estimated the equation ŝleep = 3,638.25 − 0.148 totwrk

Econometrics Problem Set #2

Econometrics Problem Set #2 Nathaniel Higgins nhiggins@jhu.edu Assignment The homework assignment was to read chapter 2 and hand in answers to the following problems at the end of the chapter: 2.1 2.5

THE EFFECT ON STUDENT PERFORMANCE OF WEB-BASED LEARNING AND HOMEWORK IN MICROECONOMICS

THE EFFECT ON STUDENT PERFORMANCE OF WEB-BASED LEARNING AND HOMEWORK IN MICROECONOMICS Jay A. Johnson, Southeastern Louisiana University Russell McKenzie, Southeastern Louisiana University ABSTRACT