1. A two category dummy variable without interactions.


Vartanian: Data Analysis 541
Dummy Variables in Regression Models
d:\wp61\probsets\lect2.phd\dummyvar.phd

1. A two category dummy variable without interactions.

You want to test whether the mean values for males and females differ from one another. Since this is an OLS model, we need an interval level dependent variable. We'll use income. You create a single dummy variable that indicates whether the individual is a male or a female. $Z_1$ is a (0,1) variable. If $Z_1 = 1$ then the observation is a male. If $Z_1 = 0$ then the observation is a female.

(1) $Y = a + c_1 Z_1$

where $c_1$ is the parameter estimate for the variable $Z_1$. $c_1$ tells you the difference in income levels between the two groups. If $c_1$ is positive, males have a mean income level that is $c_1$ larger than females. If $c_1$ is negative, males have a mean income level that is $|c_1|$ less than females. Why? Substitute the different values of $Z_1$ into the equation to determine the mean income levels for the two groups. For males, substitute $Z_1 = 1$ in equation 1 above. For females, substitute $Z_1 = 0$. Doing this gives you a mean income level for each group.

Why not include variables for both males and females? Because if you did, you would have perfect multicollinearity within your model. Why? If you know that the person is a male, you have perfect knowledge that the person is not a female. Thus, if you code your variables as 0,1 for male and female, you would have a table with the columns obs, male, and female in which male = 1 always pairs with female = 0, and male = 0 always pairs with female = 1. In other words, when you know the value of either male or female, you can perfectly predict the value of the other variable. Thus, the correlation between these two variables is perfect: $r = -1$, so $r^2 = 1$. We will not be able to determine a SE for these regression coefficients because

$SE_{b_{y1.2}} = \sqrt{\dfrac{S^2_{y.12}}{\left[\sum X_1^2 - \frac{(\sum X_1)^2}{n}\right]\left(1 - r_{12}^2\right)}}$

and the denominator is zero when $r_{12}^2 = 1$. Also, we cannot determine b coefficients since we cannot distinguish between the two variables. With perfect collinearity, SAS and other computer programs will reject one of the variables but will determine the b's and SEs for the other perfectly correlated variable.

To determine the a (or the intercept) and c coefficients for males,

(2) $Y = a + c_1 \cdot 1$, so $Y = a + c_1$

For females,

(3) $Y = a + c_1 \cdot 0$, so $Y = a$

The difference between the two groups is $c_1$.

Example: Let's say that we have only 6 observations in a sample, each with a value for Sex and Income ($), so that it will be easy to see how dummy variables work. From this sample, the mean income levels are:

Females: 300
Males: 200

In other words, there is a $100 difference in income levels between the two groups. If $Z_1 = 1$ for males and $Z_1 = 0$ for females, then the equation will have $c_1 = -100$, since the difference in mean income levels is 100 and male mean income is lower than female mean income. What will a (or the intercept) be? We know the mean income levels, so a, the intercept, must make the mean income levels come out to the 300 and 200 levels above.

Thus, for females, $Y = a - 100 Z_1$. We know that $Z_1 = 0$ for females, so in order to make $Y = 300$, $a = 300$.

For males, $Y = a - 100 Z_1$. We know that $Z_1 = 1$ for males, so $200 = a - 100$ and again $a = 300$.

Of course, a will always turn out to be the same value in both equations (since they're the same equation with different values for $Z_1$). This is the logic of dummy variables. When you actually use dummy variables, you will often use them with other variables within the model, so the values of a and $c_1$ will not always be so evident. Using dummy variables in an OLS regression model gives you similar information to using group t-tests or anova models with two groups. The significance levels will be exactly the same using any of these three statistics (equal-variance t-test, anova, or dummy variables).

2. Using categorical variables with more than two categories as dummy variables.

In this second situation, you are examining the differences in income level among people of different races. Let's assume that there are people of 4 different races you're examining: Whites, African Americans, Hispanics and Asian Americans. You create 4 separate dummy variables, one for each race. Let's call these variables $z_1$, $z_2$, $z_3$ and $z_4$. If you're white, $z_1 = 1$, otherwise $z_1 = 0$; if you're African American, $z_2 = 1$, otherwise $z_2 = 0$; if you're Hispanic, $z_3 = 1$, otherwise $z_3 = 0$; and if you're Asian American, $z_4 = 1$, otherwise $z_4 = 0$.
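The coding just described can be sketched in a few lines of Python. This is only an illustration: the list of race labels is made up, and NumPy is an assumption (the notes themselves refer to SAS). The sketch also checks the multicollinearity point from section 1, since the four dummies always add up to the intercept column.

```python
import numpy as np

# Hypothetical sample: one race label per observation (illustrative only).
races = ["White", "African American", "Hispanic", "Asian American",
         "White", "Hispanic"]
categories = ["White", "African American", "Hispanic", "Asian American"]

# z[i, j] = 1 if observation i belongs to category j, else 0 (columns z1..z4).
z = np.array([[1 if r == c else 0 for c in categories] for r in races])

intercept = np.ones((len(races), 1))

# Intercept plus all four dummies: the dummies sum to the intercept column,
# so the design matrix has fewer independent columns than columns.
full = np.hstack([intercept, z])

# Intercept plus three dummies (z4, Asian American, left out): full column rank.
reduced = np.hstack([intercept, z[:, :3]])

print("full:   ", full.shape[1], "columns, rank", np.linalg.matrix_rank(full))
print("reduced:", reduced.shape[1], "columns, rank", np.linalg.matrix_rank(reduced))
```

The full matrix has five columns but only rank 4, which is why one of the dummy variables has to be left out of the regression.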
What the regression equation will do is examine the differences in income between each of the groups and an excluded group. You exclude one of the groups above from the regression analysis, and the regression results, or the parameters you find, will be the difference in income between the group you're examining and the excluded group.

Let's say that we exclude Asian Americans from the regression analysis. Our regression equation would be,

$Y = a + c_1 z_1 + c_2 z_2 + c_3 z_3$

The value for $c_1$ will be the difference in mean income between whites and Asian Americans. $c_2$ will be the difference in mean income between African Americans and Asian Americans, and $c_3$ will be the difference in mean income between Hispanics and Asian Americans. If a c coefficient is positive, the group being examined has a higher level of income than the excluded group, Asian Americans.

Example: The data have two columns, Race and Income. The mean income ($) levels are:

Whites              200
African Americans   700
Hispanics           400
Asian Americans     500

Thus, the values for the coefficients will be:

$c_1 = -300$
$c_2 = 200$
$c_3 = -100$
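These coefficient values can be checked numerically. The sketch below is an assumption-laden illustration: the eight income values are invented so that the group means equal the four means in the example (the original data table is not reproduced here), and the fit uses NumPy's least-squares routine rather than the software used in the notes.

```python
import numpy as np

# Hypothetical incomes, chosen only so that the group means match the example:
# Whites 200, African Americans 700, Hispanics 400, Asian Americans 500.
income = np.array([150, 250, 650, 750, 350, 450, 450, 550], dtype=float)

# Group codes: 0 = White, 1 = African American, 2 = Hispanic,
# 3 = Asian American (the excluded group, so it gets no dummy column).
group = np.array([0, 0, 1, 1, 2, 2, 3, 3])

# Design matrix: intercept, z1 (White), z2 (African American), z3 (Hispanic).
X = np.column_stack(
    [np.ones(len(income))] + [(group == g).astype(float) for g in range(3)]
)

# Ordinary least squares fit of Y = a + c1*z1 + c2*z2 + c3*z3.
coef, *_ = np.linalg.lstsq(X, income, rcond=None)
a, c1, c2, c3 = coef
print("a, c1, c2, c3 =", np.round(coef, 6))
```

Up to floating-point rounding, the fit returns a = 500, c1 = -300, c2 = 200 and c3 = -100: the intercept is the mean of the excluded group and each c is a difference from that mean.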

$Y = a + c_1 z_1 + c_2 z_2 + c_3 z_3$

If we want to examine the mean income level for whites, we give $z_1$ a value of 1 and $z_2$ and $z_3$ values of 0.

$Y = a + c_1$

We know that the mean income level for whites is 200, so $200 = a - 300$. Therefore, $a = 500$. We could figure out the value of a from the other groups as well.

$Y = a + c_2$
$700 = a + 200$

Therefore, $a = 500$. The equation, assuming that Asian Americans is the excluded category, will be

$Y = 500 - 300 z_1 + 200 z_2 - 100 z_3$

3. The interaction between a two category dummy variable and an interval scale variable in a regression model.

You're examining the effects of a two category race variable and age on income. $X_1$ is the variable for age and $Z_1$ is the variable for being white ($Z_1 = 1$ if you're white; $Z_1 = 0$ otherwise). The equation will look like the following:

$Y_p = a + b_1 X_1 + c_1 Z_1 + d_1 X_1 Z_1$

$b_1$ is your coefficient estimate for age, $c_1$ is the difference in income between whites and people of other races (at $X_1 = 0$), and $d_1$ is the difference in the effect of age between whites and people of other races. If the coefficient estimate for the interaction variable is significant, the interval scale variable, age, has a different effect on whites relative to those of other races. Let's say we're examining income (the DV) and have the following output.

Parameter Estimates

The table has the columns Variable, DF, Parameter Estimate, Standard Error, T for H0: Parameter=0, and Prob > |T|, with one row each for INTERCEP, WHITE, AGE and WHITEAGE.

Here, $a = 20830$ (INTERCEP), $b_1$ is the AGE estimate, $c_1 = 27879$ (WHITE), and $d_1$ is the WHITEAGE estimate, which is negative.

The $d_1$ coefficient is telling us that age has a more negative effect for whites than it does for people of other races. As the age of white people increases by 1 year, their income falls by $|d_1|$ dollars more (or rises by $|d_1|$ dollars less) than it does for people of other races. Thus, the $d_1$ coefficient estimate tells you the differential effect that the interval scale independent variable (age) has on the dependent variable (income) for the included group in the dummy variable (whites) versus the excluded group (non-whites). Since the estimate for WHITEAGE is negative, the effect of age on whites is more negative than the effect of age on races other than white. If this coefficient estimate were positive and significant, the effect of age on whites would be more positive than the effect of age on people of other races.

To determine the slope and intercept coefficients for members of the different races (whites and non-whites), we do the following:

For Whites:

$Y_p = 20830 + b_1 X + 27879 Z + d_1 X Z$

We can set $Z = 1$, so

$Y_p = 20830 + b_1 X + 27879 + d_1 X$

We can now put like terms together,

$Y_p = (20830 + 27879) + (b_1 + d_1) X$
$Y_p = 48709 + (b_1 + d_1) X$

So the slope coefficient is $b_1 + d_1$ and the intercept is 48709 for whites.

For Races other than White:

$Y_p = 20830 + b_1 X + 27879 Z + d_1 X Z$

Set $Z = 0$.

$Y_p = 20830 + b_1 X$

So the slope is $b_1$ and the intercept is 20830 for races other than white.

4. The interaction between a four category set of dummy variables and an interval variable in a regression model.

Let's say that we're examining the effects of race and age on income. $x_1$ will be the variable for age. The model without any interactions would be

$Y = a + b_1 x_1 + c_1 z_1 + c_2 z_2 + c_3 z_3$

If we believed that there was some interaction between race and age, we could interact these variables and the model would look like the following,

$Y = a + b_1 x_1 + c_1 z_1 + c_2 z_2 + c_3 z_3 + d_{11} x_1 z_1 + d_{12} x_1 z_2 + d_{13} x_1 z_3$

You will again be excluding one of the categorical variables as well as the corresponding interaction variable. $d_{11}$, $d_{12}$, and $d_{13}$ are the coefficient estimates for the interactions between age and white, age and African American, and age and Hispanic, respectively. Each of the d coefficient estimates (the interaction coefficient estimates) indicates whether the effect of the interval scale variable is different for members of that race relative to the excluded group. Thus, if $d_{11}$ is positive and significant, the effect of age on income (the dependent variable) is greater for whites than for Asian Americans (the excluded group). If $d_{12}$ is negative and significant, the effect of age on income is less for African Americans than for Asian Americans.

You now want to determine the slope for the age variable and the intercept for each of the different races. You'll do this by setting the appropriate z variable equal to 1 and the other z variables equal to 0.

For whites: Set $z_1 = 1$, $z_2 = 0$, $z_3 = 0$.

$Y = a + b_1 x_1 + c_1 + d_{11} x_1$

This is because the values of $z_2$ and $z_3$ are equal to 0 when we are examining whites, and the value of $z_1$ is 1.

For African Americans: Set $z_2 = 1$, $z_1 = 0$, $z_3 = 0$.

$Y = a + b_1 x_1 + c_2 + d_{12} x_1$

For Hispanics: Set $z_1 = 0$, $z_2 = 0$, and $z_3 = 1$.

$Y = a + b_1 x_1 + c_3 + d_{13} x_1$

For Asian Americans: Set $z_1 = 0$, $z_2 = 0$, and $z_3 = 0$.

$Y = a + b_1 x_1$

What we find is that once we set the z values equal to either 1 or 0, we are left with the c coefficients standing on their own. These c coefficients are simply constants. Since the c values are not being multiplied by an X variable, they are not slopes. We can add these constants to the intercept term, a, and come up with a new intercept term. We can also group together the coefficients that go with $x_1$ to determine a new slope coefficient for each of the different races. Each race will have a different slope and intercept.

For Whites: $Y = (a + c_1) + (b_1 + d_{11}) x_1$
The new intercept is $a + c_1$ and the new slope is $b_1 + d_{11}$.

For African Americans: $Y = (a + c_2) + (b_1 + d_{12}) x_1$
The new intercept is $a + c_2$ and the new slope is $b_1 + d_{12}$.

For Hispanics: $Y = (a + c_3) + (b_1 + d_{13}) x_1$
The new intercept is $a + c_3$ and the new slope is $b_1 + d_{13}$.

For Asian Americans: $Y = a + b_1 x_1$
The intercept is $a$ and the slope is $b_1$.
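The regrouping above is simple enough to write as a small helper. The sketch below is hypothetical throughout: the function name `group_lines` and every coefficient value are invented for illustration and do not come from any output in these notes. The same arithmetic covers the two category case in section 3, with a single included group.

```python
def group_lines(a, b1, c, d):
    """Return {group: (intercept, slope)} for each group's regression line.

    a  : intercept of the full model
    b1 : coefficient on the interval variable (age)
    c  : dict mapping each included group to its dummy coefficient
    d  : dict mapping each included group to its interaction coefficient
    """
    # Included groups: the dummy coefficient shifts the intercept and the
    # interaction coefficient shifts the slope.
    lines = {name: (a + c[name], b1 + d[name]) for name in c}
    # Excluded group: all z variables are 0, so only a and b1 remain.
    lines["excluded group"] = (a, b1)
    return lines


# Hypothetical coefficient estimates (not taken from any real output).
a, b1 = 1000.0, 50.0
c = {"White": 200.0, "African American": -100.0, "Hispanic": 50.0}
d = {"White": -5.0, "African American": 10.0, "Hispanic": 0.0}

lines = group_lines(a, b1, c, d)
for name, (intercept, slope) in lines.items():
    print(f"{name}: intercept = {intercept}, slope = {slope}")
```

With these made-up values, the line for whites has intercept 1000 + 200 and slope 50 - 5, while the excluded group keeps the original intercept and slope.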


More information
