Chapter 7: Dummy variable regression
- Caitlin Stone
- 9 years ago
Outline

Why include a qualitative independent variable?
Simplest model
- Simplest case
- Example (continued)
- Possible solution: separate regressions
- Independent variable vs. regressor
- Common slope model
- Testing
More general models
- More than one quantitative independent variable
- Polytomous independent variables
- Example (continued)
- Testing with polytomous independent variable
- R commands
- More than one qualitative independent variable
Interaction
- Definition
- Interaction vs. correlation
- Constructing regressors
- Testing
- Principle of marginality
- Polytomous independent variables
- Hypothesis tests
- Standardized estimates
- Interaction between categorical variables

Why include a qualitative independent variable?
- We are interested in the effect of a qualitative independent variable (for example: do men earn more than women?).
- We want to better predict or describe the dependent variable. We can make the errors smaller by including variables like gender, race, etc.
- Qualitative variables may be confounding factors. Omitting them may cause biased estimates of other coefficients.

Simplest model

Simplest case
Example:
- Dependent variable: income
- One quantitative independent variable: education
- One dichotomous (two-valued) independent variable: gender
- Assume the effect of either independent variable is the same, regardless of the value of the other variable (additivity, parallel regression lines). See pictures from book.
- Usual assumptions on the statistical errors: independent, zero mean, constant variance, normally distributed, and fixed X's or X independent of the statistical errors.

Example (continued)
Suppose that we are interested in the effect of education on income, and that gender has an effect on income. See pictures from book.
- Scenario 1: Gender and education are uncorrelated.
  - Gender is not a confounding factor.
  - Omitting gender gives a correct slope estimate, but larger errors.
- Scenario 2: Gender and education are correlated.
  - Gender is a confounding factor.
  - Omitting gender gives a biased slope estimate, and larger errors.
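
The two scenarios can be checked with a small simulation. The sketch below is illustrative only: the sample size, the "true" coefficients (slope 2 for education, shift 5 for gender), and all variable names are assumptions, not values from the slides or the book.

```r
# Simulated illustration; every number here is made up for the example.
set.seed(1)
n     <- 2000
beta  <- 2   # assumed true effect of education on income
gamma <- 5   # assumed true effect of gender (men vs. women)

gender <- rbinom(n, 1, 0.5)   # 1 = men, 0 = women

# Scenario 1: education does not depend on gender (uncorrelated).
educ1   <- 12 + rnorm(n, sd = 2)
income1 <- 10 + beta * educ1 + gamma * gender + rnorm(n, sd = 3)

# Scenario 2: men have more education on average (correlated).
educ2   <- 12 + 3 * gender + rnorm(n, sd = 2)
income2 <- 10 + beta * educ2 + gamma * gender + rnorm(n, sd = 3)

# Slope of education when gender is omitted from the model.
slope1_omit <- unname(coef(lm(income1 ~ educ1))["educ1"])
slope2_omit <- unname(coef(lm(income2 ~ educ2))["educ2"])

# Slope of education in scenario 2 when gender is included.
slope2_full <- unname(coef(lm(income2 ~ educ2 + gender))["educ2"])

print(c(scenario1_omit = slope1_omit,
        scenario2_omit = slope2_omit,
        scenario2_full = slope2_full))
```

Under these assumptions the omitted-gender slope should stay close to 2 in scenario 1 and drift well above 2 in scenario 2, while including gender should bring it back.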

Possible solution: separate regressions
Fit separate regressions for men and women.
Disadvantages:
- How do we test for the effect of gender?
- If it is reasonable to assume that the regressions for men and women are parallel, then it is more efficient to use all data to estimate the common slope.

Independent variable vs. regressor
$Y$ = income, $X$ = education, $D$ = regressor for gender:

$D_i = \begin{cases} 1 & \text{for men} \\ 0 & \text{for women} \end{cases}$

- Independent variable = real variable of interest
- Regressor = variable put in the regression model
- In general, regressors are functions of the independent variables. Sometimes regressors are equal to the independent variables.

Common slope model

$Y_i = \alpha + \beta X_i + \gamma D_i + \epsilon_i$

For women ($D_i = 0$):
$Y_i = \alpha + \beta X_i + \gamma \cdot 0 + \epsilon_i = \alpha + \beta X_i + \epsilon_i$

For men ($D_i = 1$):
$Y_i = \alpha + \beta X_i + \gamma \cdot 1 + \epsilon_i = (\alpha + \gamma) + \beta X_i + \epsilon_i$

See picture from book.
- What are the interpretations of $\alpha$, $\beta$ and $\gamma$?
- What happens if we code $D = 1$ for women and $D = 0$ for men?
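
The recoding question can be explored numerically. This is a minimal sketch on simulated data; the variable names (`D_men`, `D_women`) and the generating coefficients are assumptions made for the illustration.

```r
# Simulated data; coefficients and names are hypothetical.
set.seed(2)
n     <- 200
X     <- rnorm(n, mean = 12, sd = 2)     # education
D_men <- rbinom(n, 1, 0.5)               # 1 = men, 0 = women
Y     <- 10 + 2 * X + 5 * D_men + rnorm(n, sd = 3)

# Common slope model with women as the baseline category.
fit_men <- lm(Y ~ X + D_men)

# Same model with the coding reversed: 1 = women, 0 = men.
D_women   <- 1 - D_men
fit_women <- lm(Y ~ X + D_women)

print(coef(fit_men))
print(coef(fit_women))
```

Both fits describe the same pair of parallel lines. Only the baseline changes: the dummy coefficient flips sign, and the intercept moves from the women's line to the men's line.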

Testing
- Test the partial effect of gender: $H_0: \gamma = 0$, $H_a: \gamma \neq 0$. Same as before: compute the t-statistic or the incremental F-test.
- Test the partial effect of education: $H_0: \beta = 0$, $H_a: \beta \neq 0$. Same as before: compute the t-statistic or the incremental F-test.
- Cystic fibrosis example.

More general models

More than one quantitative independent variable
All methods go through, as long as we assume parallel regression surfaces.

Model:
$Y_i = \alpha + \beta_1 X_{i1} + \cdots + \beta_k X_{ik} + \gamma D_i + \epsilon_i$

Women ($D_i = 0$):
$Y_i = \alpha + \beta_1 X_{i1} + \cdots + \beta_k X_{ik} + \gamma \cdot 0 + \epsilon_i = \alpha + \beta_1 X_{i1} + \cdots + \beta_k X_{ik} + \epsilon_i$

Men ($D_i = 1$):
$Y_i = \alpha + \beta_1 X_{i1} + \cdots + \beta_k X_{ik} + \gamma \cdot 1 + \epsilon_i = (\alpha + \gamma) + \beta_1 X_{i1} + \cdots + \beta_k X_{ik} + \epsilon_i$

Interpretation of $\alpha, \beta_1, \ldots, \beta_k, \gamma$.
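
For a single dummy regressor, the t-test and the incremental F-test are equivalent: the F-statistic is the square of the t-statistic, and the p-values coincide. The sketch below shows this on simulated data; the data and names are assumptions, not the cystic fibrosis example from the slides.

```r
# Simulated data with two quantitative variables and one dummy regressor.
set.seed(3)
n  <- 300
X1 <- rnorm(n)
X2 <- rnorm(n)
D  <- rbinom(n, 1, 0.5)
Y  <- 1 + 2 * X1 - 1 * X2 + 0.5 * D + rnorm(n)

full <- lm(Y ~ X1 + X2 + D)   # model including the dummy regressor
null <- lm(Y ~ X1 + X2)       # model under H0: gamma = 0

# t-test for gamma, taken from the coefficient table of the full model.
t_D <- summary(full)$coefficients["D", "t value"]
p_t <- summary(full)$coefficients["D", "Pr(>|t|)"]

# Incremental F-test comparing the null and full models.
inc <- anova(null, full)
F_D <- inc$F[2]
p_F <- inc[2, "Pr(>F)"]

print(c(t_squared = t_D^2, F = F_D, p_t = p_t, p_F = p_F))
```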

Polytomous independent variables
Qualitative variable with more than two categories.
Example: Duncan data
- Dependent variable: $Y$ = prestige
- Quantitative independent variables: $X_1$ = income and $X_2$ = education
- Qualitative independent variable: type (bc, prof, wc)
- $D_1$ and $D_2$ are regressors for type:

    Type                  D_1   D_2
    Blue collar (bc)       0     0
    Professional (prof)    1     0
    White collar (wc)      0     1

- If there are $p$ categories, use $p - 1$ dummy regressors.
- What happens if we use $p$ regressors?

Example (continued)

$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \gamma_1 D_1 + \gamma_2 D_2 + \epsilon$

Blue collar ($D_{i1} = 0$ and $D_{i2} = 0$):
$Y_i = \alpha + \beta_1 X_{i1} + \beta_2 X_{i2} + \gamma_1 \cdot 0 + \gamma_2 \cdot 0 + \epsilon_i = \alpha + \beta_1 X_{i1} + \beta_2 X_{i2} + \epsilon_i$

Professional ($D_{i1} = 1$ and $D_{i2} = 0$):
$Y_i = \alpha + \beta_1 X_{i1} + \beta_2 X_{i2} + \gamma_1 \cdot 1 + \gamma_2 \cdot 0 + \epsilon_i = (\alpha + \gamma_1) + \beta_1 X_{i1} + \beta_2 X_{i2} + \epsilon_i$

White collar ($D_{i1} = 0$ and $D_{i2} = 1$):
$Y_i = \alpha + \beta_1 X_{i1} + \beta_2 X_{i2} + \gamma_1 \cdot 0 + \gamma_2 \cdot 1 + \epsilon_i = (\alpha + \gamma_2) + \beta_1 X_{i1} + \beta_2 X_{i2} + \epsilon_i$
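
The dummy coding in the table, and the question of using $p$ regressors, can be inspected directly in R. This is a small sketch with a toy `type` vector and a random response `y`; both are made up for the illustration and are not the Duncan data.

```r
# Toy factor with the three categories from the table (not the real Duncan data).
type <- factor(c("bc", "prof", "wc", "bc", "prof", "wc", "bc", "wc"))

# R's default coding: an intercept plus p - 1 = 2 dummy regressors,
# with the first level ("bc") as the baseline category.
X_design <- model.matrix(~ type)
print(X_design)

# Using p = 3 dummy regressors together with an intercept:
D_bc   <- (type == "bc") * 1
D_prof <- (type == "prof") * 1
D_wc   <- (type == "wc") * 1

# The three dummies add up to the constant regressor, so they are
# perfectly collinear with the intercept.
print(D_bc + D_prof + D_wc)

set.seed(4)
y     <- rnorm(length(type))          # arbitrary response, for illustration only
fit_p <- lm(y ~ D_bc + D_prof + D_wc)
print(coef(fit_p))                    # one coefficient cannot be estimated (NA)
```

With the redundant regressor, `lm()` reports `NA` for one coefficient rather than failing; the model has only three estimable parameters for the three group means.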

Testing with polytomous independent variable
Test the partial effect of type, i.e., the effect of type controlling for income and education.
- $H_0: \gamma_1 = \gamma_2 = 0$
- $H_a$: at least one $\gamma_j \neq 0$, $j = 1, 2$.

Incremental F-test:
- Null model: $Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \epsilon$
- Full model: $Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \gamma_1 D_1 + \gamma_2 D_2 + \epsilon$

What do the individual p-values in summary(lm()) mean? First look at the F-test, then at the individual p-values.

R commands
Creating dummy variables by hand:

    D1 <- (type == "prof") * 1
    D2 <- (type == "wc") * 1
    m1 <- lm(prestige ~ education + income + D1 + D2)

Letting R do things automatically:

    m1 <- lm(prestige ~ education + income + type)
    m1 <- lm(prestige ~ education + income + factor(type))

The use of factor():
- factor() is not needed in this example, because the categories are coded in words: bc, prof, wc.
- It is essential to use factor() if the coding of the categories is numerical!
- To be safe, you can always use factor().

See example R code.
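
The warning about numerical coding, and the incremental F-test for type, can be sketched as follows. The data are simulated and only loosely mimic the Duncan variables; the numeric coding `type_num` (1 = bc, 2 = prof, 3 = wc) and all effect sizes are assumptions.

```r
# Simulated data; names and effect sizes are hypothetical.
set.seed(5)
n         <- 90
type_num  <- rep(1:3, each = n / 3)      # numeric coding: 1 = bc, 2 = prof, 3 = wc
education <- rnorm(n, mean = 50, sd = 10)
income    <- rnorm(n, mean = 40, sd = 10)
shift     <- c(0, 15, -10)[type_num]     # assumed category effects
prestige  <- 5 + 0.5 * education + 0.6 * income + shift + rnorm(n, sd = 5)

# Without factor(): type is treated as quantitative and gets a single slope.
m_wrong <- lm(prestige ~ education + income + type_num)

# With factor(): R creates two dummy regressors, as intended.
m_right <- lm(prestige ~ education + income + factor(type_num))

print(names(coef(m_wrong)))
print(names(coef(m_right)))

# Incremental F-test for the partial effect of type (H0: gamma_1 = gamma_2 = 0).
m_null <- lm(prestige ~ education + income)
f_type <- anova(m_null, m_right)
print(f_type)
```

The F-test has 2 numerator degrees of freedom, one per dummy regressor, which is why it is looked at before the individual p-values.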

More than one qualitative independent variable
Example: $Y$ = prestige, $X_1$ = income, $X_2$ = education,

    Type            D_1   D_2
    Blue collar      0     0
    Professional     1     0
    White collar     0     1

and

    Gender   D_3
    Women     0
    Men       1

$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \gamma_1 D_1 + \gamma_2 D_2 + \gamma_3 D_3 + \epsilon$

What is the equation for men with professional jobs? And for women with white collar jobs?

Interaction

Definition
- Two variables are said to interact in determining a dependent variable if the partial effect of one depends on the value of the other.
- So far we have only studied models without interaction.
- Interaction between a quantitative and a qualitative variable means that the regression surfaces are not parallel. See picture.
- Interaction between two qualitative variables means that the effect of one of the variables depends on the value of the other variable. Example: the effect of type of job on prestige is bigger for men than for women.
- Interaction between two quantitative variables is a bit harder to interpret, and we may consider that later.

Interaction vs. correlation
First, note that in general, the independent variables are not independent of each other.
- Correlation: independent variables are statistically related to each other.
- Interaction: the effect of one independent variable on the dependent variable depends on the value of the other independent variable.
- Two independent variables can interact whether or not they are correlated.
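
The distinction between interaction and correlation can be made concrete with two simulated data sets: one where the independent variables are correlated but do not interact, and one where they interact but are uncorrelated. All numbers and names below are assumptions chosen for the illustration.

```r
set.seed(6)
n <- 2000
D <- rbinom(n, 1, 0.5)                 # dummy regressor, e.g. 1 = men

# Case A: X and D are correlated, but there is no interaction (parallel lines).
X_a <- 10 + 3 * D + rnorm(n, sd = 2)
Y_a <- 1 + 2 * X_a + 5 * D + rnorm(n, sd = 3)

# Case B: X and D are uncorrelated, but they interact (slopes 2 and 3.5).
X_b <- 10 + rnorm(n, sd = 2)
Y_b <- 1 + 2 * X_b + 5 * D + 1.5 * X_b * D + rnorm(n, sd = 3)

# Correlation between the independent variables in each case.
cor_a <- cor(X_a, D)
cor_b <- cor(X_b, D)

# Estimated interaction coefficient (difference in slopes) in each case.
delta_a <- unname(coef(lm(Y_a ~ X_a * D))["X_a:D"])
delta_b <- unname(coef(lm(Y_b ~ X_b * D))["X_b:D"])

print(c(cor_a = cor_a, delta_a = delta_a, cor_b = cor_b, delta_b = delta_b))
```

Under these assumptions, case A should show a clear correlation with an interaction estimate near zero, and case B the reverse.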

Constructing regressors
$Y$ = income, $X$ = education, $D$ = dummy for gender

$Y_i = \alpha + \beta X_i + \gamma D_i + \delta (X_i D_i) + \epsilon_i$

Note that $XD$ is a new regressor. It is a function of $X$ and $D$, but not a linear function. Therefore we do not get perfect collinearity.

Women ($D_i = 0$):
$Y_i = \alpha + \beta X_i + \gamma \cdot 0 + \delta (X_i \cdot 0) + \epsilon_i = \alpha + \beta X_i + \epsilon_i$

Men ($D_i = 1$):
$Y_i = \alpha + \beta X_i + \gamma \cdot 1 + \delta (X_i \cdot 1) + \epsilon_i = (\alpha + \gamma) + (\beta + \delta) X_i + \epsilon_i$

Interpretation of $\alpha$, $\beta$, $\gamma$, $\delta$.

Testing
Testing for interaction is testing for a difference in slope between men and women: $H_0: \delta = 0$ and $H_a: \delta \neq 0$.
What is the difference between:
- the model with interaction, and
- fitting two separate regression lines for men and women?

Principle of marginality
- If interaction is significant, do not test or interpret main effects:
  - First test for the interaction effect.
  - If there is no interaction, test and interpret main effects.
- If interaction is included in the model, main effects should also be included.
- See pictures of models that violate the principle of marginality.
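
One way to explore the comparison between the interaction model and two separate regressions is to fit both on the same data. The sketch below uses simulated data with assumed coefficients; variable names are hypothetical.

```r
set.seed(7)
n <- 200
X <- rnorm(n, mean = 12, sd = 2)        # education
D <- rbinom(n, 1, 0.5)                  # 1 = men, 0 = women
Y <- 10 + 2 * X + 5 * D + 1 * X * D + rnorm(n, sd = 3)

# Model with interaction: main effects plus the product regressor X:D.
fit_int <- lm(Y ~ X + D + X:D)

# Two separate regressions, one per group.
fit_women <- lm(Y ~ X, subset = D == 0)
fit_men   <- lm(Y ~ X, subset = D == 1)

b <- coef(fit_int)
# Lines implied by the interaction model.
women_from_int <- unname(c(b["(Intercept)"], b["X"]))
men_from_int   <- unname(c(b["(Intercept)"] + b["D"], b["X"] + b["X:D"]))

print(rbind(women_from_int, separate_women = unname(coef(fit_women))))
print(rbind(men_from_int,   separate_men   = unname(coef(fit_men))))

# The t-test for delta is available only in the interaction model.
print(summary(fit_int)$coefficients["X:D", ])
```

The fitted lines agree. The interaction model differs in that it pools both groups to estimate a single error variance and provides a direct test of $H_0: \delta = 0$.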

Polytomous independent variables
Create interaction regressors by taking the products of all dummy variable regressors and the quantitative variable.
Example: $Y$ = prestige, $X_1$ = education, $X_2$ = income, $D_1, D_2$ = dummies for type of job

$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \gamma_1 D_1 + \gamma_2 D_2 + \delta_{11} X_1 D_1 + \delta_{12} X_1 D_2 + \delta_{21} X_2 D_1 + \delta_{22} X_2 D_2 + \epsilon$

Interpretation of parameters.

Hypothesis tests
- When testing for main effects and interactions, follow the principle of marginality.
- Use the incremental F-test.
- Examples in R code.

Standardized estimates
- Do not standardize dummy-regressor coefficients. A dummy-regressor coefficient has a clear interpretation, and standardizing it loses that interpretation.
- Also, do not standardize interaction regressors. You can standardize the quantitative independent variable before taking its product with the dummy regressor.
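
A possible R workflow for these two slides is sketched below: standardize the quantitative variables first, let R build the product regressors, and test all interaction terms at once with an incremental F-test. The data are simulated, not the Duncan data, and every name and effect size is an assumption.

```r
set.seed(8)
n         <- 150
type      <- factor(rep(c("bc", "prof", "wc"), each = n / 3))
education <- rnorm(n, mean = 50, sd = 10)
income    <- rnorm(n, mean = 40, sd = 10)
shift     <- unname(c(bc = 0, prof = 15, wc = -10)[as.character(type)])
prestige  <- 5 + 0.5 * education + 0.6 * income + shift + rnorm(n, sd = 5)

# Standardize the quantitative variables BEFORE forming products;
# the dummy regressors and the products themselves are left alone.
education_z <- as.numeric(scale(education))
income_z    <- as.numeric(scale(income))

# Additive model (main effects only).
m_add <- lm(prestige ~ education_z + income_z + type)

# Model with all products of the quantitative variables and the type dummies.
# The main effects stay in the model, following the principle of marginality.
m_int <- lm(prestige ~ (education_z + income_z) * type)
print(names(coef(m_int)))

# Incremental F-test of H0: delta_11 = delta_12 = delta_21 = delta_22 = 0.
f_int <- anova(m_add, m_int)
print(f_int)
```

The test has 4 numerator degrees of freedom, one for each $\delta$ in the model equation.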

Interaction between categorical variables
Example: Does reproduction reduce the lifespan of male fruit flies?
Experiment:
- male flies with 1 pregnant (not receptive) female per day
- male flies with 8 pregnant females per day
- male flies with 1 virgin (receptive) female per day
- male flies with 8 virgin females per day
- male flies without females
Each group contains 25 fruit flies.
Available information:
- Thorax length in mm
- Percentage of time sleeping
- Longevity in days
See plots.
IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results
IAPRI Quantitative Analysis Capacity Building Series Multiple regression analysis & interpreting results How important is R-squared? R-squared Published in Agricultural Economics 0.45 Best article of the
ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2
University of California, Berkeley Prof. Ken Chay Department of Economics Fall Semester, 005 ECON 14 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE # Question 1: a. Below are the scatter plots of hourly wages
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance
17. SIMPLE LINEAR REGRESSION II
17. SIMPLE LINEAR REGRESSION II The Model In linear regression analysis, we assume that the relationship between X and Y is linear. This does not mean, however, that Y can be perfectly predicted from X.
1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
Multinomial and Ordinal Logistic Regression
Multinomial and Ordinal Logistic Regression ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,
Multivariate Logistic Regression
1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation
Elements of statistics (MATH0487-1)
Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium December 10, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -
Solución del Examen Tipo: 1
Solución del Examen Tipo: 1 Universidad Carlos III de Madrid ECONOMETRICS Academic year 2009/10 FINAL EXAM May 17, 2010 DURATION: 2 HOURS 1. Assume that model (III) verifies the assumptions of the classical
Nonlinear Regression Functions. SW Ch 8 1/54/
Nonlinear Regression Functions SW Ch 8 1/54/ The TestScore STR relation looks linear (maybe) SW Ch 8 2/54/ But the TestScore Income relation looks nonlinear... SW Ch 8 3/54/ Nonlinear Regression General
11. Analysis of Case-control Studies Logistic Regression
Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
Introduction to Regression and Data Analysis
Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it
Point Biserial Correlation Tests
Chapter 807 Point Biserial Correlation Tests Introduction The point biserial correlation coefficient (ρ in this chapter) is the product-moment correlation calculated between a continuous random variable
HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION
HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate
Bivariate Statistics Session 2: Measuring Associations Chi-Square Test
Bivariate Statistics Session 2: Measuring Associations Chi-Square Test Features Of The Chi-Square Statistic The chi-square test is non-parametric. That is, it makes no assumptions about the distribution
Introduction to Quantitative Methods
Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................
1. Suppose that a score on a final exam depends upon attendance and unobserved factors that affect exam performance (such as student ability).
Examples of Questions on Regression Analysis: 1. Suppose that a score on a final exam depends upon attendance and unobserved factors that affect exam performance (such as student ability). Then,. When
Module 5: Multiple Regression Analysis
Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College
1/27/2013. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2
PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 Introduce moderated multiple regression Continuous predictor continuous predictor Continuous predictor categorical predictor Understand
Stat 412/512 CASE INFLUENCE STATISTICS. Charlotte Wickham. stat512.cwick.co.nz. Feb 2 2015
Stat 412/512 CASE INFLUENCE STATISTICS Feb 2 2015 Charlotte Wickham stat512.cwick.co.nz Regression in your field See website. You may complete this assignment in pairs. Find a journal article in your field
2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or
Simple and Multiple Regression Analysis Example: Explore the relationships among Month, Adv.$ and Sales $: 1. Prepare a scatter plot of these data. The scatter plots for Adv.$ versus Sales, and Month versus
Panel Data Analysis in Stata
Panel Data Analysis in Stata Anton Parlow Lab session Econ710 UWM Econ Department??/??/2010 or in a S-Bahn in Berlin, you never know.. Our plan Introduction to Panel data Fixed vs. Random effects Testing
Wooldridge, Introductory Econometrics, 4th ed. Chapter 7: Multiple regression analysis with qualitative information: Binary (or dummy) variables
Wooldridge, Introductory Econometrics, 4th ed. Chapter 7: Multiple regression analysis with qualitative information: Binary (or dummy) variables We often consider relationships between observed outcomes
Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade
Statistics Quiz Correlation and Regression -- ANSWERS 1. Temperature and air pollution are known to be correlated. We collect data from two laboratories, in Boston and Montreal. Boston makes their measurements
Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear.
Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear. In the main dialog box, input the dependent variable and several predictors.
We extended the additive model in two variables to the interaction model by adding a third term to the equation.
Quadratic Models We extended the additive model in two variables to the interaction model by adding a third term to the equation. Similarly, we can extend the linear model in one variable to the quadratic
Interaction between quantitative predictors
Interaction between quantitative predictors In a first-order model like the ones we have discussed, the association between E(y) and a predictor x j does not depend on the value of the other predictors
Basic Statistics and Data Analysis for Health Researchers from Foreign Countries
Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma [email protected] The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association
1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ
STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material
DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9
DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,
Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression
Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a
Forecast. Forecast is the linear function with estimated coefficients. Compute with predict command
Forecast Forecast is the linear function with estimated coefficients T T + h = b0 + b1timet + h Compute with predict command Compute residuals Forecast Intervals eˆ t = = y y t+ h t+ h yˆ b t+ h 0 b Time
Contrast Coding in Multiple Regression Analysis: Strengths, Weaknesses, and Utility of Popular Coding Structures
Journal of Data Science 8(2010), 61-73 Contrast Coding in Multiple Regression Analysis: Strengths, Weaknesses, and Utility of Popular Coding Structures Matthew J. Davis Texas A&M University Abstract: The
5. Linear Regression
5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4
Interaction effects and group comparisons Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 20, 2015
Interaction effects and group comparisons Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 20, 2015 Note: This handout assumes you understand factor variables,
Multiple Linear Regression
Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives
Unit 12 Logistic Regression Supplementary Chapter 14 in IPS On CD (Chap 16, 5th ed.)
Unit 12 Logistic Regression Supplementary Chapter 14 in IPS On CD (Chap 16, 5th ed.) Logistic regression generalizes methods for 2-way tables Adds capability studying several predictors, but Limited to
Research Methods & Experimental Design
Research Methods & Experimental Design 16.422 Human Supervisory Control April 2004 Research Methods Qualitative vs. quantitative Understanding the relationship between objectives (research question) and
Simple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
Pearson's Correlation Tests
Chapter 800 Pearson's Correlation Tests Introduction The correlation coefficient, ρ (rho), is a popular statistic for describing the strength of the relationship between two variables. The correlation
Using Heterogeneous Choice Models. To Compare Logit and Probit Coefficients Across Groups
Using Heterogeneous Choice Models To Compare Logit and Probit Coefficients Across Groups Revised March 2009* Richard Williams, [email protected] * A final version of this paper appears in Sociological Methods
Regression Analysis (Spring, 2000)
Regression Analysis (Spring, 2000) By Wonjae Purposes: a. Explaining the relationship between Y and X variables with a model (Explain a variable Y in terms of Xs) b. Estimating and testing the intensity
Directions for using SPSS
Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...
Chapter 7: Simple linear regression Learning Objectives
Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -
An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS
The Islamic University of Gaza Faculty of Commerce Department of Economics and Political Sciences An Introduction to Statistics Course (ECOE 130) Spring Semester 011 Chapter 10- TWO-SAMPLE TESTS Practice
Analyzing Intervention Effects: Multilevel & Other Approaches. Simplest Intervention Design. Better Design: Have Pretest
Analyzing Intervention Effects: Multilevel & Other Approaches Joop Hox Methodology & Statistics, Utrecht Simplest Intervention Design R X Y E Random assignment Experimental + Control group Analysis: t
General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.
General Method: Difference of Means 1. Calculate x 1, x 2, SE 1, SE 2. 2. Combined SE = SE1 2 + SE2 2. ASSUMES INDEPENDENT SAMPLES. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n
5. Multiple regression
5. Multiple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/5 QBUS6840 Predictive Analytics 5. Multiple regression 2/39 Outline Introduction to multiple linear regression Some useful
II. DISTRIBUTIONS distribution normal distribution. standard scores
Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,
MULTIPLE REGRESSION WITH CATEGORICAL DATA
DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 86 MULTIPLE REGRESSION WITH CATEGORICAL DATA I. AGENDA: A. Multiple regression with categorical variables. Coding schemes. Interpreting
IMPACT EVALUATION: INSTRUMENTAL VARIABLE METHOD
REPUBLIC OF SOUTH AFRICA GOVERNMENT-WIDE MONITORING & IMPACT EVALUATION SEMINAR IMPACT EVALUATION: INSTRUMENTAL VARIABLE METHOD SHAHID KHANDKER World Bank June 2006 ORGANIZED BY THE WORLD BANK AFRICA IMPACT
Financial Risk Management Exam Sample Questions/Answers
Financial Risk Management Exam Sample Questions/Answers Prepared by Daniel HERLEMONT 1 2 3 4 5 6 Chapter 3 Fundamentals of Statistics FRM-99, Question 4 Random walk assumes that returns from one time period
Univariate Regression
Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is
Correlational Research
Correlational Research Chapter Fifteen Correlational Research Chapter Fifteen Bring folder of readings The Nature of Correlational Research Correlational Research is also known as Associational Research.
Introduction to Fixed Effects Methods
Introduction to Fixed Effects Methods 1 1.1 The Promise of Fixed Effects for Nonexperimental Research... 1 1.2 The Paired-Comparisons t-test as a Fixed Effects Method... 2 1.3 Costs and Benefits of Fixed
Topic 1 - Introduction to Labour Economics. Professor H.J. Schuetze Economics 370. What is Labour Economics?
Topic 1 - Introduction to Labour Economics Professor H.J. Schuetze Economics 370 What is Labour Economics? Let s begin by looking at what economics is in general Study of interactions between decision
Part 2: Analysis of Relationship Between Two Variables
Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable
Moderator and Mediator Analysis
Moderator and Mediator Analysis Seminar General Statistics Marijtje van Duijn October 8, Overview What is moderation and mediation? What is their relation to statistical concepts? Example(s) October 8,
LOGIT AND PROBIT ANALYSIS
LOGIT AND PROBIT ANALYSIS A.K. Vasisht I.A.S.R.I., Library Avenue, New Delhi 110 012 [email protected] In dummy regression variable models, it is assumed implicitly that the dependent variable Y
2. Linear regression with multiple regressors
2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions
Lecture 15. Endogeneity & Instrumental Variable Estimation
Lecture 15. Endogeneity & Instrumental Variable Estimation Saw that measurement error (on right hand side) means that OLS will be biased (biased toward zero) Potential solution to endogeneity instrumental
Statistics Review PSY379
Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses
August 2012 EXAMINATIONS Solution Part I
August 01 EXAMINATIONS Solution Part I (1) In a random sample of 600 eligible voters, the probability that less than 38% will be in favour of this policy is closest to (B) () In a large random sample,
Mgmt 469. Regression Basics. You have all had some training in statistics and regression analysis. Still, it is useful to review
Mgmt 469 Regression Basics You have all had some training in statistics and regression analysis. Still, it is useful to review some basic stuff. In this note I cover the following material: What is a regression
Wooldridge, Introductory Econometrics, 3d ed. Chapter 12: Serial correlation and heteroskedasticity in time series regressions
Wooldridge, Introductory Econometrics, 3d ed. Chapter 12: Serial correlation and heteroskedasticity in time series regressions What will happen if we violate the assumption that the errors are not serially
Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm
Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm
Testing and Interpreting Interactions in Regression In a Nutshell
Testing and Interpreting Interactions in Regression In a Nutshell The principles given here always apply when interpreting the coefficients in a multiple regression analysis containing interactions. However,
The correlation coefficient
The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative
Econometrics Simple Linear Regression
Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight
Multiple Regression: What Is It?
Multiple Regression Multiple Regression: What Is It? Multiple regression is a collection of techniques in which there are multiple predictors of varying kinds and a single outcome We are interested in
NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck!
Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck! Name: 1. The basic idea behind hypothesis testing: A. is important only if you want to compare two populations. B. depends on
Correlation. What Is Correlation? Perfect Correlation. Perfect Correlation. Greg C Elvers
Correlation Greg C Elvers What Is Correlation? Correlation is a descriptive statistic that tells you if two variables are related to each other E.g. Is your related to how much you study? When two variables
COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.
277 CHAPTER VI COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. This chapter contains a full discussion of customer loyalty comparisons between private and public insurance companies
Multiple Choice Models II
Multiple Choice Models II Laura Magazzini University of Verona [email protected] http://dse.univr.it/magazzini Laura Magazzini (@univr.it) Multiple Choice Models II 1 / 28 Categorical data Categorical
A Primer on Forecasting Business Performance
A Primer on Forecasting Business Performance There are two common approaches to forecasting: qualitative and quantitative. Qualitative forecasting methods are important when historical data is not available.
HYPOTHESIS TESTING: POWER OF THE TEST
HYPOTHESIS TESTING: POWER OF THE TEST The first 6 steps of the 9-step test of hypothesis are called "the test". These steps are not dependent on the observed data values. When planning a research project,
Regression Analysis: A Complete Example
Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty
Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500 6 8480
1) The S & P/TSX Composite Index is based on common stock prices of a group of Canadian stocks. The weekly close level of the TSX for 6 weeks are shown: Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500
Copyright 2007 by Laura Schultz. All rights reserved. Page 1 of 5
Using Your TI-83/84 Calculator: Linear Correlation and Regression Elementary Statistics Dr. Laura Schultz This handout describes how to use your calculator for various linear correlation and regression
