Panel Data Analysis in Stata
1 Panel Data Analysis in Stata
Anton Parlow
Lab session Econ710, UWM Econ Department
??/??/2010, or in an S-Bahn in Berlin, you never know...
2 Our plan
- Introduction to panel data
- Fixed vs. random effects
- Testing for fixed effects
- Testing for random effects
- Fixed or random effects?
- Example: Gravity model
- Exercises
3 Introduction to Panel data
Panel data are cross-sectional data observed over time, e.g. we observe the same households, firms or countries over a couple of years. Panel data are also known as longitudinal data. In general:

$$y_{it} = \alpha + \beta x_{it} + \varepsilon_{it}$$

where $i = 1, \dots, N$ indexes the cross-sectional observations and $t = 1, \dots, T$ the years.

Panel data have the following advantages over pooled data (Baltagi 2004):
(1) they account for heterogeneity across individual units, which is assumed away in pooled data
(2) they deal with time-invariant omitted variables, which remain a problem in pooled data
(3) they are less likely to have problems with autocorrelation and multicollinearity than time series data

Arellano (2003) emphasizes (1) as the advantage of using panel data. There are basically two types of panel models, the fixed effects and the random effects model. They differ in their assumptions about how the heterogeneity is captured and in their estimation techniques (fixed = OLS, random = GLS).
4 Fixed vs. random effects
The fixed effects model assumes that individual heterogeneity is captured by the intercept term. This means every individual gets its own intercept $\alpha_i$ while the slope coefficients are the same for everyone. This also means that the heterogeneity is allowed to be correlated with the regressors on the right-hand side. The fixed effects model is also known as the least squares dummy variable (LSDV) estimator because we assign pretty much a dummy to every individual.

The random effects model assumes, in some sense, that the individual effects are captured by the intercept and a random component $\mu_i$. This random component is not correlated with the regressors on the right-hand side and is part of the error term. The intercept becomes $\alpha + \mu_i$. That is the reason why some textbooks write that both models capture the heterogeneity by the intercept term.

The assumption of the random effects model that individual effects are not correlated with the explanatory variables is a big one! But it allows us to estimate the effect of time-invariant variables, which cancel out in a fixed effects estimation.

Baltagi (and Hsiao) introduce both estimators as a one-way error component model. For both estimators the error term $\varepsilon_{it}$ equals $\mu_i + v_{it}$, where $\mu_i$ captures the individual effect. In the fixed effects model $\mu_i$ is assumed to be a fixed parameter; in the random effects model it is stochastic and follows a distribution. In other words, in the fixed effects model the individual effects may be correlated with the regressors and are not part of the random error, whereas in the random effects model they are part of the random error and must not be correlated with the regressors.
5 Fixed vs. random effects continued
The regression equations come down to the following. For the fixed effects model:

$$y_{it} = \alpha_i + \beta_1 X_{it} + \varepsilon_{it}, \qquad \varepsilon_{it} = \mu_i + v_{it}, \qquad \mu_i = 0$$

(the individual effect is absorbed by the intercept $\alpha_i$), and for the random effects model:

$$y_{it} = \alpha + \beta_1 X_{it} + \varepsilon_{it}, \qquad \varepsilon_{it} = \mu_i + v_{it}$$

For the random effects model you need a GLS estimator, which is a weighted average of the between and within estimators. It tells you where the variation comes from, i.e. from within the individuals (over time) or from between the individuals. The LSDV estimator uses only the within variation. If you assume all the variation comes from between the groups, you have the between estimator, which still uses OLS. Let the random effects estimator be:

$$\hat{\beta}_{GLS} = \frac{W_{xy} + \Phi^2 B_{xy}}{W_{xx} + \Phi^2 B_{xx}}$$

where $W$ and $B$ collect the within and between variation and $\Phi^2$ is the weight on the between variation. The Stata output will tell you where the variation came from.
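A short sketch of the within transformation may help to see why time-invariant variables cancel out in a fixed effects estimation. It uses the one-regressor fixed effects model from this slide with $\varepsilon_{it} = v_{it}$ (since $\mu_i = 0$); the bar notation for individual means over time is introduced here and does not appear on the original slides.

```latex
\begin{align*}
y_{it} &= \alpha_i + \beta_1 X_{it} + v_{it} \\
\bar{y}_{i} &= \alpha_i + \beta_1 \bar{X}_{i} + \bar{v}_{i},
  \qquad \bar{y}_{i} = \frac{1}{T}\sum_{t=1}^{T} y_{it} \\
y_{it} - \bar{y}_{i} &= \beta_1 \left( X_{it} - \bar{X}_{i} \right) + \left( v_{it} - \bar{v}_{i} \right)
\end{align*}
```

Subtracting the individual mean removes $\alpha_i$, and with it any regressor that does not change over time, because $X_{it} - \bar{X}_{i} = 0$ for such a variable. OLS on the demeaned data gives the same slope estimate as LSDV.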
6 Testing for fixed effects
Testing for fixed effects involves an F-test comparing the pooled OLS results with the results from the LSDV estimation. The pooled OLS is the restricted model, and if we reject $H_0$ fixed effects are present. The F-test has the following form:

$$F = \frac{(RSS - URSS)/(N-1)}{URSS/(NT - N - K)} \sim F_{N-1,\,N(T-1)-K}$$

where $RSS$ is the residual sum of squares of the restricted model (pooled OLS) and $URSS$ that of the unrestricted model (LSDV). Don't worry, this is part of the fixed effects output. Still, it is always a nice exercise to do this by hand.
7 Testing for random effects
This involves a Lagrange multiplier test developed by Breusch and Pagan. After a random effects regression, it tests for the presence of random effects in the underlying pooled OLS. The null hypothesis is $H_0: \mathrm{var}(\mu) = 0$. Following Baltagi (2004):

$$\lambda_{LM} = \frac{NT}{2(T-1)} \left[ \frac{\sum_{i=1}^{N} \left( \sum_{t=1}^{T} \varepsilon_{it} \right)^2}{\sum_{i=1}^{N} \sum_{t=1}^{T} \varepsilon_{it}^2} - 1 \right]^2 \sim \chi^2_1$$

where $\varepsilon_{it}$ are the residuals from the pooled OLS. If we can reject the null, random effects are present (remember: with p < 0.05 you can reject the null at the 5% level). Does this mean random effects are more efficient than fixed effects if random effects are present? Not necessarily, but the Hausman specification test helps a bit to decide.
8 Fixed or random effect?
The Hausman specification test is a very general test and can be used whenever two estimators could be used for the same question. In our example we have the fixed and the random effects model. Under the null hypothesis both estimators are consistent, but the random effects estimator is more efficient, e.g. it uses fewer degrees of freedom. Under the alternative only the fixed effects estimator is consistent. If we reject the null we cannot use the random effects model. The problem is that the Hausman test rejects the random effects model very often and does not work very well in small samples (Baum 2006). In the end it comes down to which model you think is more appropriate given your data and your question. In general the Hausman test looks like this (Hosny 2009):

$$H = \left( \hat{\beta}_{FE} - \hat{\beta}_{RE} \right)' \left[ \mathrm{Var}\left( \hat{\beta}_{FE} - \hat{\beta}_{RE} \right) \right]^{-1} \left( \hat{\beta}_{FE} - \hat{\beta}_{RE} \right) \sim \chi^2_{k-1}$$
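The variance in the middle of the Hausman statistic is usually computed as a difference of two covariance matrices. This step is not on the slide; it is a sketch that relies on the standard result that, under $H_0$, an efficient estimator is uncorrelated with its difference from an inefficient one.

```latex
\begin{align*}
\mathrm{Var}\left( \hat{\beta}_{FE} - \hat{\beta}_{RE} \right)
  &= \mathrm{Var}\left( \hat{\beta}_{FE} \right) - \mathrm{Var}\left( \hat{\beta}_{RE} \right) \\
H &= \left( \hat{\beta}_{FE} - \hat{\beta}_{RE} \right)'
     \left[ \mathrm{Var}\left( \hat{\beta}_{FE} \right) - \mathrm{Var}\left( \hat{\beta}_{RE} \right) \right]^{-1}
     \left( \hat{\beta}_{FE} - \hat{\beta}_{RE} \right)
\end{align*}
```

In finite samples the estimated difference need not be positive definite, which is one reason the test can misbehave in small samples.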
9 Example: Gravity model
Imagine someone gives you data on trade and conflicts between countries. Furthermore, he is very generous and also gives you GDP, per capita income and the actual distance between country pairs. You want to know if conflicts affect trade negatively or, in other terms, if trade somehow promotes peace. This is a big question and there is a big debate about it in political science. Someone tells you a gravity model is similar to the one in physics and, using your variables, could look like this:

$$\ln(trade_{ij}) = \beta_0 + \beta_1 \ln(gdp_i) + \beta_2 \ln(gdp_j) + \beta_3\, distance_{ij} + \beta_4 \ln(pci_i) + \beta_5 \ln(pci_j) + \sum_{i=1}^{n} \gamma_i\, at_i + \sum_{j=1}^{n} \gamma_j\, at_j + \varepsilon_{ij}$$

Who knows, maybe you should also add country attributes $at_i$. What would you use? Now imagine there are many papers out there just estimating pooled regression models in various forms. If we observe countries trading with each other over time, do you think these papers miss something by assuming countries stay the same over time? You would likely say yes and want to use a panel estimation.
10 Example: Gravity model continued
Usually you have your cross-sections in annual form, meaning one data file for every year. If you want to use them together you have to merge them into one data set. Before you can merge, every observation needs a unique identifier id, and every annual data set also needs a variable indicating which year it is.

Example: You observe trade between the US and Germany over 2 years. This is the same trade relationship, so give it the id number 1, i.e. id = 1, for both years. Imagine you observe them in 1988 and 1989. The year variable takes the value 1988 in 1988 (!) and of course 1989 in 1989 (!). The identifier variable allows you to follow this trade relationship over time in a panel.

Before you can merge data sets, they have to be sorted individually. Open every data set and sort it, e.g. for your 1989 data set use:

    sort id year

Do the same for every following year. Open the first data set and use the following command to merge another one to it:

    merge id year using "<location and name of the other data set>.dta"

Sort again! Then merge another one to it. Repeat until you have all years merged into one data set (= your panel).
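A hedged alternative for combining the annual files: because each annual file holds different id-year combinations, stacking them with append gives the same panel without sorting first. Note also that since Stata 11 the merge command expects the match type (e.g. merge 1:1). The file names trade1988.dta, trade1989.dta and trade_panel.dta are hypothetical, and each annual file is assumed to contain id and year already.

```stata
* Minimal sketch; all file names below are hypothetical.
use "trade1988.dta", clear
append using "trade1989.dta"     // stack the 1989 observations below 1988
* Repeat the append line for every further year.

* Check that id and year together uniquely identify each observation
isid id year
sort id year
save "trade_panel.dta", replace

* Equivalent with the newer merge syntax (Stata 11 or later):
* use "trade1988.dta", clear
* merge 1:1 id year using "trade1989.dta"
* drop _merge                    // needed before merging the next year
```

isid stops with an error if an id-year pair appears twice, which is a cheap check before running xtset.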
11 Example: Gravity model continued
Tell Stata you want to use the data as a panel:

    xtset id year

Stata reports whether the panel is strongly balanced; if it is, you don't have to worry about the complications of unbalanced panels.

Let us do a simple pooled OLS:

    reg ldyt conflict_ij lrgdp_i lrgdp_j lpci_i lpci_j ldist1 distance2

and compare it to a fixed effects estimation (= LSDV):

    xtreg ldyt conflict_ij lrgdp_i lrgdp_j lpci_i lpci_j ldist1 distance2, fe

Stata uses xt-commands for panel models, and the option fe tells Stata to estimate a fixed effects model.

Look at the attached output! At the bottom you see the F-test for pooled OLS vs. fixed effects. You should be able to reject the null, so we can conclude that fixed effects are present.
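To make the "fixed effects = LSDV" statement concrete, here is a minimal sketch that estimates the dummy variable regression explicitly. It assumes the variables carry underscores in their names (conflict_ij, lrgdp_i and so on) and that the distance variables do not vary within id, which is why they are left out: they would be collinear with the dummies.

```stata
* Sketch only; variable names with underscores are an assumption.
xtset id year

* LSDV by hand: one dummy per trade relationship (factor variables, Stata 11+)
regress ldyt conflict_ij lrgdp_i lrgdp_j lpci_i lpci_j i.id
testparm i.id        // joint F-test that all id dummies are zero

* Same slope coefficients without listing the dummies in the output
areg ldyt conflict_ij lrgdp_i lrgdp_j lpci_i lpci_j, absorb(id)
xtreg ldyt conflict_ij lrgdp_i lrgdp_j lpci_i lpci_j, fe
```

The slope coefficients from the three regressions should coincide, and the testparm result should correspond to the F-test reported at the bottom of the xtreg, fe output. With many trade relationships the explicit dummy regression becomes slow, which is the practical reason to prefer areg or xtreg.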
12 Example: Gravity model continued
Now let us estimate a random effects model:

    xtreg ldyt conflict_ij lrgdp_i lrgdp_j lpci_i lpci_j ldist1 distance2, re

See, only the option changed to re. If you don't specify an option, Stata assumes a random effects model anyway.

Let us test for random effects in the underlying pooled OLS using the random effects regression results. The Breusch-Pagan test has the following command:

    xttest0

You should find random effects! In other words, you can reject the null!
13 Example: Gravity model continued
Finally, let us do a Hausman specification test of fixed against random effects. We have to use the estimates of the fixed and random effects models.

    xtreg ldyt conflict_ij lrgdp_i lrgdp_j lpci_i lpci_j ldist1 distance2, fe
    est store fixed

which saves the results in fixed,

    xtreg ldyt conflict_ij lrgdp_i lrgdp_j lpci_i lpci_j ldist1 distance2, re
    est store random

which saves the results in random, and finally

    hausman fixed random

where the second model is the one you think is more efficient. We should be able to reject the null and conclude that the random effects estimator is not consistent here, so the fixed effects model is the one to use.
14 Exercises
Estimate the above pooled regression. Do the fixed effects model again. Use the RSS from the pooled and the fixed effects regressions to compute the F-test by hand. Hint: use the help for the xtreg command to figure out how to find the RSS from the fixed effects model. (Okay: use display e(rss) after the regression.)
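One possible way to set up the exercise, as a sketch rather than the intended solution. It assumes underscore variable names and leaves out the distance variables so that both regressions contain the same regressors (assuming distance is constant within id, it would be dropped by the fe option anyway). The scalar names are made up for this illustration.

```stata
* Sketch under assumptions: underscore names, identical regressors in both models.
local xvars conflict_ij lrgdp_i lrgdp_j lpci_i lpci_j

* Restricted model: pooled OLS
quietly regress ldyt `xvars'
scalar rss_pooled = e(rss)

* Unrestricted model: fixed effects (LSDV)
quietly xtreg ldyt `xvars', fe
scalar urss = e(rss)
scalar n_id = e(N_g)       // number of individuals N
scalar df_u = e(df_r)      // residual degrees of freedom (NT - N - K if balanced)

* F statistic from the slide on testing for fixed effects
scalar F = ((rss_pooled - urss)/(n_id - 1)) / (urss/df_u)
display "F by hand = " F "   p-value = " Ftail(n_id - 1, df_u, F)
display "F reported by xtreg, fe = " e(F_f)
```

If the pooled regression keeps time-invariant regressors that the fixed effects regression drops, the hand-computed value need not match the F-test at the bottom of the xtreg output.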
Regression step-by-step using Microsoft Excel
Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression
Standard errors of marginal effects in the heteroskedastic probit model
Standard errors of marginal effects in the heteroskedastic probit model Thomas Cornelißen Discussion Paper No. 320 August 2005 ISSN: 0949 9962 Abstract In non-linear regression models, such as the heteroskedastic
Econometric analysis of the Belgian car market
Econometric analysis of the Belgian car market By: Prof. dr. D. Czarnitzki/ Ms. Céline Arts Tim Verheyden Introduction In contrast to typical examples from microeconomics textbooks on homogeneous goods
Final Exam Practice Problem Answers
Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal
Discussion Section 4 ECON 139/239 2010 Summer Term II
Discussion Section 4 ECON 139/239 2010 Summer Term II 1. Let s use the CollegeDistance.csv data again. (a) An education advocacy group argues that, on average, a person s educational attainment would increase
DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9
DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,
Categorical Data Analysis
Richard L. Scheaffer University of Florida The reference material and many examples for this section are based on Chapter 8, Analyzing Association Between Categorical Variables, from Statistical Methods
HYPOTHESIS TESTING: POWER OF THE TEST
HYPOTHESIS TESTING: POWER OF THE TEST The first 6 steps of the 9-step test of hypothesis are called "the test". These steps are not dependent on the observed data values. When planning a research project,
Sample Size Calculation for Longitudinal Studies
Sample Size Calculation for Longitudinal Studies Phil Schumm Department of Health Studies University of Chicago August 23, 2004 (Supported by National Institute on Aging grant P01 AG18911-01A1) Introduction
