Welcome! Lecture 1: Introduction. Course structure. Examination. Econometrics, 7.5hp. Textbook. Chapter 1: The Nature and Scope of Econometrics


Basic Econometrics
Lecture 1: Introduction

Welcome!
This is the first lecture of the course Econometrics, 7.5 hp (STGA02 & NEGB22).
Iris Wang

Textbook
Gujarati, D. N. (2003), Basic Econometrics (fifth edition), McGraw-Hill.

Course structure
- 11 lectures and 4 computer classes; work jointly with your group.
- SPSS
- It's learning

The course will be assessed as follows:
- Lab/Two assignments: 1.5 hp/ECTS (only G)
- Written exam: 6 hp/ECTS
- Examination: max 20 points (G 10p, VG 15p)

Examination
The overall assessment is based on a written exam. The written exam is a closed book exam. Statistical tables and a formula sheet will be supplied. The maximum score on the written exam is 20. The grade on each lab/assignment is Pass, hence the total overall mark on the labs/2 assignments is Pass (the labs are not compulsory, and solutions have to be handed in on time for points to be awarded).

The final marks are set according to the following principles:
- High Pass: a total score (labs/assignments + exam) higher than or equal to 15.
- Pass: a total score (exam) higher than or equal to 10 and lower than 15.

Chapter 1: The Nature and Scope of Econometrics

What is Econometrics?
- The measurement of economic relationships.
- The application of mathematical statistics to economic data to lend empirical support to models constructed by mathematical economics and to obtain numerical estimates (Samuelson et al., Econometrica, 1954).
- Aims of econometric modelling: explanation, policy evaluation, forecasting.

Econometrics is used for:
- Estimating economic relationships
- Testing economic theories
- Evaluating & implementing policy
- Forecasting
For example: What is the effect of education on wages? How do training programs impact productivity?

Types of data
All empirical analysis requires data. We will now discuss a few different structures of data that you may come across if you do empirical analysis in economics:
1. Cross-sectional data
2. Time series data
3. Pooled cross sections
4. Panel (or longitudinal) data

1. Cross-sectional data
A cross-sectional dataset consists of a sample of individuals, households or firms, taken at a given point in time. Cross-sectional datasets are often obtained by random sampling from the underlying population. If the sample has not been drawn randomly, our methods may have to be adjusted. For now, we assume random sampling unless I say otherwise.

Cross-sectional data (cont'd)
The dataset stored in the file WAGE1.SAV is a cross-section dataset.
Illustration (annotations on the data table):
- Observation number: in cross-section datasets, the ordering of observations doesn't matter.
- Binary/dummy variables (yes = 1, no = 0).

2. Time series data
A time series dataset consists of observations on one or several variables over time. Unlike the arrangement of cross-sectional data, the chronological ordering of observations in a time series is important. A key feature of time series data that makes them more difficult to analyze than cross-sectional data is that observations are unlikely to be independent over time. Special methodological problems arise when we analyze time series data.

A time series dataset
[table not reproduced]

3. Pooled cross-sections
Some datasets have both cross-sectional and time series features. Example: household surveys from 1985 and 1990 which are combined to yield one dataset containing observations from both years. This can be a useful basis for analysing, for example, a change of policy. We often include time (year) as an additional explanatory variable in regressions based on pooled cross-sections.

3. Pooled cross-sections (cont'd)
Table 1: Two Years of Housing Prices
[table not reproduced]

4. Panel (longitudinal) data
A panel dataset consists of a time series for each cross-sectional member in the dataset.
Key feature: the same cross-sectional units are followed over time. This is the big difference compared to a pooled cross-section.
Several advantages, e.g.
a. Enables the researcher to control for certain unobserved characteristics of individuals (firms, cities, ...) that might otherwise lead to problems.
b. Can analyze dynamics.

Brief summary
Key ingredient: data, typically in the form of large samples. Data = information.
1. Cross-sectional data are data on one or more variables collected at one point in time.
2. Time series data are collected over a period of time.
3. Pooled data are a combination of time series and cross-section data. Panel (longitudinal) data are a special type of pooled data, in which the same cross-sectional unit, say, a family or firm, is surveyed over time.
Econometrics = a method for processing data and learning about general patterns in the population of interest. For example, what is the effect of education on labor market outcomes in the US?

Example
Data: wage1.sav
These data were originally obtained from the 1976 Current Population Survey in the US. You can obtain the data from the course website. I very much encourage you to do so.
First let's look at summary statistics for this sample. To this end we will use a program called SPSS.

SPSS version 18 looks like this:
[screenshot not reproduced]
I can generate summary statistics with Analyze > Descriptive Statistics > Descriptives.
Here is the summary statistics table from the SPSS output window:
[table not reproduced]

Specifying the mathematical model of labor market outcomes in the US
To see the relationship between education and wage, the first thing we should do is plot the data for these two variables in a scatter diagram, or scattergram.

Plot scattergram via SPSS
Here is a graph showing the relationship between wages and education in the data (each point in the graph shows wage and education for a particular individual):
[figure not reproduced]

What do we learn from the graph?
- Broad pattern: high levels of education are associated with high wages.
- But there are lots of individual exceptions! Wages vary a lot across individuals with the same level of education.
- The latter finding is particularly pronounced among people with high levels of education.
(Try to generate this graph yourself. Hint: as I showed on the previous slide.)

Capturing the broad pattern in the data by means of regression
The previous graph shows that high wages go hand in hand with high levels of education, at least on average. Regression is a statistical technique enabling us (under certain assumptions that we will study carefully later on) to quantify by how much the average (or expected) wage increases as education increases by (say) 1 year.

Education and Earnings
Suppose you want to evaluate the effects of years of education on worker earnings. Do you need a theory?
A plausible economic model: wage = f(educ, exper), where
- wage = hourly wage
- educ = years of formal education
- exper = years of work experience

An econometric model
The simplest wage model you will ever see:
$\text{wage}_i = \beta_0 + \beta_1\,\text{education}_i + \varepsilon_i$
where wage = average hourly earnings and education = years of education.
- WAGE is the dependent variable in the model. In econometrics, the dependent variable is almost always to the left of the equal sign ("on the left-hand side"). The dependent variable is the outcome of interest, to be explained by other variables.
- Education is the explanatory (or independent) variable in the model. Explanatory variables are typically written to the right of the equal sign ("on the right-hand side").
- We observe the dependent and the explanatory variables, i.e. we have data on these.
- Econometric jargon: "we regress wage on education" means that wage is the dependent variable.

The components of this mathematical model of wages:
- Variables: wage and education. These are observed in the data. We know the values of wage and education for each individual.
- Parameters: $\beta_0$ and $\beta_1$. These are unobserved; their values are unknown. They are constant across individuals (in this model).
- The error term: $\varepsilon_i$. This is unobserved; we don't know its values.

OLS Estimation by SPSS
Under a set of assumptions that we will study carefully throughout this course, we can estimate the unknown model parameters $\beta_0$ and $\beta_1$ using the wage1.sav dataset.
The two most important assumptions:
- the expected value of the error term is zero;
- the error term is uncorrelated with education.
Why we need these assumptions will become clearer later on.
I obtain OLS results in SPSS using Analyze > Regression > Linear.

Regression results: Simple wage model
[SPSS coefficient table not reproduced; it reports the OLS estimates of the parameters $\beta_0$ and $\beta_1$.]

- The constants $\beta_0$ and $\beta_1$ are parameters (or coefficients) of the econometric model.
- $\beta_0$ is often referred to as the intercept. Unless you have a compelling reason not to, you should always include an intercept in your models.
- $\beta_1$ is a constant describing the direction and strength of the relationship between wage and the explanatory variable in the model. It is sometimes referred to as the slope coefficient.
- Finally, $\varepsilon$ is an error term (or a disturbance term) containing unobserved factors. Dealing with the error term $\varepsilon$ is a very important component of econometric analysis. (By the end of this course, you will understand what this statement means in practice.)
The parameters of the model are typically unknown. Under certain assumptions, we can estimate these parameters using econometric methods.
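The estimation step that SPSS performs for the simple wage model can be mirrored by hand. The following is a minimal sketch, assuming a small invented dataset (the `educ` and `wage` lists are made up for illustration and are not the wage1.sav observations) and using the standard closed-form OLS formulas for a regression with one explanatory variable. The helper name `ols_simple` is hypothetical.

```python
# Toy data invented for illustration; these are NOT the wage1.sav observations.
educ = [8, 10, 12, 12, 14, 16, 16, 18]             # years of education
wage = [3.0, 4.5, 5.0, 6.5, 6.0, 9.0, 8.0, 11.0]   # hourly wage


def ols_simple(x, y):
    """Return (intercept, slope) of the OLS line for y = b0 + b1*x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sample covariation of x and y divided by sample variation in x.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    # Intercept: chosen so that the line passes through the point of means.
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0_hat, b1_hat = ols_simple(educ, wage)
print(f"intercept estimate = {b0_hat:.3f}, slope estimate = {b1_hat:.3f}")
```

The slope estimate is read as the change in the predicted hourly wage associated with one more year of education in this toy sample.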

Steps in empirical economic analysis
- Econometric methods are used in virtually every branch of applied economics.
- An empirical analysis uses data to test a theory or to estimate a relationship.
- The first step in any empirical analysis is formulating the question of interest: testing a theory, evaluating a policy, etc.

Econometrics & Economics
- Econometrics should always be linked to economic reasoning.
- Exactly how formal and tight this link is varies across studies.
- Researchers setting out to test an economic theory usually begin by developing a formal economic model, by which I mean a set of mathematical equations that describe various relationships.

Causality
Once an econometric model has been specified, various hypotheses of interest can be stated in terms of the unknown parameters. For example, we may hypothesize that education has no effect on wage; in the context of our model, this would be equivalent to hypothesizing that $\beta_1 = 0$. Right?
A common goal for applied economists is to estimate the causal effect of one variable on some outcome of interest.
Ceteris paribus: other relevant factors being equal, what is the effect of
- a price increase on consumer demand?
- training on worker productivity?

Causality (cont'd)
If (a) we succeed in holding all other relevant determinants of earnings (say, years of working experience) constant, and (b) find a link between years of education and earnings, then we can conclude that education has a causal effect on earnings.

Causality (cont'd)
The ideal setting is experimental: in a laboratory, administer the treatment to half the sample and use the other half as a control group.
Much of the research economists do uses nonexperimental data.
A key challenge in econometrics is to condition on enough other factors so that a case for causality can be made.

Causality: Example
Suppose we want to estimate the causal effect of education on wages.
We've already seen that the two variables are positively correlated in the WAGE1 dataset:
[figure not reproduced]

Causality: Example
This of course doesn't imply that education causes wages.
Wages are determined by many other factors besides education, for example innate ability:
- High ability => high wages
- High ability => high education (e.g. intelligent individuals choose high education)
Perhaps the correlation between education and wages visible in the graph is driven by ability rather than education?
To credibly estimate the causal effect of education, we must find a way of determining the link between education and wages holding innate ability constant!
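The ability story can be illustrated with a small simulation. This is a sketch under invented assumptions: all numbers (a true education effect of 0.5, the way ability feeds into education and wages) are hypothetical, and ability is treated as observed, which it usually is not in real data. Holding ability constant is approximated here by first removing the part of education and of wages that ability explains and then regressing the leftovers on each other; this partialling-out step is a technique swapped in for illustration and is not a method taken from the slides.

```python
import random

random.seed(0)
n = 5000
true_effect = 0.5  # hypothetical causal effect of one year of education on wage

# Ability raises both education and wages (all coefficients are invented).
ability = [random.gauss(0, 1) for _ in range(n)]
educ = [12 + 2 * a + random.gauss(0, 1) for a in ability]
wage = [1 + true_effect * e + 1.5 * a + random.gauss(0, 1)
        for e, a in zip(educ, ability)]


def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def residuals(y, x):
    """Residuals from regressing y on x (with an intercept)."""
    b1 = slope(x, y)
    b0 = sum(y) / len(y) - b1 * sum(x) / len(x)
    return [yi - b0 - b1 * xi for xi, yi in zip(x, y)]


# Naive regression of wage on education: picks up the effect of ability too.
naive = slope(educ, wage)

# Same relationship after removing what ability explains in both variables.
controlled = slope(residuals(educ, ability), residuals(wage, ability))

print(f"true effect          = {true_effect}")
print(f"naive slope          = {naive:.3f}")
print(f"slope, ability fixed = {controlled:.3f}")
```

In this setup the naive slope overstates the effect of education, while the slope computed with ability held fixed is close to the value used to generate the data.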

Two-Variable Regression Model (Chapters 2&3)
Chapters 2&3: The Simple Linear Regression Model
Iris Wang

Suppose we want to explain y in terms of x. Three issues:
1. Since there's never an exact relationship between two variables: how do we allow for other factors affecting y?
2. What is the functional form?
3. How can we be sure we are capturing a ceteris paribus (causal) relationship between y and x (if that is the goal)?

The simple linear regression model
Assume that, in the population, the outcome variable y can be modeled as a function of x as follows:
$y = \beta_0 + \beta_1 x + u$
Simple regression: the functional relationship between y and x is linear.
- u: error term; disturbance term; residual; noise
- $\beta_0$, $\beta_1$: parameters, coefficients, constants
- $\beta_1$ is the slope parameter, a parameter of primary interest in applied economics.
- The intercept parameter $\beta_0$ (sometimes called the constant term) is rarely central to an analysis.

Examples: The regression of Y on X
[equations not recovered]
How do we interpret $\beta_1$ in these equations?
How can we hope to learn about the ceteris paribus effect of x (education) on y (wage), holding other factors fixed, when we are ignoring all those other factors?

Population Regression Line (PRL)

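Model: $y = \beta_0 + \beta_1 x + u$
[Figure: distribution of Y given X = x; horizontal axis: years of education]
Assumption: $E(u \mid x) = 0$
Implication (conditional mean): $E(y \mid x) = \beta_0 + \beta_1 x$
$E(y \mid x)$ is the Population Regression Function (PRF). It is a linear function of x.

The Sample Regression Function (SRF)
Example: p.43
Sample and population regression lines: Figure 2.5, p.45

Chapter 3: OLS Estimation
Why is this estimator called the Ordinary Least Squares (OLS) estimator?
To see why, first define a fitted value for y when $x = x_i$ as
$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$
Next, define the residual for observation i as
$\hat{u}_i = y_i - \hat{y}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i$
Note that there are n such residuals.
The OLS estimates minimize the sum of squared residuals:
$\sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2$

Least squares
The method of Ordinary Least Squares chooses $\hat{\beta}_0$ and $\hat{\beta}_1$ so that $\sum_{i=1}^{n} \hat{u}_i^2$ is as small as possible. See p.58, (3.1.6) & (3.1.7); for more details see Appendix 3A.

Some related concepts
The OLS regression line (or sample regression function, SRF):
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$
Interpretation: [formulas not recovered]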
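The least-squares idea in Chapter 3 can be checked numerically: the OLS line should give a smaller sum of squared residuals than any other line. A minimal sketch, assuming invented toy data and the standard closed-form estimates for simple regression; the comparison lines are arbitrary small perturbations chosen for illustration.

```python
# Toy data invented for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form OLS estimates for the simple regression of y on x.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1_hat = s_xy / s_xx
b0_hat = y_bar - b1_hat * x_bar


def ssr(b0, b1):
    """Sum of squared residuals for the candidate line y = b0 + b1*x."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


residuals = [yi - b0_hat - b1_hat * xi for xi, yi in zip(x, y)]
ssr_ols = ssr(b0_hat, b1_hat)

# Lines that differ slightly from the OLS line in intercept and/or slope.
candidates = [
    (b0_hat + 0.1, b1_hat),
    (b0_hat - 0.1, b1_hat),
    (b0_hat, b1_hat + 0.05),
    (b0_hat, b1_hat - 0.05),
    (b0_hat + 0.1, b1_hat - 0.05),
]
for b0, b1 in candidates:
    print(f"b0={b0:.3f}, b1={b1:.3f}: SSR={ssr(b0, b1):.4f} (OLS: {ssr_ols:.4f})")

# The first-order conditions of the minimization imply that the residuals
# sum to zero (up to floating-point rounding).
print(f"sum of OLS residuals = {sum(residuals):.2e}")
```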

The statistical properties of OLS estimators
Three important properties:
1. The OLS estimators are easily computed.
2. They are point estimators (in Chap. 5 we will introduce interval estimators).
3. The OLS estimates are obtained from the sample data, and the sample regression line has the following properties:
   I. The point of sample means $(\bar{x}, \bar{y})$ is always on the OLS regression line (p. 59).
   II. to V. [formulas not recovered]

The assumptions underlying the method of least squares
The Classical Linear Regression Model (CLRM) makes 7/10 assumptions, p.62/p.315.

The meaning of linear regression
Final point: linear regression means linear in parameters. This would thus be a linear regression model:
y = b0 + b1*sqrt(x) + u
while this would not:
y = 1/(b0 + b1*x) + u
Nonlinear relationships between y and x can often be allowed for within the linear regression framework (Exercises).

Assumptions underlying OLS
Assumption 1: The regression model is linear in parameters: $y = \beta_0 + \beta_1 x + u$.
Assumption 3: Zero conditional mean. The error $u_i$ has an expected value of zero given $x_i$: $E(u_i \mid x_i) = 0$. In deciding when the simple linear regression is going to produce unbiased estimators, Assumption 3 is crucial.
Assumption 4: Homoskedasticity. The error u has the same variance given any value of the explanatory variable: $\mathrm{Var}(u \mid x) = \sigma^2$.
Technical point: combined with $E(u \mid x) = 0$, this implies $E(u^2 \mid x) = \sigma^2$.
If the variance of u depends on x, the error term is said to exhibit heteroskedasticity.
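The point that linear regression means linear in parameters can be made concrete with the first example model, y = b0 + b1*sqrt(x) + u. A minimal sketch, assuming hypothetical data generated without noise from y = 1 + 2*sqrt(x): once sqrt(x) is treated as the regressor, the ordinary simple-regression formulas apply unchanged. The helper name `ols_simple` and the variable `z` are illustrative.

```python
import math

x = [1.0, 4.0, 9.0, 16.0, 25.0]
# Hypothetical data generated (without noise) from y = 1 + 2*sqrt(x).
y = [1.0 + 2.0 * math.sqrt(xi) for xi in x]

# The model is nonlinear in x but linear in b0 and b1,
# so we regress y on the transformed regressor z = sqrt(x).
z = [math.sqrt(xi) for xi in x]


def ols_simple(regressor, outcome):
    """Return (intercept, slope) from the closed-form simple OLS formulas."""
    n = len(regressor)
    r_bar = sum(regressor) / n
    o_bar = sum(outcome) / n
    s_ro = sum((r - r_bar) * (o - o_bar) for r, o in zip(regressor, outcome))
    s_rr = sum((r - r_bar) ** 2 for r in regressor)
    slope = s_ro / s_rr
    return o_bar - slope * r_bar, slope


b0_hat, b1_hat = ols_simple(z, y)
print(f"b0 estimate = {b0_hat:.3f}, b1 estimate = {b1_hat:.3f}")
```

The second example, y = 1/(b0 + b1*x) + u, cannot be handled this way, because no transformation of x alone makes the right-hand side linear in b0 and b1.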

[Figures: homoskedasticity and heteroskedasticity]

Assumption 7: There is sample variation in the explanatory variable (the $x_i$ are not all the same).

3.3 Variances of the OLS estimates
We need to keep in mind that $\hat{\beta}_0$ and $\hat{\beta}_1$ are not the population parameters $\beta_0$, $\beta_1$.
The estimates are based on a random sample. Different random samples would give rise to different estimates $\hat{\beta}_0$ and $\hat{\beta}_1$.
Therefore, $\hat{\beta}_0$ and $\hat{\beta}_1$ follow distributions. We are interested in the means and the variances of these.

Variance of the OLS estimators
In addition to knowing that the sampling distribution of the OLS estimator $\hat{\beta}_1$ is centered on $\beta_1$, it is important to know how far we can expect $\hat{\beta}_1$ to be away from $\beta_1$ on average.
Of course, we want to use an estimator that is expected to be as close to $\beta_1$ as possible, on average. So we need to measure the spread (dispersion) of $\hat{\beta}_1$. Key measure: the variance.

Sampling variance of the OLS estimator
Under the assumptions of OLS,
$\mathrm{Var}(\hat{\beta}_1) = \dfrac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$
This shows that a low error variance $\sigma^2$ and a high degree of variability in the explanatory variable x both contribute to a low variance of $\hat{\beta}_1$.
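The idea that different random samples give different estimates, and that the spread of the slope estimator equals the error variance divided by the variation in x, can be checked by simulation. This is a sketch under invented assumptions: the population values `beta0`, `beta1` and `sigma` are hypothetical, x is held fixed across samples, and errors are drawn as independent normal variables with constant variance.

```python
import random

random.seed(1)
beta0, beta1, sigma = 1.0, 0.5, 2.0    # hypothetical population values
x = [float(v) for v in range(1, 21)]   # regressor held fixed across samples
n = len(x)
x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)


def slope_estimate(y):
    """OLS slope from regressing y on the fixed regressor x."""
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx


# Draw many samples from the same population and re-estimate the slope each time.
estimates = []
for _ in range(2000):
    y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]
    estimates.append(slope_estimate(y))

mean_est = sum(estimates) / len(estimates)
var_est = sum((b - mean_est) ** 2 for b in estimates) / (len(estimates) - 1)
theory_var = sigma ** 2 / s_xx

print(f"mean of slope estimates     = {mean_est:.4f} (population slope {beta1})")
print(f"variance of slope estimates = {var_est:.5f}")
print(f"sigma^2 / sum (x - x_bar)^2 = {theory_var:.5f}")
```

Doubling `sigma` or shrinking the range of `x` in this sketch makes the simulated variance larger, in line with the formula.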

Estimating the variance
Note that $\sigma^2$ is an unknown parameter. If we want to estimate the variance of $\hat{\beta}_1$, we need to estimate $\sigma^2$. This is done based on the OLS residuals.
Once we have estimated the variance, we can construct confidence intervals and derive test statistics etc. that will enable us to do inference. We come back to this in a later chapter.

The Gauss-Markov Theorem
Theorem: Under the classical assumptions, OLS is the Best Linear Unbiased Estimator (BLUE) of the population parameters (p.72).
Best = smallest variance.
It's reassuring to know that, under these assumptions, you cannot find a better linear unbiased estimator than OLS.
If one or several of these assumptions fail, OLS is no longer BLUE.

3.5 Goodness of Fit
Interpretation: breaking y into two parts.
SRF = systematic part of y + unsystematic part of y

Towards a goodness of fit measure
For each i, write
$y_i = \hat{y}_i + \hat{u}_i$
Thus, we can view OLS as decomposing $y_i$ into two parts: a fitted value $\hat{y}_i$ and a residual $\hat{u}_i$, where the fitted values and residuals are uncorrelated in the sample.

TSS, ESS and RSS
TSS = Total Sum of Squares. This is simply the total sample variation in $y_i$:
$TSS = \sum_{i=1}^{n} (y_i - \bar{y})^2$
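Estimating the error variance from the OLS residuals can be sketched as follows. The slide does not show the formula; the sketch assumes the standard textbook estimator for simple regression, the residual sum of squares divided by n - 2, and then plugs it into the variance expression for the slope estimator. The data are invented and are not the wage1.sav observations.

```python
import math

# Toy data invented for illustration; NOT the wage1.sav observations.
educ = [8, 10, 12, 12, 14, 16, 16, 18]
wage = [3.0, 4.5, 5.0, 6.5, 6.0, 9.0, 8.0, 11.0]
n = len(educ)
x_bar = sum(educ) / n
y_bar = sum(wage) / n

# OLS estimates from the closed-form simple-regression formulas.
s_xx = sum((xi - x_bar) ** 2 for xi in educ)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(educ, wage))
b1_hat = s_xy / s_xx
b0_hat = y_bar - b1_hat * x_bar

# OLS residuals and their sum of squares.
residuals = [yi - b0_hat - b1_hat * xi for xi, yi in zip(educ, wage)]
rss = sum(u ** 2 for u in residuals)

# Assumed standard estimator of the error variance: RSS / (n - 2),
# where 2 is the number of estimated parameters (intercept and slope).
sigma2_hat = rss / (n - 2)

# Estimated variance and standard error of the slope estimator.
var_b1_hat = sigma2_hat / s_xx
se_b1 = math.sqrt(var_b1_hat)

print(f"sigma^2 estimate        = {sigma2_hat:.4f}")
print(f"standard error of slope = {se_b1:.4f}")
```

The standard error is the ingredient needed for the confidence intervals and test statistics mentioned in the text.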

TSS, ESS and RSS
ESS = Explained Sum of Squares. This is the sample variation in $\hat{y}_i$:
$ESS = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$
RSS = Residual Sum of Squares. This is the sample variation in the OLS residual:
$RSS = \sum_{i=1}^{n} \hat{u}_i^2$
Recall the OLS decomposition introduced earlier: $y_i = \hat{y}_i + \hat{u}_i$. Using the earlier result that the fitted values and residuals are uncorrelated in the sample, we can show:
$TSS = ESS + RSS$
These quantities are primarily used to calculate a measure of the goodness-of-fit of our model:
$R^2 = \dfrac{ESS}{TSS} = 1 - \dfrac{RSS}{TSS}$
This is the R-squared of the regression, sometimes called the coefficient of determination.

How do we interpret the R-squared?
Answer: as the fraction of the sample variation in y that is explained by x.
Note: R-sq is always between 0 and 1.
Extreme cases: R-sq = 0; R-sq = 1. In words, what do these cases mean?

Additional points on R-sq
- Low R-squareds in regressions are not uncommon. In particular, if we are working with noisy cross-sectional micro data, we often get low R-sq values.
- In time series econometrics, we often get high R-squareds. Even simple models like y(t) = a + b*y(t-1) + u tend to give high R-sq.
- For forecasting models, having a good fit (high R-sq) is of course very central. But maximizing the R-sq is not a goal in most empirical studies.
- We therefore do not need to put too much weight on the size of the R-squared when evaluating regression equations.

TSS, ESS, RSS in the SPSS regression output
(Digits lost in the transcript are shown as "…".)

ANOVA (b)
Model 1      Sum of Squares    df   Mean Square   F       Sig.
Regression   1175,…   (ESS)    1    …             …,879   ,000 (a)
Residual     5976,…   (RSS)    …    …,428
Total        7152,…   (TSS)    …
a. Predictors: (Constant), educ
b. Dependent Variable: wage

R-squared in the SPSS regression output

Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       ,405 (a)   ,164       ,163                3,38054
a. Predictors: (Constant), educ

Slide annotation on the output: RSS/(n-1)
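The sums of squares and the R-squared can be computed directly from fitted values and residuals. A minimal sketch, assuming invented toy data; it checks that the total variation splits into an explained and a residual part. As a rough cross-check against the SPSS output, the last lines use only the integer parts of the regression and total sums of squares that survive in the transcript (1175 and 7152), so that ratio is an approximation.

```python
# Toy data invented for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.8, 4.1, 4.9, 8.2, 8.8, 12.5]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS estimates from the closed-form simple-regression formulas.
b1_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
b0_hat = y_bar - b1_hat * x_bar

fitted = [b0_hat + b1_hat * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

tss = sum((yi - y_bar) ** 2 for yi in y)       # total variation in y
ess = sum((fi - y_bar) ** 2 for fi in fitted)  # variation in the fitted values
rss = sum(u ** 2 for u in residuals)           # variation in the residuals

r_squared = ess / tss
print(f"TSS = {tss:.4f}, ESS + RSS = {ess + rss:.4f}")
print(f"R-squared = {r_squared:.4f} (also 1 - RSS/TSS = {1 - rss / tss:.4f})")

# Approximate check on the SPSS output, using integer parts only
# because the decimals are lost in the transcript.
spss_r_squared = 1175 / 7152
print(f"SPSS sums of squares imply R-squared of about {spss_r_squared:.3f}")
```

The approximate ratio agrees with the R Square entry of ,164 in the Model Summary table.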

Units of measurement
General point: changing the units of measurement does not change the interpretation of the results (but of course it may change the estimates).
It is very important to be clear on the units of measurement, otherwise we can't interpret the results!
Does the R-squared change as a result of changing the units of measurement?

Keep in mind
Unbiasedness is a feature of the sampling distributions of the OLS estimators. It says nothing about the estimate that we get for a given sample.
If the sample we obtain is somehow typical, our estimates should be near the population values. But we will never know for sure.

How can unbiasedness fail?
Unbiasedness fails if any of the assumptions fails (linearity, random sampling, variation in x, zero conditional mean).
In practice, Assumption 3 is the most important one. The possibility that x is correlated with u is very often a concern in simple regression analysis with nonexperimental data (e.g. recall the discussion about wage and education; omitted variables correlated with education).

Thank you!
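The units-of-measurement point from this lecture can be tried out on a small example. A minimal sketch, assuming invented toy data in which the wage is recorded first in dollars and then in cents; the helper `fit` is hypothetical and returns the intercept, the slope and the R-squared from the closed-form simple OLS formulas.

```python
# Toy data invented for illustration; NOT the wage1.sav observations.
educ = [8, 10, 12, 12, 14, 16, 16, 18]
wage_dollars = [3.0, 4.5, 5.0, 6.5, 6.0, 9.0, 8.0, 11.0]
wage_cents = [100 * w for w in wage_dollars]   # same wages, different unit


def fit(x, y):
    """Return (intercept, slope, r_squared) for the OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    tss = sum((yi - y_bar) ** 2 for yi in y)
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, 1 - rss / tss


b0_d, b1_d, r2_d = fit(educ, wage_dollars)
b0_c, b1_c, r2_c = fit(educ, wage_cents)

print(f"dollars: intercept = {b0_d:.3f}, slope = {b1_d:.3f}, R-sq = {r2_d:.4f}")
print(f"cents:   intercept = {b0_c:.3f}, slope = {b1_c:.3f}, R-sq = {r2_c:.4f}")
```

Measuring the wage in cents multiplies both estimates by 100, so one more year of education is associated with the same wage change expressed in a different unit, while the R-squared is the same in both regressions.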


LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE MAT 119 STATISTICS AND ELEMENTARY ALGEBRA 5 Lecture Hours, 2 Lab Hours, 3 Credits Pre-

More information

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could

More information

COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared. jn2@ecs.soton.ac.uk

COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared. jn2@ecs.soton.ac.uk COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared jn2@ecs.soton.ac.uk Relationships between variables So far we have looked at ways of characterizing the distribution

More information

Lecture 15. Endogeneity & Instrumental Variable Estimation

Lecture 15. Endogeneity & Instrumental Variable Estimation Lecture 15. Endogeneity & Instrumental Variable Estimation Saw that measurement error (on right hand side) means that OLS will be biased (biased toward zero) Potential solution to endogeneity instrumental

More information

ANALYSIS OF TREND CHAPTER 5

ANALYSIS OF TREND CHAPTER 5 ANALYSIS OF TREND CHAPTER 5 ERSH 8310 Lecture 7 September 13, 2007 Today s Class Analysis of trends Using contrasts to do something a bit more practical. Linear trends. Quadratic trends. Trends in SPSS.

More information

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

More information

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics. Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

More information

A Review of Cross Sectional Regression for Financial Data You should already know this material from previous study

A Review of Cross Sectional Regression for Financial Data You should already know this material from previous study A Review of Cross Sectional Regression for Financial Data You should already know this material from previous study But I will offer a review, with a focus on issues which arise in finance 1 TYPES OF FINANCIAL

More information

4. Simple regression. QBUS6840 Predictive Analytics. https://www.otexts.org/fpp/4

4. Simple regression. QBUS6840 Predictive Analytics. https://www.otexts.org/fpp/4 4. Simple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/4 Outline The simple linear model Least squares estimation Forecasting with regression Non-linear functional forms Regression

More information

3.2. Solving quadratic equations. Introduction. Prerequisites. Learning Outcomes. Learning Style

3.2. Solving quadratic equations. Introduction. Prerequisites. Learning Outcomes. Learning Style Solving quadratic equations 3.2 Introduction A quadratic equation is one which can be written in the form ax 2 + bx + c = 0 where a, b and c are numbers and x is the unknown whose value(s) we wish to find.

More information

SYSTEMS OF REGRESSION EQUATIONS

SYSTEMS OF REGRESSION EQUATIONS SYSTEMS OF REGRESSION EQUATIONS 1. MULTIPLE EQUATIONS y nt = x nt n + u nt, n = 1,...,N, t = 1,...,T, x nt is 1 k, and n is k 1. This is a version of the standard regression model where the observations

More information

Regression Analysis: A Complete Example

Regression Analysis: A Complete Example Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

More information

A Primer on Forecasting Business Performance

A Primer on Forecasting Business Performance A Primer on Forecasting Business Performance There are two common approaches to forecasting: qualitative and quantitative. Qualitative forecasting methods are important when historical data is not available.

More information

Empirical Methods in Applied Economics

Empirical Methods in Applied Economics Empirical Methods in Applied Economics Jörn-Ste en Pischke LSE October 2005 1 Observational Studies and Regression 1.1 Conditional Randomization Again When we discussed experiments, we discussed already

More information

Point and Interval Estimates

Point and Interval Estimates Point and Interval Estimates Suppose we want to estimate a parameter, such as p or µ, based on a finite sample of data. There are two main methods: 1. Point estimate: Summarize the sample by a single number

More information

Review of Fundamental Mathematics

Review of Fundamental Mathematics Review of Fundamental Mathematics As explained in the Preface and in Chapter 1 of your textbook, managerial economics applies microeconomic theory to business decision making. The decision-making tools

More information

Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.

Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS. SPSS Regressions Social Science Research Lab American University, Washington, D.C. Web. www.american.edu/provost/ctrl/pclabs.cfm Tel. x3862 Email. SSRL@American.edu Course Objective This course is designed

More information

A Statistical Analysis of the Prices of. Personalised Number Plates in Britain

A Statistical Analysis of the Prices of. Personalised Number Plates in Britain Number Plate Pricing Equation Regression 1 A Statistical Analysis of the Prices of Personalised Number Plates in Britain Mr Matthew Corder and Professor Andrew Oswald Warwick University Department of Economics.

More information

Research Methods & Experimental Design

Research Methods & Experimental Design Research Methods & Experimental Design 16.422 Human Supervisory Control April 2004 Research Methods Qualitative vs. quantitative Understanding the relationship between objectives (research question) and

More information

Testing for Granger causality between stock prices and economic growth

Testing for Granger causality between stock prices and economic growth MPRA Munich Personal RePEc Archive Testing for Granger causality between stock prices and economic growth Pasquale Foresti 2006 Online at http://mpra.ub.uni-muenchen.de/2962/ MPRA Paper No. 2962, posted

More information

UNDERSTANDING ANALYSIS OF COVARIANCE (ANCOVA)

UNDERSTANDING ANALYSIS OF COVARIANCE (ANCOVA) UNDERSTANDING ANALYSIS OF COVARIANCE () In general, research is conducted for the purpose of explaining the effects of the independent variable on the dependent variable, and the purpose of research design

More information

Analyzing Intervention Effects: Multilevel & Other Approaches. Simplest Intervention Design. Better Design: Have Pretest

Analyzing Intervention Effects: Multilevel & Other Approaches. Simplest Intervention Design. Better Design: Have Pretest Analyzing Intervention Effects: Multilevel & Other Approaches Joop Hox Methodology & Statistics, Utrecht Simplest Intervention Design R X Y E Random assignment Experimental + Control group Analysis: t

More information

Do Supplemental Online Recorded Lectures Help Students Learn Microeconomics?*

Do Supplemental Online Recorded Lectures Help Students Learn Microeconomics?* Do Supplemental Online Recorded Lectures Help Students Learn Microeconomics?* Jennjou Chen and Tsui-Fang Lin Abstract With the increasing popularity of information technology in higher education, it has

More information

CURVE FITTING LEAST SQUARES APPROXIMATION

CURVE FITTING LEAST SQUARES APPROXIMATION CURVE FITTING LEAST SQUARES APPROXIMATION Data analysis and curve fitting: Imagine that we are studying a physical system involving two quantities: x and y Also suppose that we expect a linear relationship

More information

Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

More information

MULTIPLE REGRESSION WITH CATEGORICAL DATA

MULTIPLE REGRESSION WITH CATEGORICAL DATA DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 86 MULTIPLE REGRESSION WITH CATEGORICAL DATA I. AGENDA: A. Multiple regression with categorical variables. Coding schemes. Interpreting

More information

An Introduction to Regression Analysis

An Introduction to Regression Analysis The Inaugural Coase Lecture An Introduction to Regression Analysis Alan O. Sykes * Regression analysis is a statistical tool for the investigation of relationships between variables. Usually, the investigator

More information

Correlation key concepts:

Correlation key concepts: CORRELATION Correlation key concepts: Types of correlation Methods of studying correlation a) Scatter diagram b) Karl pearson s coefficient of correlation c) Spearman s Rank correlation coefficient d)

More information

Economics of Strategy (ECON 4550) Maymester 2015 Applications of Regression Analysis

Economics of Strategy (ECON 4550) Maymester 2015 Applications of Regression Analysis Economics of Strategy (ECON 4550) Maymester 015 Applications of Regression Analysis Reading: ACME Clinic (ECON 4550 Coursepak, Page 47) and Big Suzy s Snack Cakes (ECON 4550 Coursepak, Page 51) Definitions

More information

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

More information

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

 Y. Notation and Equations for Regression Lecture 11/4. Notation: Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

More information

Premaster Statistics Tutorial 4 Full solutions

Premaster Statistics Tutorial 4 Full solutions Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for

More information

Fairfield Public Schools

Fairfield Public Schools Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

More information

Description. Textbook. Grading. Objective

Description. Textbook. Grading. Objective EC151.02 Statistics for Business and Economics (MWF 8:00-8:50) Instructor: Chiu Yu Ko Office: 462D, 21 Campenalla Way Phone: 2-6093 Email: kocb@bc.edu Office Hours: by appointment Description This course

More information

Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

More information

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots Correlational Research Stephen E. Brock, Ph.D., NCSP California State University, Sacramento 1 Correlational Research A quantitative methodology used to determine whether, and to what degree, a relationship

More information

Wooldridge, Introductory Econometrics, 3d ed. Chapter 12: Serial correlation and heteroskedasticity in time series regressions

Wooldridge, Introductory Econometrics, 3d ed. Chapter 12: Serial correlation and heteroskedasticity in time series regressions Wooldridge, Introductory Econometrics, 3d ed. Chapter 12: Serial correlation and heteroskedasticity in time series regressions What will happen if we violate the assumption that the errors are not serially

More information

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is

More information

One-Way Analysis of Variance

One-Way Analysis of Variance One-Way Analysis of Variance Note: Much of the math here is tedious but straightforward. We ll skim over it in class but you should be sure to ask questions if you don t understand it. I. Overview A. We

More information

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

More information

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

More information

Solución del Examen Tipo: 1

Solución del Examen Tipo: 1 Solución del Examen Tipo: 1 Universidad Carlos III de Madrid ECONOMETRICS Academic year 2009/10 FINAL EXAM May 17, 2010 DURATION: 2 HOURS 1. Assume that model (III) verifies the assumptions of the classical

More information

1.5 Oneway Analysis of Variance

1.5 Oneway Analysis of Variance Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments

More information

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 10: Basic regression analysis with time series data

Wooldridge, Introductory Econometrics, 4th ed. Chapter 10: Basic regression analysis with time series data Wooldridge, Introductory Econometrics, 4th ed. Chapter 10: Basic regression analysis with time series data We now turn to the analysis of time series data. One of the key assumptions underlying our analysis

More information

Techniques of Statistical Analysis II second group

Techniques of Statistical Analysis II second group Techniques of Statistical Analysis II second group Bruno Arpino Office: 20.182; Building: Jaume I bruno.arpino@upf.edu 1. Overview The course is aimed to provide advanced statistical knowledge for the

More information

Association Between Variables

Association Between Variables Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi

More information

Experiment #1, Analyze Data using Excel, Calculator and Graphs.

Experiment #1, Analyze Data using Excel, Calculator and Graphs. Physics 182 - Fall 2014 - Experiment #1 1 Experiment #1, Analyze Data using Excel, Calculator and Graphs. 1 Purpose (5 Points, Including Title. Points apply to your lab report.) Before we start measuring

More information

ANALYSIS OF FACTOR BASED DATA MINING TECHNIQUES

ANALYSIS OF FACTOR BASED DATA MINING TECHNIQUES Advances in Information Mining ISSN: 0975 3265 & E-ISSN: 0975 9093, Vol. 3, Issue 1, 2011, pp-26-32 Available online at http://www.bioinfo.in/contents.php?id=32 ANALYSIS OF FACTOR BASED DATA MINING TECHNIQUES

More information

Note 2 to Computer class: Standard mis-specification tests

Note 2 to Computer class: Standard mis-specification tests Note 2 to Computer class: Standard mis-specification tests Ragnar Nymoen September 2, 2013 1 Why mis-specification testing of econometric models? As econometricians we must relate to the fact that the

More information

Statistical Models in R

Statistical Models in R Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova

More information

Study Guide for the Final Exam

Study Guide for the Final Exam Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

More information

AP Physics 1 and 2 Lab Investigations

AP Physics 1 and 2 Lab Investigations AP Physics 1 and 2 Lab Investigations Student Guide to Data Analysis New York, NY. College Board, Advanced Placement, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks

More information

Module 3: Correlation and Covariance

Module 3: Correlation and Covariance Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis

More information

Note on growth and growth accounting

Note on growth and growth accounting CHAPTER 0 Note on growth and growth accounting 1. Growth and the growth rate In this section aspects of the mathematical concept of the rate of growth used in growth models and in the empirical analysis

More information

From the help desk: Bootstrapped standard errors

From the help desk: Bootstrapped standard errors The Stata Journal (2003) 3, Number 1, pp. 71 80 From the help desk: Bootstrapped standard errors Weihua Guan Stata Corporation Abstract. Bootstrapping is a nonparametric approach for evaluating the distribution

More information

Schools Value-added Information System Technical Manual

Schools Value-added Information System Technical Manual Schools Value-added Information System Technical Manual Quality Assurance & School-based Support Division Education Bureau 2015 Contents Unit 1 Overview... 1 Unit 2 The Concept of VA... 2 Unit 3 Control

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 7: Multiple regression analysis with qualitative information: Binary (or dummy) variables

Wooldridge, Introductory Econometrics, 4th ed. Chapter 7: Multiple regression analysis with qualitative information: Binary (or dummy) variables Wooldridge, Introductory Econometrics, 4th ed. Chapter 7: Multiple regression analysis with qualitative information: Binary (or dummy) variables We often consider relationships between observed outcomes

More information