AN INTRODUCTION TO ECONOMETRICS. Oxbridge Economics; Mo Tanweer

Similar documents
Econometrics Simple Linear Regression

Simple Regression Theory II 2010 Samuel L. Baker

Introduction to Regression and Data Analysis

Simple linear regression

IMPACT EVALUATION: INSTRUMENTAL VARIABLE METHOD

Univariate Regression

17. SIMPLE LINEAR REGRESSION II

APPLICATION OF LINEAR REGRESSION MODEL FOR POISSON DISTRIBUTION IN FORECASTING

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

Intro to Data Analysis, Economic Statistics and Econometrics

Linear Regression. Chapter 5. Prediction via Regression Line Number of new birds and Percent returning. Least Squares

A Primer on Forecasting Business Performance

2013 MBA Jump Start Program. Statistics Module Part 3

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Solución del Examen Tipo: 1

Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.

16 : Demand Forecasting

Correlation key concepts:

Lecture 15. Endogeneity & Instrumental Variable Estimation

Regression Analysis (Spring, 2000)

2. Linear regression with multiple regressors

2. Simple Linear Regression

Multiple Regression Used in Macro-economic Analysis

Chapter 4: Vector Autoregressive Models

Forecast. Forecast is the linear function with estimated coefficients. Compute with predict command

Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Module 5: Multiple Regression Analysis

The Life-Cycle Motive and Money Demand: Further Evidence. Abstract

MGT 267 PROJECT. Forecasting the United States Retail Sales of the Pharmacies and Drug Stores. Done by: Shunwei Wang & Mohammad Zainal

ANNUITY LAPSE RATE MODELING: TOBIT OR NOT TOBIT? 1. INTRODUCTION

Note 2 to Computer class: Standard mis-specification tests

Scatter Plot, Correlation, and Regression on the TI-83/84

MULTIPLE REGRESSION ANALYSIS OF MAIN ECONOMIC INDICATORS IN TOURISM. R, analysis of variance, Student test, multivariate analysis

Chapter 5 Estimating Demand Functions

Practical. I conometrics. data collection, analysis, and application. Christiana E. Hilmer. Michael J. Hilmer San Diego State University

Calculating the Probability of Returning a Loan with Binary Probability Models

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

Simple Linear Regression Inference

An Analysis of the Telecommunications Business in China by Linear Regression

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

Chapter 5. Conditional CAPM. 5.1 Conditional CAPM: Theory Risk According to the CAPM. The CAPM is not a perfect model of expected returns.

Example: Boats and Manatees

Not Your Dad s Magic Eight Ball

Testing for Granger causality between stock prices and economic growth

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Module 6: Introduction to Time Series Forecasting

Chapter 13 Introduction to Linear Regression and Correlation Analysis

A Review of Cross Sectional Regression for Financial Data You should already know this material from previous study

The Impact of Local Economic Conditions on Casinos Revenues

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Week TSX Index

Analyzing the Effect of Change in Money Supply on Stock Prices

Marketing Mix Modelling and Big Data P. M Cain

COURSES: 1. Short Course in Econometrics for the Practitioner (P000500) 2. Short Course in Econometric Analysis of Cointegration (P000537)

Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

Topic 1 - Introduction to Labour Economics. Professor H.J. Schuetze Economics 370. What is Labour Economics?

Data Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression

LOGIT AND PROBIT ANALYSIS

Introduction to Quantitative Methods

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

Integrated Resource Plan

Introduction to time series analysis

4. Simple regression. QBUS6840 Predictive Analytics.

MULTIPLE REGRESSION WITH CATEGORICAL DATA

= C + I + G + NX ECON 302. Lecture 4: Aggregate Expenditures/Keynesian Model: Equilibrium in the Goods Market/Loanable Funds Market

SPSS Guide: Regression Analysis

DOES A STRONG DOLLAR INCREASE DEMAND FOR BOTH DOMESTIC AND IMPORTED GOODS? John J. Heim, Rensselaer Polytechnic Institute, Troy, New York, USA

Mathematical Model of Income Tax Revenue on the UK example. Financial University under the Government of Russian Federation

THE IMPACT OF MACROECONOMIC FACTORS ON NON-PERFORMING LOANS IN THE REPUBLIC OF MOLDOVA

Association Between Variables

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

Part 2: Analysis of Relationship Between Two Variables

Causal Forecasting Models

TEMPORAL CAUSAL RELATIONSHIP BETWEEN STOCK MARKET CAPITALIZATION, TRADE OPENNESS AND REAL GDP: EVIDENCE FROM THAILAND

Teaching Multivariate Analysis to Business-Major Students

ijcrb.com INTERDISCIPLINARY JOURNAL OF CONTEMPORARY RESEARCH IN BUSINESS AUGUST 2014 VOL 6, NO 4

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots

Do Commodity Price Spikes Cause Long-Term Inflation?

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Is the Forward Exchange Rate a Useful Indicator of the Future Exchange Rate?

Introductory Microeconomics

Regression III: Advanced Methods

Getting Correct Results from PROC REG

Homework 8 Solutions

ANSWERS TO END-OF-CHAPTER QUESTIONS

Forecasting in STATA: Tools and Tricks

Introduction to Macroeconomics TOPIC 2: The Goods Market

Econometrics Problem Set #2

1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

Research Methods & Experimental Design

Forecasting in supply chains

A Short review of steel demand forecasting methods

8. Time Series and Prediction

OLS Examples. OLS Regression

Section A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques Page 1 of 11. EduPristine CMA - Part I

2. What is the general linear model to be used to model linear trend? (Write out the model) = or

Premaster Statistics Tutorial 4 Full solutions

Transcription:

AN INTRODUCTION TO ECONOMETRICS Oxbridge Economics; Mo Tanweer Mohammed.Tanweer@cantab.net

Econometrics What is econometrics? Econometrics means economic measurement Economics + Statistics = Econometrics Econometrics is concerned with the tasks of developing and applying quantitative or statistical methods to the study of economic principles Econometrics is the use of statistical techniques to analyse economic data and compare with economic theory One of its aims is to give empirical content to economic theory What makes Econometrics different to Statistics? Economic theories tend to be qualitative in nature Economic data tends to be observational and more complex

If a government decreases the duration of time it offers unemployment benefit, does this lead to a lower unemployment rate? Do fertility decisions in Pakistan exhibit socio-economic conforming behaviour? Examples of econometrics As economists, we would like to understand the relationship between economic variables. Is human capital a fundamental cause of growth? Do improvements in educational spending by governments improve academic performance? Can increases in the price of oil lead to reductions in national income? Does an increase in national savings lead to an increase in investment?

Regression Analysis When we consider the nature and form of a relationship between any two or more variables, the analysis is referred to as regression analysis. Theoretical econometrics: considers questions about the statistical properties of estimators and tests, Applied econometrics: is concerned with the application of econometric methods to assess economic theories Descriptive Forecasting Causal - How does the stock market and interest rate move together? - Will we have a recession next year? - If we raise the minimum wage, would unemployment soar?

Data types E.g. Incomes for a country Time-series - A data set containing observations on a single phenomenon observed over multiple time periods is called time series - In time series data, both the values and the ordering of the data points have meaning Cross-section - A data set containing observations on multiple phenomena observed at a single point in time is called cross-sectional - In cross-sectional data sets, the values of the data points have meaning, but the ordering of the data points does not. Panel / Longtitudinal / TSCS - Two-dimensional data - A data set containing observations on multiple phenomena observed over multiple time periods is called panel data

Time-series Forecasting Chronology matters Interdependency issue Seasonal data

Time-series Forecasting Chronology matters Interdependency issue Seasonal data

Cross-sectional

Panel/Longitudinal The Panel Study of Income Dynamics The PSID had collected information on more than 70,000 individuals spanning as many as 4 decades of their lives. measures economic, social, and health factors over the life course and across generations Do fertility decisions in Pakistan exhibit socio-economic conforming behaviour? Write down a list of data variables you would like in your model

Statistical Packages EViews SPSS Microfit Stata PcGive Minitab Shazam Excel

Correlation vs Causation Correlation is not necessarily causation. Post hoc ergo propter hoc / Cum hoc ergo propter hoc

Correlation vs Causation Be careful what you infer from your statistical analyses. Be sure your relationship makes sense. A occurs in correlation with B. Therefore, A causes B. This is a logical fallacy because there are at least four other possibilities: B may be the cause of A some unknown third factor C is actually the cause of both A and B B may be the cause of A at the same time as A is the cause of B coincidence

Econometrics in practice The relationship between Income and Consumption. Economic theory: Keynes postulated a positive relationship between consumption and income General Theory of Employment, Interest and Money. The fundamental psychological law is that men [women] are disposed, as a rule and on average, to increase their consumption as their income increases, but not as much as the increase in their income Statistics : Let s get the data and look at the causal relationship

Income and Consumption Data on Personal Consumption Expenditure And Gross Domestic Product;1982-1996) all in 1992 billions of dollars C GDP 1982 3081.5 4620.3 1983 3240.6 4803.7 1984 3407.6 5140.1 1985 3566.5 5323.5 1986 3708.7 5487.7 1987 3822.3 5649.5 1988 3972.7 5865.2 1989 4064.6 6062 1990 4132.2 6136.3 1991 4105.8 6079.4 1992 4219.8 6244.4 1993 4343.6 6389.6 1994 4486 6610.7 1995 4595.3 6742.1 1996 4714.1 6928.4 Casual observation suggests that the relationship between income and consumption is positive in that consumption rises as income rises. However, we want to analyse more formally if a relationship exists. We use regression analysis to look at this relationship formally.

Income and Consumption Consumption is the dependent variable ) 7500 7000 6500 6000 5500 5000 4500 4000 3500 3000 Consumption vs Income 3000 3200 3400 3600 3800 4000 4200 4400 4600 4800 In regression analysis an important issue is the direction of causation between variables. Consumption is the dependent variable (Y) Disposable Income is the Income independent variable (X) This confirms the positive relationship but we need to be more rigorous in our analysis Disposable Income is the independent variable

The Theory: Keynes Let us assume that the relationship between consumption and income takes the form of the Keynesian consumption function: Where is between 0-1 and are known as the PARAMETERS of the model (a.k.a the intercept and slope coefficients ) Y is the dependent variable, in this case, consumption X is the independent variable, in this case, income is a measure of what?

Terminology Dependent variable Explained variable Predictand Regressand Endogenous Controlled variable Independent variable Explanatory variable Predictor Regressor Exogenous Control variable

Econometrics is the POPULATION regression equation The actual consumption Y of a household will not always equal its expected value E(Y). Actual consumption of a household may be disturbed from its expected value by any one of innumerable factors, and we shall therefore write actual consumption as: The disturbance (u) (or, e, error term) represents the effect on household consumption of all variables other than income.

Econometrics The population regression equation is unknown to any investigator, and remains unknown. Therefore have to fit a straight line to the scatter points. This line can then be regarded as an estimate of the population equation. Y Yˆ ˆ X ˆ The fitted line we write as: Yˆ ˆ X ˆ The sample regression equation represents a straight line with intercept alpha-hat and slope beta-hat. Y-hat is known as the predicted value (or estimate )of Y X

Econometrics Y X Y X ˆ ˆ ˆ Y Residual u X Y ˆ ˆ ˆ ˆ So the residuals (or error terms) are simply the differences between the actual and estimated Y values. So we minimise the sum of residuals Sum of U-hat i = SUM(Y i Y-hat i ) making it as small as possible u Y Y ˆ ˆ Y Y u ˆ ˆ B X Y u 1 ˆ ˆ ˆ

Econometrics But this won t work e.g. +10, -10; +2, -2 = sum of errors = 0. So we use the LEAST SQUARES METHOD Sum of (u-hat) 2 = sum (y yhat) 2 = Sum (y α-hat β 1 hat-x) 2 By squaring, you give more weights to the bigger errors. Y Yˆ ˆ X ˆ Residual So now it is not possible for the Sum of the u-hats (the error terms) to be small even if u is widely spread. OBVIOUSLY each time we change α and β 1 (hats), we will change the u-hat. So we need to pick / find a α and β 1 -hat such that the u-hat is as small as possible. = OLS X

OLS When we fit a sample regression line to a scatter of points, it makes sense to select a line (that is, choose values for alpha-hat and beta-hat) such that the residuals given by that result are in some sense small. The most popular and best known way of ensuring this is to choose alpha-hat and beta-hat so as to minimise the sum of the squares of the residuals. This method of estimating the parameters alpha and beta is known as the method of ordinary least squares (OLS). Provided that a whole series of further assumptions are valid, the OLS method can be shown to provide good estimators of alpha and beta. Computer software can compute OLS regressions automatically. The results may be reported as: Yˆ 30.71 0. 812 X Y Yˆ ˆ X ˆ Residual X

Measures of closeness of Fit The sample regression equation fits the scatter fairly closely But fairly closely is a vague expression, and it is often convenient to have a precise summary statistic (that is, a single number) by which we can assess and compare the closeness of fit of different scatters and different sample lines. We ask: What proportion of the variation in the consumption among our sample of households can we attribute to the variation in their incomes? We define the coefficient of determination (R 2 ) as the proportion of the sample variation in Y that can be attributed to the sample variation in X. The closer the coefficient is to one, the better the line will fit the points. Y Y R 2 = 0.86 X R 2 = 0.58 X

Testing for significance OLS regression will always try to fit a line through the points However we want to be able to test if our coefficient is significantly different from zero. Does household disposable income have a STATISTICALLY SIGNIFICANT effect on household consumption? To do this we calculate a t-value: t = β/s.e. That is, the t-value is the coefficient value divided by it s standard error (a measure of variance) This value can be looked up in statistical tables. However, as a rule of thumb a variable is significant if the t-value is greater than 2.

Modelling Economic Theory Mathematical model of theory Econometric model of theory Data Estimation of econometric model Hypothesis Testing Forecasting or prediction Using the model for policy purposes

Mis-specification Once you have an economic theory, collected the data, you still have to come up with a mathematical model to test. Getting the mathematical model is CRUCIAL E.g. Data on Demand for Apples From microeconomic theory, it is known that the demand for a commodity generally depends on the real income of the consumer, the real price of the commodity, and the real prices of competing or complementary commodities. Specification of the model what shape will this demand curve take? what variables do I need to put into this specification? E.g. real income; real price of apples; real price of oranges; real price of bananas Do I model it: Demand for Apples = α + β 1 (Price of Apples) + β 2 (Price of oranges) + β 3 (Real income) Demand for Apples = α + β 1 Ln(Price of Apples) + β 2 (Price of oranges+price of bananas) + β 3 (Real Income) 2

A word to the Econometrician It should be clear that model building is an art as well as a science: The Ten Commandments of Applied Econometrics (Feldstein) 1) Thou shalt use common sense and economic theory 2) Thou shalt ask the right questions (i.e. put prevalence before mathematical elegance) 3) Thou shalt know the context (do not perform ignorant statistical analysis) 4) Thou shalt inspect the data 5) Thou shalt not worship complexity 6) Thou shalt look long and hard at the results 7) Thou shalt beware the costs of data mining 8) Thou shalt not confuse statistical significance with practical signifiance 9) Thou shalt use common sense and economic theory 10) Thou shalt not underestimate the task of data collection

AN INTRODUCTION TO ECONOMETRICS Oxbridge Economics; Mo Tanweer Mohammed.Tanweer@cantab.net