Ordinary Least Squares: the univariate case
- Rebecca Owens
1 Ordinary Least Squares: the univariate case. Majeure Economie, September 2011
2 Outline
  1. Introduction
  2. The OLS method: objective and principles of OLS; deriving the OLS estimates; do OLS keep their promises?
  3. The linear causal model: assumptions; identification and estimation; limits
  4. A simulation & applications: OLS do not always yield good estimates...; but things can be improved...; empirical applications
  5. Conclusion and exercises
3 Objectives
Objective 1: to make the best possible guess on a variable Y based on X, that is, to find a function of X which yields good predictions for Y. Example: given cigarette prices, what will cigarette sales be in September 2010 in France?
Objective 2: to determine the causal mechanism by which X influences Y. This is a ceteris paribus type of analysis: everything else being equal, how does a change in X affect Y? Examples: by how much does one more year of education increase an individual's wage? By how much would the hiring of more policemen decrease the crime rate in Paris?
The tool we use is a data set, for instance one in which we have the wages and number of years of education of N individuals.
4 Objective and principles of OLS: what we have and what we want
For each individual in our data set, we observe his wage and his number of years of education. Assume we have a graph such as the one below. The relationship between the two variables seems to be linear. We want to find the line which best describes the relationship between these variables.
[Figure: scatter plot of wage against years of schooling.]
5 Objective and principles of OLS: the principle of OLS
A line is characterized by an intercept and a slope, which we denote $\hat\alpha$ and $\hat\beta$. The idea is to choose for $\hat\alpha$ and $\hat\beta$ the values which minimize $\sum_{i=1}^{N} (Y_i - \hat\alpha - \hat\beta X_i)^2$. These values are the estimates.
Let us denote $\hat Y_i = \hat\alpha + \hat\beta X_i$. It represents the wage of individual i as predicted by our model. We also denote $\hat\varepsilon_i = Y_i - \hat Y_i$. The $\hat\varepsilon_i$ are called the estimated residuals and represent the mistake made by our model when predicting individual i's wage based on his number of years of schooling.
=> The principle of OLS is merely to minimize the sum of the squared mistakes we make when we use an affine function of $X_i$ to predict $Y_i$.
Why do we take the square of $\hat\varepsilon_i$? Could we have used another function?
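As a rough illustration of this principle (a sketch, not part of the original slides), the snippet below scores a few candidate lines by their sum of squared residuals. The sample and the three candidate lines are hypothetical. It also hints at one reason for squaring: for a flat line at the mean wage, the raw errors cancel out exactly even though the line predicts poorly.

```python
def ssr(alpha, beta, x, y):
    """Sum of squared residuals of the line alpha + beta * x."""
    return sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))

# Hypothetical sample (not from the slides): years of schooling and wages.
schooling = [10, 12, 12, 14, 16, 18]
wage = [1500, 1700, 1650, 2100, 2300, 2600]

# Hypothetical candidate lines, written as (intercept, slope).
candidates = {
    "flat": (1975.0, 0.0),      # predicts the mean wage for everyone
    "steep": (-500.0, 200.0),
    "moderate": (100.0, 140.0),
}

scores = {name: ssr(a, b, schooling, wage) for name, (a, b) in candidates.items()}
best = min(scores, key=scores.get)

# The raw (unsquared) errors of the flat line sum to zero despite the poor fit.
raw_error_sum_flat = sum(w - 1975.0 for w in wage)

for name, score in scores.items():
    print(f"{name}: sum of squared residuals = {score:.0f}")
print("lowest criterion among the candidates:", best)
print("sum of raw errors for the flat line:", raw_error_sum_flat)
```

Note that the best of these three candidates is not the OLS line itself; OLS searches over all possible intercepts and slopes rather than a short list.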
6 Objective and principles of OLS: a graphical example
[Figure: wage plotted against years of schooling.]
7 Deriving the OLS estimates: finding $\hat\alpha$ and $\hat\beta$ (Theorem 1.1)
We denote $\bar Y = \frac{1}{N} \sum Y_i$ the empirical mean of $(Y_i)$, $\bar X$ the empirical mean of $(X_i)$, $V_e(X) = \frac{1}{N} \sum X_i^2 - \left(\frac{1}{N} \sum X_i\right)^2$ the empirical variance of $(X_i)$, and finally $\mathrm{cov}_e(X, Y) = \frac{1}{N} \sum X_i Y_i - \bar X \bar Y$ the empirical covariance of $(X_i)$ and $(Y_i)$.
We want to minimize $f(\hat\alpha, \hat\beta) = \sum (Y_i - \hat\alpha - \hat\beta X_i)^2$.
Solution: $\hat\beta = \frac{\mathrm{cov}_e(X, Y)}{V_e(X)}$ and $\hat\alpha = \bar Y - \frac{\mathrm{cov}_e(X, Y)}{V_e(X)} \bar X$.
Can we compute $\hat\beta$ from the sample? Any problem with the computations? Any idea to interpret this result?
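One way to obtain this solution, sketched here with the notation of the theorem, is to set the two partial derivatives of f to zero. The labels FOC1 and FOC2 are the ones the $R^2$ slide refers to; since f is a sum of squares of affine functions of $(\hat\alpha, \hat\beta)$, it is convex and these conditions characterize its minimum.

```latex
\begin{align*}
\frac{\partial f}{\partial \hat\alpha} = 0
  &\iff \sum_{i=1}^{N} \left(Y_i - \hat\alpha - \hat\beta X_i\right) = 0
  && \text{(FOC1)} \\
\frac{\partial f}{\partial \hat\beta} = 0
  &\iff \sum_{i=1}^{N} X_i \left(Y_i - \hat\alpha - \hat\beta X_i\right) = 0
  && \text{(FOC2)} \\[4pt]
\text{FOC1, divided by } N:\quad
  & \hat\alpha = \bar Y - \hat\beta \bar X \\
\text{FOC2, divided by } N:\quad
  & \frac{1}{N}\sum_{i=1}^{N} X_i Y_i - \hat\alpha \bar X
    - \hat\beta\, \frac{1}{N}\sum_{i=1}^{N} X_i^2 = 0 \\
\text{substituting } \hat\alpha:\quad
  & \underbrace{\frac{1}{N}\sum_{i=1}^{N} X_i Y_i - \bar X \bar Y}_{\mathrm{cov}_e(X,Y)}
    - \hat\beta \underbrace{\left(\frac{1}{N}\sum_{i=1}^{N} X_i^2 - \bar X^2\right)}_{V_e(X)} = 0 \\
\text{hence}\quad
  & \hat\beta = \frac{\mathrm{cov}_e(X,Y)}{V_e(X)}
\end{align*}
```

The last step divides by $V_e(X)$, so it requires that the $X_i$ are not all equal.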
8 Deriving the OLS estimates: an example
Compute $\hat\beta$ in this simple example.
[Table with columns: Individual, Years of Schooling, Wage.]
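The computation can be sketched in a few lines of code. The four individuals below are hypothetical and are not the table from the slide; the function simply applies the formulas of Theorem 1.1.

```python
def ols_estimates(x, y):
    """Return (alpha_hat, beta_hat) from the empirical covariance and variance."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cov_e = sum(xi * yi for xi, yi in zip(x, y)) / n - x_bar * y_bar
    var_e = sum(xi ** 2 for xi in x) / n - x_bar ** 2
    beta_hat = cov_e / var_e
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat

# Hypothetical table (NOT the one from the slide): four individuals.
years_of_schooling = [10, 12, 14, 16]
wage = [1200, 1500, 1600, 1900]

alpha_hat, beta_hat = ols_estimates(years_of_schooling, wage)
print(f"alpha_hat = {alpha_hat:.1f}, beta_hat = {beta_hat:.1f}")
```

With these made-up numbers, the empirical covariance is 550 and the empirical variance is 5, so the slope is 110 and the intercept is 1550 - 110 × 13 = 120.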
9 Do OLS keep their promises? Do OLS attain objectives 1 and 2? Objective 1: find the best prediction for Y based on X / find a function P(X i ) of X i which yields good predictions for Y i. Objective 2: determine the causal mechanism by which X influences Y.
10 Do OLS keep their promises? OLS partially reach objective 1
Once it is agreed that a good prediction is one which minimizes the sum of squared errors, OLS yield by construction the best prediction function for Y among all affine functions of X. But:
- The criterion can be challenged: one could minimize $\sum |\hat\varepsilon_i|$ instead of $\sum \hat\varepsilon_i^2$. This is not so big an issue: quantile regression models minimize $\sum |\hat\varepsilon_i|$, and their results are usually close to those of OLS.
- Even if the criterion is accepted, OLS yield the best prediction function among all affine functions of X, not among all functions of X. There might for instance exist a polynomial function of X, $\alpha + \beta X + \gamma X^2$, which yields errors $\tilde\varepsilon_i$ such that $\sum \tilde\varepsilon_i^2 < \sum \hat\varepsilon_i^2$. This is not so big an issue either; see next chapter.
How can we measure the extent to which objective 1 is reached?
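The second limit can be made concrete with a deliberately extreme, hypothetical example: data generated exactly by $Y = X^2$. The sketch below compares the sum of squared errors of the OLS line with that of the quadratic function; the data and the helper names are assumptions made for illustration.

```python
def ols_estimates(x, y):
    """Return (alpha_hat, beta_hat) from the empirical covariance and variance."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cov_e = sum(xi * yi for xi, yi in zip(x, y)) / n - x_bar * y_bar
    var_e = sum(xi ** 2 for xi in x) / n - x_bar ** 2
    beta_hat = cov_e / var_e
    return y_bar - beta_hat * x_bar, beta_hat

def sum_of_squared_errors(predict, x, y):
    """Sum of squared prediction errors for an arbitrary prediction function."""
    return sum((yi - predict(xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data generated exactly by Y = X^2 (no noise at all).
x = [0, 1, 2, 3, 4]
y = [xi ** 2 for xi in x]

alpha_hat, beta_hat = ols_estimates(x, y)
ssr_affine = sum_of_squared_errors(lambda v: alpha_hat + beta_hat * v, x, y)
ssr_quadratic = sum_of_squared_errors(lambda v: v ** 2, x, y)

print(f"OLS line: {alpha_hat:.1f} + {beta_hat:.1f} x, squared errors = {ssr_affine:.1f}")
print(f"quadratic function x^2, squared errors = {ssr_quadratic:.1f}")
```

Here the OLS line is the best affine predictor, yet the quadratic function predicts perfectly, which is exactly the situation the slide warns about.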
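11 Do OLS keep their promises? The $R^2$: a measure of the quality of our predictions
$SST = \sum (Y_i - \bar Y)^2$: the dispersion of wages.
$SSE = \sum (\hat Y_i - \bar Y)^2$: the dispersion of predicted wages.
$SSR = \sum (Y_i - \hat Y_i)^2$: the sum of the squares of the errors.
$SST = \sum (Y_i - \bar Y)^2 = \sum (Y_i - \hat Y_i + \hat Y_i - \bar Y)^2 = \sum (Y_i - \hat Y_i)^2 + \sum (\hat Y_i - \bar Y)^2 + 2 \sum \hat\varepsilon_i (\hat Y_i - \bar Y) = SSE + SSR + 2 \hat\alpha \sum \hat\varepsilon_i + 2 \hat\beta \sum \hat\varepsilon_i X_i - 2 \bar Y \sum \hat\varepsilon_i$.
According to the first-order condition with respect to $\hat\alpha$ (FOC1), $\sum \hat\varepsilon_i = 0$; according to the first-order condition with respect to $\hat\beta$ (FOC2), $\sum \hat\varepsilon_i X_i = 0$. Therefore, $SST = SSE + SSR$.
$R^2 = \frac{SSE}{SST}$. The $R^2$ is always between 0 and 1 (why?). It measures the share of the variance observed in the sample that our model is able to account for, and therefore the quality of our predictions for Y based on X. However, a model with a low $R^2$ can still be helpful, and a model with a high $R^2$ can be useless.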
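The decomposition can be checked numerically. The sketch below uses a small hypothetical sample (not taken from the slides) and follows the slide's naming, where SSE is the explained part and SSR the residual part.

```python
def ols_estimates(x, y):
    """Return (alpha_hat, beta_hat) from the empirical covariance and variance."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cov_e = sum(xi * yi for xi, yi in zip(x, y)) / n - x_bar * y_bar
    var_e = sum(xi ** 2 for xi in x) / n - x_bar ** 2
    beta_hat = cov_e / var_e
    return y_bar - beta_hat * x_bar, beta_hat

# Hypothetical sample: years of schooling and wages of four individuals.
years_of_schooling = [10, 12, 14, 16]
wage = [1200, 1500, 1600, 1900]

alpha_hat, beta_hat = ols_estimates(years_of_schooling, wage)
y_bar = sum(wage) / len(wage)
fitted = [alpha_hat + beta_hat * x for x in years_of_schooling]
residuals = [y - f for y, f in zip(wage, fitted)]

sst = sum((y - y_bar) ** 2 for y in wage)    # dispersion of wages
sse = sum((f - y_bar) ** 2 for f in fitted)  # dispersion of predicted wages
ssr = sum(e ** 2 for e in residuals)         # sum of squared errors
r2 = sse / sst

# The two first-order conditions: residuals sum to zero and are orthogonal to X.
foc1 = sum(residuals)
foc2 = sum(e * x for e, x in zip(residuals, years_of_schooling))

print(f"SST = {sst:.0f}, SSE = {sse:.0f}, SSR = {ssr:.0f}, R2 = {r2:.3f}")
```

With these numbers SST = 250000, SSE = 242000 and SSR = 8000, so $R^2$ = 0.968.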
12 Do OLS keep their promises? But OLS do not necessarily reach objective 2
[Figure: scatter plot of wage against years of schooling.]
Individuals with more schooling have higher wages. Does it imply that schooling has a causal impact on wages?
13 Do OLS keep their promises? But OLS do not necessarily reach objective 2
Reverse causality: the line can be inverted, so causality might go in the other direction. Here this is not an issue: higher wages cannot cause longer education, because schooling takes place before labor market participation.
Omitted variable bias: individuals with many years of schooling make more money than those with few years of schooling. But do those two groups differ only in their number of years of schooling? Probably not. For instance, those with more years of schooling might have richer parents, or might also be more clever. Is the correlation between wages and education due only to the effect of education on wages, or also to the fact that those with more education are more clever and have richer parents?
14 Do OLS keep their promises? A causal framework
Parents' wage -> children's education: well-paid parents can afford sending their children to school, then to college and finally to university.
Parents' wage -> children's wage: well-paid parents have good networking skills and know how to get good positions, so they can help their children.
Children's education -> children's wage: education increases children's productivity and their ability to find a well-paid job (signalling theory).
True causal impact of education on wages = green cell (the link from children's education to children's wage).
If this framework is true, does $\hat\beta$, i.e. the correlation between children's education and wage, measure the green cell only? Does it overestimate or underestimate the green cell?
15 Assumptions: positing a linear causal model
We assume that for every individual, income is generated according to the following model: Income = $\alpha$ + $\beta$ × Number of Years of Education + $\varepsilon$. More formally: $Y_i = \alpha + \beta X_i + \varepsilon_i$. $Y_i$ is the dependent variable, $X_i$ the explanatory variable, and $\varepsilon_i$ the error term: all other determinants of income (cleverness, gender...).
Assumption 1. $\beta$ measures by how much the wage changes when the education of an individual increases by one year and all the other determinants of income ($\varepsilon$) remain unchanged (ceteris paribus impact of education), i.e. the causal impact of education on income.
Assuming that education has an influence on income does not seem to be too big an assumption. However, we assume that this influence is linear: when the number of years of education is increased by 1, the wage increases by $\beta$. Realistic? Moreover, we assume that this influence is the same for everyone: $\beta$ does not depend on i. Realistic?
16 Assumptions: why is linearity not so stupid an assumption...
If the relationship between the data does not look linear at all, you can try to estimate a different equation: $Y_i = \alpha + \beta X_i^2 + \varepsilon_i$, for instance, if the relationship is quadratic. If the data looks as in the graph below, which relationship do you want to estimate?
17 Assumptions: other assumptions
Assumption 2: random sampling. $(X_i, \varepsilon_i)$ is independent from $(X_j, \varepsilon_j)$ for $i \neq j$. This amounts to saying that the number of years of education completed by Mr Dupont, or his marital status, is not related to those of Mr Duchamp, who lives fifty kilometers from him and whom he does not know. This seems fairly credible.
Assumption 3: sample variation. In our sample, not all the $X_i$ are equal. This is a trivial assumption: if it is not verified, that is to say if all the individuals in our sample have the same number of years of education, it is impossible to determine the impact of education on wage from our data. This implies that $V_e(X) > 0$.
Assumption 4: $\varepsilon_i \perp X_i$, that is, the error term is uncorrelated with the explanatory variable.
Question: in our example of wage and education, do you believe that $\varepsilon_i \perp X_i$?
18 Identification and estimation What is identification? Identification amounts to finding a formula relating an unknown parameter (here this unknown parameter will be β, the causal impact of education on wages) to quantities that we can estimate from the data.
19 Identification and estimation: identification of the linear model
Theorem: under assumptions 1 to 4, $\beta$ is identified.
Proof:
$\mathrm{cov}(Y_i, X_i) = \mathrm{cov}(\alpha + \beta X_i + \varepsilon_i, X_i)$ according to assumption 1
$= \mathrm{cov}(\alpha, X_i) + \beta\, \mathrm{cov}(X_i, X_i) + \mathrm{cov}(\varepsilon_i, X_i)$ according to the properties of covariance
$= \beta V(X_i)$ since $\mathrm{cov}(\alpha, X_i) = 0$ and $\mathrm{cov}(\varepsilon_i, X_i) = 0$ according to assumption 4.
Therefore, $\beta = \frac{\mathrm{cov}(Y_i, X_i)}{V(X_i)}$.
20 Identification and estimation: how to estimate $\beta$?
As shown above, $\beta = \frac{\mathrm{cov}(Y_i, X_i)}{V(X_i)}$. Any idea on a good estimator $\hat\beta$?
21 Identification and estimation: consistency of $\hat\beta$
$\hat\beta = \frac{\mathrm{cov}_e(Y_i, X_i)}{V_e(X_i)}$.
Law of large numbers: $\mathrm{cov}_e(Y_i, X_i) \to \mathrm{cov}(Y_i, X_i)$ and $V_e(X_i) \to V(X_i)$.
Therefore: $\hat\beta \to \beta = \frac{\mathrm{cov}(Y_i, X_i)}{V(X_i)}$ when the number of observations in the sample goes to infinity.
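Consistency can be illustrated with a small simulation, sketched below under assumed values: a true intercept of 0, a true slope of 100, schooling drawn uniformly among the integers 10 to 19, and an error term drawn independently of schooling. None of these values comes from the slides.

```python
import random

def ols_slope(x, y):
    """beta_hat = empirical covariance of (X, Y) / empirical variance of X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cov_e = sum(xi * yi for xi, yi in zip(x, y)) / n - x_bar * y_bar
    var_e = sum(xi ** 2 for xi in x) / n - x_bar ** 2
    return cov_e / var_e

def simulate_slope(n, rng, alpha=0.0, beta=100.0):
    """Draw n observations from Y = alpha + beta * X + eps and return beta_hat."""
    x = [rng.randint(10, 19) for _ in range(n)]            # years of schooling
    eps = [2000 * (rng.random() - 0.5) for _ in range(n)]  # independent of x
    y = [alpha + beta * xi + e for xi, e in zip(x, eps)]
    return ols_slope(x, y)

rng = random.Random(0)   # fixed seed so that the run is reproducible
true_beta = 100.0        # hypothetical true causal effect
estimates = {n: simulate_slope(n, rng, beta=true_beta) for n in (20, 200, 20000)}

for n, b in estimates.items():
    print(f"N = {n:>5}: beta_hat = {b:8.2f}, error = {b - true_beta:+8.2f}")
```

For any single draw the error need not shrink monotonically with N, but with a very large sample the estimate should land close to the true slope.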
22 Identification and estimation: asymptotic normality of $\hat\beta$
The OLS estimators are asymptotically normal, in the sense that $\sqrt{N}(\hat\beta - \beta) \to N\left(0, \frac{\sigma^2}{V(X)}\right)$ (central limit theorem), where $\sigma^2 = V(\varepsilon_i)$. This means that when the size of the sample is large, we can state that $\sqrt{N}(\hat\beta - \beta)$ is approximately normally distributed. The proof is on page 177 of your textbook. This result is important for building confidence intervals for $\beta$.
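A Monte Carlo sketch can make this result tangible. The design below is an assumption made for illustration (true slope of 100, uniform errors, schooling uniform on the integers 10 to 19); the errors are deliberately not normal, so any normality of the estimator comes from the central limit theorem.

```python
import math
import random

def ols_slope(x, y):
    """beta_hat = empirical covariance of (X, Y) / empirical variance of X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cov_e = sum(xi * yi for xi, yi in zip(x, y)) / n - x_bar * y_bar
    var_e = sum(xi ** 2 for xi in x) / n - x_bar ** 2
    return cov_e / var_e

rng = random.Random(1)
true_beta, n, reps = 100.0, 200, 2000   # hypothetical design

scaled = []  # draws of sqrt(N) * (beta_hat - beta)
for _ in range(reps):
    x = [rng.randint(10, 19) for _ in range(n)]
    eps = [rng.uniform(-1000, 1000) for _ in range(n)]
    y = [true_beta * xi + e for xi, e in zip(x, eps)]
    scaled.append(math.sqrt(n) * (ols_slope(x, y) - true_beta))

sigma2 = 2000 ** 2 / 12      # variance of a uniform on [-1000, 1000]
var_x = (10 ** 2 - 1) / 12   # variance of a uniform integer on 10..19
theoretical_sd = math.sqrt(sigma2 / var_x)

mean_scaled = sum(scaled) / reps
empirical_sd = math.sqrt(sum((s - mean_scaled) ** 2 for s in scaled) / reps)
# Under normality, about 95% of the draws fall within 1.96 standard deviations.
share_within = sum(abs(s) <= 1.96 * theoretical_sd for s in scaled) / reps

print(f"theoretical sd = {theoretical_sd:.1f}, empirical sd = {empirical_sd:.1f}")
print(f"share within 1.96 sd = {share_within:.3f}")
```

The 1.96 threshold is the same one used to build a 95% confidence interval for $\beta$.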
23 Identification and estimation: variance of $\hat\beta$
Let us denote $\sigma^2 = V(\varepsilon_i)$. The variance of $\hat\beta$ is equal to $\frac{\sigma^2}{\sum (X_i - \bar X)^2}$ (you can find a proof on page 55 of the textbook).
It is increasing with $\sigma^2$: the more the error term is spread, the harder it is to estimate $\beta$ precisely. For instance, assume that unobserved determinants of wage (ambition, ability, age...) play an important role in wage setting. For some individuals, $\varepsilon_i$ will take very high positive values, and for others it will take very low negative values. We are therefore likely to be faced with individuals with low levels of education and high wages and conversely, which makes the estimation of $\beta$ difficult.
It is decreasing with the dispersion of X: the more volatile $X_i$ is in our sample, the more precisely we estimate $\beta$.
Finally, $\sum (X_i - \bar X)^2$ is increasing with N, the number of people in our sample, so the variance decreases as the sample grows.
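The three comparative statics can be read directly off the formula. The sketch below evaluates it on hypothetical schooling values and error variances chosen only to keep the arithmetic simple.

```python
def var_beta_hat(sigma2, x):
    """Variance formula from the slide: sigma^2 / sum((X_i - X_bar)^2)."""
    x_bar = sum(x) / len(x)
    return sigma2 / sum((xi - x_bar) ** 2 for xi in x)

baseline_x = [10, 12, 14, 16]   # hypothetical years of schooling

baseline = var_beta_hat(400.0, baseline_x)
noisier = var_beta_hat(800.0, baseline_x)             # larger error variance
more_spread = var_beta_hat(400.0, [7, 11, 15, 19])    # same mean, more dispersed X
larger_sample = var_beta_hat(400.0, baseline_x * 2)   # same values, observed twice

print("baseline:     ", baseline)
print("noisier:      ", noisier)
print("more spread:  ", more_spread)
print("larger sample:", larger_sample)
```

Doubling the error variance doubles the variance of the estimator, while spreading out X or adding observations reduces it.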
24 Identification and estimation: estimating $\sigma^2$
In the next session we will need an estimator of the variance of the error term. Usually, to estimate for instance a theoretical mean, we use the empirical one. Here, we use the same idea: to estimate the variance of the error term, a natural idea would be to use the empirical variance of the estimated residuals, $\frac{1}{N} \sum \hat\varepsilon_i^2$. This estimator indeed converges to $\sigma^2$ (LLN). However, it is biased: one can show that $E\left(\frac{1}{N} \sum \hat\varepsilon_i^2\right) = \frac{N-2}{N} \sigma^2$. Thus, we prefer to use the following unbiased estimator: $\hat\sigma^2 = \frac{1}{N-2} \sum \hat\varepsilon_i^2$. It is easy to show that this estimator also converges to $\sigma^2$.
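The bias is easiest to see in a very small sample. The simulation below is a sketch under assumed values (N = 5, fixed regressor values, normal errors with variance 1, arbitrary true coefficients); it averages both estimators over many replications.

```python
import random

def residual_sum_of_squares(x, y):
    """Sum of squared OLS residuals for the regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Centered form of cov_e / V_e (the 1/N factors cancel out).
    beta_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
                / sum((xi - x_bar) ** 2 for xi in x))
    alpha_hat = y_bar - beta_hat * x_bar
    return sum((yi - alpha_hat - beta_hat * xi) ** 2 for xi, yi in zip(x, y))

rng = random.Random(2)
n, reps, sigma = 5, 20000, 1.0   # hypothetical design, true sigma^2 = 1
x = [1, 2, 3, 4, 5]              # regressor values held fixed across replications

naive_total = 0.0
unbiased_total = 0.0
for _ in range(reps):
    y = [2.0 + 0.5 * xi + rng.gauss(0.0, sigma) for xi in x]
    rss = residual_sum_of_squares(x, y)
    naive_total += rss / n           # (1/N) * sum of squared residuals
    unbiased_total += rss / (n - 2)  # (1/(N-2)) * sum of squared residuals

naive_mean = naive_total / reps
unbiased_mean = unbiased_total / reps
print(f"average of the 1/N estimator:     {naive_mean:.3f} (theory: {(n - 2) / n:.3f})")
print(f"average of the 1/(N-2) estimator: {unbiased_mean:.3f} (theory: {sigma ** 2:.3f})")
```

With N = 5 the naive estimator should average around 60% of the true variance, which is why the correction matters in small samples and fades as N grows.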
25 Limits: link with OLS
In the linear model, $\beta$ represents the causal impact of X on Y. Under various (very strong) assumptions, one can show that $\beta = \frac{\mathrm{cov}(Y_i, X_i)}{V(X_i)}$, which can be estimated from the sample by the quantity $\hat\beta = \frac{\mathrm{cov}_e(X, Y)}{V_e(X)}$. As you may have noticed, this estimator $\hat\beta$ is the same as the quantity we derived in section 2 with the OLS method.
=> If the linear model assumptions are verified, then predictions based on OLS are not only the best predictions for Y based on X, but $\hat\beta$ also describes the causal impact of X on Y.
But are the linear model assumptions credible?
26 Limits: review of the assumptions of the linear model
Assumption 1: fairly credible, up to the linear approximation (the impact of education on wage might not be linear) and to the constant effect assumption.
Assumptions 2 and 3: credible.
Assumption 4: extremely strong assumption. It amounts to stating that X is uncorrelated with all other determinants of Y. Credible in the wage / education example?
27 Limits: what happens if assumption 4 is not verified?
Theorem: if assumption 4 is not verified, then the OLS estimator $\hat\beta$ is not a consistent estimator of $\beta$, the causal impact of X on Y.
Proof: $\mathrm{cov}(Y_i, X_i) = \mathrm{cov}(\alpha + \beta X_i + \varepsilon_i, X_i) = \beta\, \mathrm{cov}(X_i, X_i) + \mathrm{cov}(\varepsilon_i, X_i)$. Therefore, $\beta = \frac{\mathrm{cov}(Y_i, X_i)}{V(X_i)} - \frac{\mathrm{cov}(\varepsilon_i, X_i)}{V(X_i)}$. Since $\hat\beta \to \frac{\mathrm{cov}(Y_i, X_i)}{V(X_i)}$, $\hat\beta$ is not consistent.
The asymptotic bias, that is to say the difference between the limit of $\hat\beta$ and $\beta$, is equal to $\frac{\mathrm{cov}(\varepsilon_i, X_i)}{V(X_i)}$: the stronger the correlation between $\varepsilon$ and X, the larger the bias. If X and $\varepsilon$ are positively (resp. negatively) related, $\hat\beta$ overestimates (resp. underestimates) $\beta$.
In the wage / education example, do you think $\hat\beta$ over- or underestimates $\beta$?
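The bias formula can be checked on simulated data in which an omitted variable drives both education and wages. Everything in the sketch below is hypothetical: an unobserved "ability" term with variance 1, a true slope of 100, and an ability effect on wages of 50.

```python
import random

def ols_slope(x, y):
    """beta_hat = empirical covariance of (X, Y) / empirical variance of X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cov_e = sum(xi * yi for xi, yi in zip(x, y)) / n - x_bar * y_bar
    var_e = sum(xi ** 2 for xi in x) / n - x_bar ** 2
    return cov_e / var_e

rng = random.Random(3)
n = 50000
true_beta = 100.0   # hypothetical causal effect of schooling on wage
gamma = 50.0        # hypothetical effect of unobserved ability on wage

ability = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Ability raises schooling, so X is correlated with the error term.
x = [12.0 + a + rng.gauss(0.0, 1.0) for a in ability]
eps = [gamma * a + rng.gauss(0.0, 100.0) for a in ability]
y = [true_beta * xi + e for xi, e in zip(x, eps)]

beta_hat = ols_slope(x, y)

# In this design cov(eps, X) = gamma * V(ability) = 50 and V(X) = 1 + 1 = 2.
asymptotic_bias = gamma * 1.0 / 2.0
predicted_limit = true_beta + asymptotic_bias

print(f"true beta = {true_beta}, beta_hat = {beta_hat:.2f}, "
      f"limit predicted by the bias formula = {predicted_limit}")
```

Because ability and schooling are positively related here, the OLS slope overestimates the causal effect, even with a very large sample.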
28 OLS do not always yield good estimates... Generating 18 random pairs for wage and education (1/2)
Open an Excel file and write in cells A1 to A18: =2000*(alea()-0,5) if you have the French version of Excel (alea() is RAND() in the English version). The 18 random numbers you have generated stand for the $\varepsilon$ in our model. They are supposed to be independent. Do they verify the other assumptions we made on the $\varepsilon$? What kind of distribution do they follow? What are their expectation and their variance?
Then, write in cells B1 to B18: =ent(10+alea()*10) (ent() is INT() in the English version). These 18 random numbers stand for the number of years of schooling. Do they verify the assumptions we imposed on the $X_i$?
Finally, write in cell C1 = B1 + A1, and extend this formula down to C18. What do these 18 numbers stand for? Do the $X_i$ truly have a causal impact on the $Y_i$ here? In this experiment, what are the true values of $\alpha$ and $\beta$?
29 OLS do not always yield good estimates... Generating 18 random pairs for wage and education (2/2)
Select cells B1 to C18, go to the assistant graphique (chart wizard) and make a graph, choosing the option nuage de points (scatter plot). Once this is done, select your graph, go to the graphic menu and select the Ajouter une courbe de tendance (add trendline) option. Choose the linear type of curve and go to options. Select Afficher l'équation sur le graphique (display equation on chart) and Afficher le coefficient de détermination sur le graphique (display R-squared value on chart). Once this is done, write down on a sheet of paper the value of $\hat\beta$ that appears on the graph. Is it close to the true $\beta$? Any idea why this is the case?
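For readers without Excel, the same experiment can be sketched in code. The noise and schooling draws mirror the two cell formulas; the true intercept of 0 and true slope of 100 are placeholder assumptions, not values taken from the slides.

```python
import math
import random

def ols_estimates(x, y):
    """Return (alpha_hat, beta_hat) from the empirical covariance and variance."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cov_e = sum(xi * yi for xi, yi in zip(x, y)) / n - x_bar * y_bar
    var_e = sum(xi ** 2 for xi in x) / n - x_bar ** 2
    beta_hat = cov_e / var_e
    return y_bar - beta_hat * x_bar, beta_hat

def draw_sample(n, noise_scale, rng, alpha=0.0, beta=100.0):
    """alpha and beta are hypothetical true values chosen for this sketch."""
    eps = [noise_scale * (rng.random() - 0.5) for _ in range(n)]  # like =2000*(alea()-0,5)
    x = [rng.randrange(10, 20) for _ in range(n)]                 # like =ent(10+alea()*10)
    y = [alpha + beta * xi + e for xi, e in zip(x, eps)]
    return x, y

rng = random.Random(4)
x, y = draw_sample(18, 2000, rng)
alpha_hat, beta_hat = ols_estimates(x, y)
print(f"one sample of 18: beta_hat = {beta_hat:.1f} (assumed true beta: 100)")

# Repeat the experiment many times to see how much beta_hat moves around.
slopes = [ols_estimates(*draw_sample(18, 2000, rng))[1] for _ in range(2000)]
mean_slope = sum(slopes) / len(slopes)
sd_slopes = math.sqrt(sum((s - mean_slope) ** 2 for s in slopes) / len(slopes))
print(f"over 2000 samples: mean = {mean_slope:.1f}, standard deviation = {sd_slopes:.1f}")
```

With only 18 observations and noise of this size, individual estimates can land far from the true slope even though every assumption of the linear model holds by construction.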
30 OLS do not always yield good estimates... What I get
[Figure: scatter plot of wage against years of schooling with a fitted line of slope 32.612 and $R^2$ = 0.0369.]
31 But things can be improved... Illustrating some points of the course
In the first column, write =200*(alea()-0,5) instead of =2000*(alea()-0,5). Is your new estimate $\hat\beta$ closer to the true $\beta$? What is your intuition to explain this result?
Now write =4000*(alea()-0,5) in cell A1 and extend the formulas in cells A1, B1 and C1 down to A200, B200 and C200. Draw a new graph similar to the previous one, but selecting cells B1 to C200. Is your new estimate $\hat\beta$ closer to the true $\beta$? What is your intuition to explain this result?
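Both effects can be quantified by repeating each design many times and measuring how dispersed the estimated slopes are. The sketch below assumes, hypothetically, a true intercept of 0 and a true slope of 100; the noise scales and sample sizes follow the exercise.

```python
import math
import random

def ols_slope(x, y):
    """beta_hat = empirical covariance of (X, Y) / empirical variance of X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cov_e = sum(xi * yi for xi, yi in zip(x, y)) / n - x_bar * y_bar
    var_e = sum(xi ** 2 for xi in x) / n - x_bar ** 2
    return cov_e / var_e

def slope_sd(n, noise_scale, reps, rng, beta=100.0):
    """Standard deviation of beta_hat across repeated samples (beta is hypothetical)."""
    slopes = []
    for _ in range(reps):
        x = [rng.randrange(10, 20) for _ in range(n)]
        y = [beta * xi + noise_scale * (rng.random() - 0.5) for xi in x]
        slopes.append(ols_slope(x, y))
    mean = sum(slopes) / reps
    return math.sqrt(sum((s - mean) ** 2 for s in slopes) / reps)

rng = random.Random(5)
sd_baseline = slope_sd(18, 2000, 500, rng)      # original exercise
sd_less_noise = slope_sd(18, 200, 500, rng)     # smaller error term, same N
sd_noisy_small = slope_sd(18, 4000, 500, rng)   # larger error term, small N
sd_noisy_large = slope_sd(200, 4000, 500, rng)  # larger error term, N = 200

print(f"N = 18,  noise 2000: sd of beta_hat = {sd_baseline:.1f}")
print(f"N = 18,  noise 200:  sd of beta_hat = {sd_less_noise:.1f}")
print(f"N = 18,  noise 4000: sd of beta_hat = {sd_noisy_small:.1f}")
print(f"N = 200, noise 4000: sd of beta_hat = {sd_noisy_large:.1f}")
```

The comparison mirrors the variance formula: a smaller $\sigma^2$ or a larger N both make the estimator more precise.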
32 But things can be improved... What I get
[Figure: scatter plot of wage against years of schooling with a fitted line of slope 101.08 and $R^2$ = 0.9661.]
33 But things can be improved... What I get
[Figure: scatter plot of wage against years of schooling with a fitted line of slope 97.233.]
34 Empirical applications: consequences of smoking when pregnant
In a sample of American mothers who gave birth to a child in 1988, we estimate the following relationship: weight of the child in grams = $\alpha$ + $\beta$ × daily cigarettes smoked by mother during pregnancy + $\varepsilon$.
Results: $\hat\alpha = 3395$, $\hat\beta = -14.57$.
How to interpret $\hat\beta$? In your opinion, are the various assumptions needed for OLS to be unbiased, consistent, etc. verified here?
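As a reading aid, the estimates can be turned into predicted birth weights. The sketch below assumes the slope is negative (14.57 grams less per daily cigarette); the function name is made up for the illustration, and the predictions are purely descriptive, not causal.

```python
ALPHA_HAT = 3395.0   # grams: predicted weight when the mother does not smoke
BETA_HAT = -14.57    # grams per daily cigarette; the negative sign is an assumption

def predicted_birth_weight(cigarettes_per_day):
    """Predicted birth weight in grams from the estimated line."""
    return ALPHA_HAT + BETA_HAT * cigarettes_per_day

for cigarettes in (0, 10, 20):
    print(f"{cigarettes:>2} cigarettes per day: "
          f"{predicted_birth_weight(cigarettes):.1f} g")
```

Under this reading, a mother smoking 10 cigarettes a day is predicted to have a child about 146 grams lighter than a non-smoking mother. Whether that difference is caused by smoking depends on assumption 4.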
35 Empirical applications: consequences of attending a class on exam grade
Assume we want to estimate the following model among students attending an econometrics course: final grade = $\alpha$ + $\beta$ × number of classes attended + $\varepsilon$. Do you think that the estimated value $\hat\beta$ would properly estimate the true causal impact of attendance on final grade?
36 Conclusion
Today, we have seen the OLS technique to make a prediction for Y based on X. We have seen that, up to two small limits, this prediction is the best we can make => our first goal was reached.
However, we have seen that OLS estimators also describe the causal impact of X on Y if and only if a very restrictive assumption is made, which is that X is uncorrelated with all other determinants of Y. In many situations this assumption is unlikely to hold => in most cases we will not be able to achieve our second goal with OLS.
Finally, we have seen with some simulations that even in situations where all OLS assumptions are verified (which we can be sure of because we used data generated by the computer), OLS estimators can be far from the true values when the sample size is small => do not do statistics with small samples!
References for this chapter: chapters 2 and 5 of your textbook.
Clément de Chaisemartin, Ordinary Least Squares
Chapter 4: Vector Autoregressive Models 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie IV.1 Vector Autoregressive Models (VAR)...
More informationSecond Order Linear Nonhomogeneous Differential Equations; Method of Undetermined Coefficients. y + p(t) y + q(t) y = g(t), g(t) 0.
Second Order Linear Nonhomogeneous Differential Equations; Method of Undetermined Coefficients We will now turn our attention to nonhomogeneous second order linear equations, equations with the standard
More informationDescribing Relationships between Two Variables
Describing Relationships between Two Variables Up until now, we have dealt, for the most part, with just one variable at a time. This variable, when measured on many different subjects or objects, took
More informationMultiple Regression: What Is It?
Multiple Regression Multiple Regression: What Is It? Multiple regression is a collection of techniques in which there are multiple predictors of varying kinds and a single outcome We are interested in
More information1. Suppose that a score on a final exam depends upon attendance and unobserved factors that affect exam performance (such as student ability).
Examples of Questions on Regression Analysis: 1. Suppose that a score on a final exam depends upon attendance and unobserved factors that affect exam performance (such as student ability). Then,. When
More informationChapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS
Chapter Seven Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Section : An introduction to multiple regression WHAT IS MULTIPLE REGRESSION? Multiple
More informationSo, using the new notation, P X,Y (0,1) =.08 This is the value which the joint probability function for X and Y takes when X=0 and Y=1.
Joint probabilit is the probabilit that the RVs & Y take values &. like the PDF of the two events, and. We will denote a joint probabilit function as P,Y (,) = P(= Y=) Marginal probabilit of is the probabilit
More informationChapter 5. Conditional CAPM. 5.1 Conditional CAPM: Theory. 5.1.1 Risk According to the CAPM. The CAPM is not a perfect model of expected returns.
Chapter 5 Conditional CAPM 5.1 Conditional CAPM: Theory 5.1.1 Risk According to the CAPM The CAPM is not a perfect model of expected returns. In the 40+ years of its history, many systematic deviations
More informationDifference in differences and Regression Discontinuity Design
Difference in differences and Regression Discontinuity Design Majeure Economie September 2011 1 Difference in differences Intuition Identification of a causal effect Discussion of the assumption Examples
More informationSlope-Intercept Equation. Example
1.4 Equations of Lines and Modeling Find the slope and the y intercept of a line given the equation y = mx + b, or f(x) = mx + b. Graph a linear equation using the slope and the y-intercept. Determine
More information2. Simple Linear Regression
Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according
More informationEarnings in private jobs after participation to post-doctoral programs : an assessment using a treatment effect model. Isabelle Recotillet
Earnings in private obs after participation to post-doctoral programs : an assessment using a treatment effect model Isabelle Recotillet Institute of Labor Economics and Industrial Sociology, UMR 6123,
More information5. Linear Regression
5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4
More informationThe Bivariate Normal Distribution
The Bivariate Normal Distribution This is Section 4.7 of the st edition (2002) of the book Introduction to Probability, by D. P. Bertsekas and J. N. Tsitsiklis. The material in this section was not included
More informationZeros of Polynomial Functions
Zeros of Polynomial Functions Objectives: 1.Use the Fundamental Theorem of Algebra to determine the number of zeros of polynomial functions 2.Find rational zeros of polynomial functions 3.Find conjugate
More informationUniversity of Ljubljana Doctoral Programme in Statistics Methodology of Statistical Research Written examination February 14 th, 2014.
University of Ljubljana Doctoral Programme in Statistics ethodology of Statistical Research Written examination February 14 th, 2014 Name and surname: ID number: Instructions Read carefully the wording
More informationPenalized regression: Introduction
Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20th-century statistics dealt with maximum likelihood
More informationPART A: For each worker, determine that worker's marginal product of labor.
ECON 3310 Homework #4 - Solutions 1: Suppose the following indicates how many units of output y you can produce per hour with different levels of labor input (given your current factory capacity): PART
More informationLecture 2. Marginal Functions, Average Functions, Elasticity, the Marginal Principle, and Constrained Optimization
Lecture 2. Marginal Functions, Average Functions, Elasticity, the Marginal Principle, and Constrained Optimization 2.1. Introduction Suppose that an economic relationship can be described by a real-valued
More informationLinear and quadratic Taylor polynomials for functions of several variables.
ams/econ 11b supplementary notes ucsc Linear quadratic Taylor polynomials for functions of several variables. c 010, Yonatan Katznelson Finding the extreme (minimum or maximum) values of a function, is
More informationMULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Module 7 Test Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. You are given information about a straight line. Use two points to graph the equation.
More informationQuadratic forms Cochran s theorem, degrees of freedom, and all that
Quadratic forms Cochran s theorem, degrees of freedom, and all that Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 1, Slide 1 Why We Care Cochran s theorem tells us
More information1 Teaching notes on GMM 1.
Bent E. Sørensen January 23, 2007 1 Teaching notes on GMM 1. Generalized Method of Moment (GMM) estimation is one of two developments in econometrics in the 80ies that revolutionized empirical work in
More informationCorrelational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots
Correlational Research Stephen E. Brock, Ph.D., NCSP California State University, Sacramento 1 Correlational Research A quantitative methodology used to determine whether, and to what degree, a relationship
More informationExpression. Variable Equation Polynomial Monomial Add. Area. Volume Surface Space Length Width. Probability. Chance Random Likely Possibility Odds
Isosceles Triangle Congruent Leg Side Expression Equation Polynomial Monomial Radical Square Root Check Times Itself Function Relation One Domain Range Area Volume Surface Space Length Width Quantitative
More informationU.C. Berkeley CS276: Cryptography Handout 0.1 Luca Trevisan January, 2009. Notes on Algebra
U.C. Berkeley CS276: Cryptography Handout 0.1 Luca Trevisan January, 2009 Notes on Algebra These notes contain as little theory as possible, and most results are stated without proof. Any introductory
More informationLean Six Sigma Analyze Phase Introduction. TECH 50800 QUALITY and PRODUCTIVITY in INDUSTRY and TECHNOLOGY
TECH 50800 QUALITY and PRODUCTIVITY in INDUSTRY and TECHNOLOGY Before we begin: Turn on the sound on your computer. There is audio to accompany this presentation. Audio will accompany most of the online
More informationMULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Review MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 1) All but one of these statements contain a mistake. Which could be true? A) There is a correlation
More informationSYSTEMS OF REGRESSION EQUATIONS
SYSTEMS OF REGRESSION EQUATIONS 1. MULTIPLE EQUATIONS y nt = x nt n + u nt, n = 1,...,N, t = 1,...,T, x nt is 1 k, and n is k 1. This is a version of the standard regression model where the observations
More informationINTRODUCTION TO MULTIPLE CORRELATION
CHAPTER 13 INTRODUCTION TO MULTIPLE CORRELATION Chapter 12 introduced you to the concept of partialling and how partialling could assist you in better interpreting the relationship between two primary
More informationMISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group
MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could
More informationChapter 3: The Multiple Linear Regression Model
Chapter 3: The Multiple Linear Regression Model Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans November 23, 2013 Christophe Hurlin (University of Orléans) Advanced Econometrics
More informationGood luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:
Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours
More informationLOGIT AND PROBIT ANALYSIS
LOGIT AND PROBIT ANALYSIS A.K. Vasisht I.A.S.R.I., Library Avenue, New Delhi 110 012 amitvasisht@iasri.res.in In dummy regression variable models, it is assumed implicitly that the dependent variable Y
More informationSummary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)
Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume
More informationNCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
More information1 Mathematical Models of Cost, Revenue and Profit
Section 1.: Mathematical Modeling Math 14 Business Mathematics II Minh Kha Goals: to understand what a mathematical model is, and some of its examples in business. Definition 0.1. Mathematical Modeling
More informationAP Physics 1 and 2 Lab Investigations
AP Physics 1 and 2 Lab Investigations Student Guide to Data Analysis New York, NY. College Board, Advanced Placement, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks
More information