3. Hypothesis tests and confidence intervals in multiple regression

Slide 39

Contents of previous section:
- Definition of the multiple regression model
- OLS estimation of the coefficients
- Measures-of-fit (based on estimation results)
- Some problems in the regression model (omitted-variable bias, multicollinearity)

Now:
- Statistical inference based on OLS estimation (hypothesis tests, confidence intervals)

Slide 40

3.1. Standard errors for the OLS estimators

Recall:
- OLS estimators are subject to sampling uncertainty
- Given the OLS assumptions on Slide 18, the OLS estimators are (approximately) normally distributed in large samples, that is
  $\hat\beta_j \sim N(\beta_j, \sigma^2_{\hat\beta_j})$ for $j = 0, \ldots, k$

Now:
- How can we estimate the (unknown) variance $\sigma^2_{\hat\beta_j}$ of the OLS estimator and its standard deviation $\sigma_{\hat\beta_j}$?

Slide 41

Definition 3.1: (Standard error)
We call an appropriately defined estimator of the standard deviation $\sigma_{\hat\beta_j}$ the standard error of $\hat\beta_j$ and denote it by $SE(\hat\beta_j)$.

Natural question:
- What constitutes a good estimator of $\sigma_{\hat\beta_j}$?

Answer:
- The analytical formula of a good estimator crucially hinges on whether the errors $u_i$ are homoskedastic or heteroskedastic (see Definition 2.2 on Slide 8)

Slide 42

Homoskedasticity / heteroskedasticity

Important notes:
- The way we defined the terms homoskedasticity and heteroskedasticity in Definition 2.2 on Slide 8 implies that homoskedasticity is a special case of heteroskedasticity (heteroskedasticity is more general than homoskedasticity)
- Since the OLS assumptions on Slide 18 place no restrictions on the conditional variance of the error terms $u_i$, they apply to both the general case of heteroskedasticity and the special case of homoskedasticity
- Theorem 2.4 on Slide 19 is therefore valid under both concepts

Slide 43

Corollary 3.2: (To Theorem 2.4, Slide 19)
Given the OLS assumptions on Slide 18, the OLS estimators are unbiased, consistent, and normally distributed in large samples (asymptotically normal) irrespective of whether the error terms are heteroskedastic or homoskedastic.

Classical econometrics:
- In classical econometrics the default assumption is that the error terms are homoskedastic
- Given our OLS assumptions plus homoskedasticity, the OLS estimators are efficient (optimal) among all alternative linear and unbiased estimators of the regression coefficients $\beta_0, \ldots, \beta_k$ (Gauss-Markov theorem)

Slide 44

Classical econometrics: [continued]
- Under heteroskedasticity there are more efficient estimators than OLS, namely the so-called (feasible) Generalized Least Squares (GLS) estimators (see the lectures Econometrics I+II)

Mathematical aspects:
- There exist specific formulas for the standard errors $SE(\hat\beta_j)$ both under heteroskedasticity and under homoskedasticity
- Since homoskedasticity is a special case of heteroskedasticity, the standard errors under homoskedasticity have a simpler structural form (homoskedasticity-only standard errors)

Slide 45

Mathematical aspects: [continued]
- The homoskedasticity-only standard errors are valid only under homoskedasticity; under heteroskedasticity they lead to invalid statistical inference
- The more general standard errors under heteroskedasticity were proposed by Eicker (1967), Huber (1967), and White (1980) (Eicker-Huber-White standard errors)
- The Eicker-Huber-White standard errors produce valid statistical inference irrespective of whether the error terms are heteroskedastic or homoskedastic (heteroskedasticity-robust standard errors)
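
The two kinds of standard errors can be compared numerically. The following is a minimal sketch, not the EViews implementation: it assumes NumPy, uses simulated data rather than the house-prices dataset, and uses the simplest Eicker-Huber-White variant (often labelled HC0, without a degrees-of-freedom correction). The function name `ols_standard_errors` and all simulated quantities are hypothetical.

```python
import numpy as np

def ols_standard_errors(X, y):
    """Return OLS coefficients with homoskedasticity-only and HC0 robust SEs.

    X must already contain a column of ones for the intercept.
    """
    n, p = X.shape                       # p = k + 1 coefficients
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat

    # Homoskedasticity-only: s^2 (X'X)^{-1} with s^2 = SSR / (n - k - 1)
    s2 = resid @ resid / (n - p)
    se_homo = np.sqrt(np.diag(s2 * XtX_inv))

    # Eicker-Huber-White (HC0): (X'X)^{-1} [sum_i u_i^2 x_i x_i'] (X'X)^{-1}
    meat = (X * resid[:, None] ** 2).T @ X
    se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
    return beta_hat, se_homo, se_robust

# Simulated (hypothetical) data with heteroskedastic errors:
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(1.0, 5.0, size=n)
u = rng.normal(size=n) * x ** 2          # error spread grows with x
y = 1.0 + 2.0 * x + u
X = np.column_stack([np.ones(n), x])

beta_hat, se_homo, se_robust = ols_standard_errors(X, y)
print("coefficients:          ", beta_hat)
print("homoskedasticity-only: ", se_homo)
print("robust (HC0):          ", se_robust)
```

In this simulated design the error variance increases with the regressor, so the homoskedasticity-only standard error of the slope tends to be too small relative to the robust one.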

Slide 46

Homoskedasticity-only and heteroskedasticity-robust standard errors for the house-prices dataset

[Two EViews regression outputs; apart from the entries listed below, the numerical values are not preserved in this transcription]

Output 1 (default, homoskedasticity-only standard errors):
- Dependent Variable: SALEPRICE; Method: Least Squares; Date: 07/02/12, Time: 16:50; Included observations: 546
- Columns: Variable | Coefficient | Std. Error | t-statistic | Prob.
- Rows: C, LOTSIZE, BEDROOMS, BATHROOMS, STOREYS
- Summary statistics: R-squared, Adjusted R-squared, S.E. of regression, Sum squared resid (1.80E+11), Log likelihood, F-statistic, Prob(F-statistic), Mean dependent var, S.D. dependent var, Akaike info criterion, Schwarz criterion, Hannan-Quinn criter., Durbin-Watson stat

Output 2 (White heteroskedasticity-consistent standard errors & covariance):
- Dependent Variable: SALEPRICE; Method: Least Squares; Date: 28/02/12, Time: 09:43; Included observations: 546
- Same columns, rows, and summary statistics as Output 1 (Sum squared resid: 1.80E+11)

Slide 47

Practical issues:
- Heteroskedasticity arises in many econometric applications
- It is prudent to assume heteroskedastic errors unless you have compelling reasons to believe otherwise
- Rule of thumb: to be on the safe side, always use heteroskedasticity-robust standard errors
- Many software packages (like EViews) report homoskedasticity-only standard errors as their default setting
- It is up to the user to activate the option of heteroskedasticity-robust standard errors (in EViews: use the command ls(cov=white))

Slide 48

Autocorrelated errors

Problem:
- Particularly in time-series regressions (that is, when the index $i$ represents distinct points in time) we often encounter autocorrelated error terms:
  $\mathrm{Corr}(u_i, u_j) \neq 0$ for some $i \neq j$
- Under autocorrelation the OLS coefficient estimators are still consistent, but the usual OLS standard errors become inconsistent
- Statistical inference based on the usual OLS standard errors becomes invalid

Slide 49

Solution:
- Standard errors should be computed using a heteroskedasticity- and autocorrelation-consistent (HAC) estimator of the variance
- Such HAC standard errors become relevant in Sections 6 and 9
- A well-known (special) HAC estimator is the so-called Newey-West variance estimator (see Newey and West, 1987)
- For a more formal discussion of HAC estimators see Stock and Watson (2011, Section 15.4)
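
To illustrate the idea behind a Newey-West type estimator, here is a minimal sketch under stated assumptions: it uses NumPy, the Bartlett-kernel weights $1 - l/(L+1)$, no small-sample correction, and a truncation lag `max_lag` that is chosen arbitrarily for the example. The function name `newey_west_se` and the simulated AR(1) data are hypothetical and not part of the lecture material.

```python
import numpy as np

def newey_west_se(X, y, max_lag):
    """OLS standard errors from a Newey-West (Bartlett-kernel) HAC estimator."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    g = X * resid[:, None]                    # row t holds x_t * u_t

    S = g.T @ g                               # lag-0 term (the White "meat")
    for lag in range(1, max_lag + 1):
        weight = 1.0 - lag / (max_lag + 1.0)  # Bartlett weights
        gamma = g[lag:].T @ g[:-lag]          # sum_t (x_t u_t)(x_{t-lag} u_{t-lag})'
        S += weight * (gamma + gamma.T)

    cov = XtX_inv @ S @ XtX_inv
    return beta_hat, np.sqrt(np.diag(cov))

# Simulated (hypothetical) time series with AR(1) errors and regressor:
rng = np.random.default_rng(1)
n = 500
e, v = rng.normal(size=n), rng.normal(size=n)
u, x = np.zeros(n), np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + e[t]
    x[t] = 0.7 * x[t - 1] + v[t]
y = 1.0 + 0.5 * x + u
X = np.column_stack([np.ones(n), x])

beta_hat, se_hac = newey_west_se(X, y, max_lag=4)   # lag choice is an assumption
_, se_white = newey_west_se(X, y, max_lag=0)        # no lags: plain White (HC0) SEs
print("HAC SEs:  ", se_hac)
print("White SEs:", se_white)
```

With `max_lag=0` the estimator collapses to the heteroskedasticity-robust (White) case, which makes the relation between the two families of standard errors explicit.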

Slide 50

3.2. Hypothesis tests and confidence intervals for a single coefficient

Testing problem:
- Consider one of the $k$ regressors, say $X_j$, and the corresponding regression coefficient $\beta_j$
- We aim at testing the two-sided problem that the unknown $\beta_j$ takes on some specific value $\beta_{j,0}$
- In technical terms:
  $H_0: \beta_j = \beta_{j,0}$ vs. $H_1: \beta_j \neq \beta_{j,0}$

Slide 51

Testing procedure:
- Compute the standard error of $\hat\beta_j$, $SE(\hat\beta_j)$
- Compute the so-called t-statistic:
  $t = \dfrac{\hat\beta_j - \beta_{j,0}}{SE(\hat\beta_j)}$   (3.1)
- Compute the p-value:
  $p\text{-value} = 2\,\Phi(-|t^{act}|)$,   (3.2)
  where $t^{act}$ is the value of the t-statistic actually computed and $\Phi(\cdot)$ is the cdf of the standard normal distribution
- Reject $H_0$ at the 5% significance level if p-value < 0.05 (or, equivalently, if $|t^{act}| > 1.96$)
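
The testing procedure can be sketched in a few lines. This is only an illustration using Python's standard library; the coefficient estimate and standard error below are made-up numbers, not results from the house-prices regression, and the function names are hypothetical.

```python
from statistics import NormalDist

def t_statistic(beta_hat, beta_null, se):
    """t-statistic (3.1) for H0: beta_j = beta_null."""
    return (beta_hat - beta_null) / se

def two_sided_p_value(t_act):
    """p-value (3.2) based on the standard normal distribution."""
    return 2.0 * NormalDist().cdf(-abs(t_act))

# Hypothetical estimate and standard error:
beta_hat, se = 5.43, 2.10
t_act = t_statistic(beta_hat, 0.0, se)
p_value = two_sided_p_value(t_act)
reject_at_5_percent = p_value < 0.05

print(f"t = {t_act:.3f}, p-value = {p_value:.4f}, reject H0: {reject_at_5_percent}")
```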

Slide 52

Remarks:
- Our testing procedure makes use of the result that the sampling distribution of the OLS estimator $\hat\beta_j$ is approximately normal for moderate and large sample sizes
- Under $H_0$ the mean of this distribution is $\beta_{j,0}$
- The t-statistic (3.1) is then approximately $N(0, 1)$ distributed
- The phrasing "t-statistic" stems from the fact that for finite sample sizes and under some additional (classical) assumptions on the multiple regression model the t-statistic (3.1) follows the t-distribution with $n - k - 1$ degrees of freedom
- Given these restrictive assumptions, the p-values should be computed from the quantiles of the $t_{n-k-1}$-distribution

Slide 53

Remarks: [continued]
- Since these additional assumptions are rarely met in real-world applications and since sample sizes are typically moderate or even large, we base inference on the p-values (3.2) computed from the normal distribution

Attention:
- Many software packages (like EViews) assume the validity of the classical assumptions and report p-values based on the $t_{n-k-1}$-distribution in the default setting
- These p-values should be corrected manually (see the following example)
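
How much the two kinds of p-values differ can be checked directly. The sketch below uses only the standard library and approximates the $t_{n-k-1}$ probability by numerically integrating the t density with Simpson's rule, which is an implementation shortcut chosen for this example and not how EViews computes it. The degrees of freedom follow from the house-prices regression ($n = 546$, $k = 4$); the t-statistic value of 2.0 is hypothetical.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def p_value_t(t_act, df, steps=2000):
    """Two-sided p-value from t_df, via Simpson's rule on [0, |t_act|]."""
    b = abs(t_act)
    h = b / steps                        # steps must be even
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2.0 * total * h / 3.0  # P(-|t_act| <= T <= |t_act|)
    return 1.0 - central_mass

def p_value_normal(t_act):
    """Two-sided p-value (3.2) from the standard normal distribution."""
    return 2.0 * NormalDist().cdf(-abs(t_act))

n, k = 546, 4            # observations and regressors in the house-prices model
t_act = 2.0              # hypothetical t-statistic
p_t = p_value_t(t_act, n - k - 1)
p_norm = p_value_normal(t_act)
print(f"t-based p-value: {p_t:.5f}, normal-based p-value: {p_norm:.5f}")
```

With several hundred degrees of freedom the two p-values are very close; the manual correction matters more in small samples.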

Slide 54

p-values for the house-prices dataset based on the t- and the normal distribution, respectively

[Two EViews regression outputs; apart from the entries listed below, the numerical values are not preserved in this transcription]

Both outputs:
- Dependent Variable: SALEPRICE; Method: Least Squares; Date: 28/02/12, Time: 09:43; Included observations: 546
- White heteroskedasticity-consistent standard errors & covariance
- Columns: Variable | Coefficient | Std. Error | t-statistic | Prob.
- Rows: C, LOTSIZE, BEDROOMS, BATHROOMS, STOREYS
- Summary statistics: R-squared, Adjusted R-squared, S.E. of regression, Sum squared resid (1.80E+11), Log likelihood, F-statistic, Prob(F-statistic), Mean dependent var, S.D. dependent var, Akaike info criterion, Schwarz criterion, Hannan-Quinn criter., Durbin-Watson stat

Slide 55

Remarks: [continued]
- A $(1 - \alpha)$ two-sided confidence interval for the coefficient $\beta_j$ is an interval that contains the true value of $\beta_j$ with probability $(1 - \alpha)$
- It contains the true value of $\beta_j$ in $100 \cdot (1 - \alpha)\%$ of all possible randomly drawn samples
- Equivalently, it is the set of values of $\beta_{j,0}$ that cannot be rejected by an $\alpha$-level hypothesis test
  $H_0: \beta_j = \beta_{j,0}$ vs. $H_1: \beta_j \neq \beta_{j,0}$
- When the sample size is large, we approximate the $(1 - \alpha)$ confidence interval for $\beta_j$ by
  $\left[\hat\beta_j - u_{1-\alpha/2} \cdot SE(\hat\beta_j),\ \hat\beta_j + u_{1-\alpha/2} \cdot SE(\hat\beta_j)\right]$,   (3.3)
  where $u_\alpha$ denotes the $\alpha$-quantile of the $N(0, 1)$-distribution
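
Formula (3.3) translates directly into code. The following sketch uses Python's standard library; the estimate and standard error are hypothetical numbers and the function name is an assumption made for this example.

```python
from statistics import NormalDist

def confidence_interval(beta_hat, se, alpha=0.05):
    """Large-sample (1 - alpha) confidence interval (3.3) for one coefficient."""
    u = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # quantile u_{1 - alpha/2}
    return beta_hat - u * se, beta_hat + u * se

# Hypothetical estimate and standard error:
beta_hat, se = 10.0, 2.0
for alpha in (0.10, 0.05, 0.01):
    low, high = confidence_interval(beta_hat, se, alpha)
    print(f"{100 * (1 - alpha):.0f}% CI: [{low:.3f}, {high:.3f}]")
```

A null value $\beta_{j,0}$ lies inside the 95% interval exactly when the corresponding two-sided test does not reject at the 5% level, which is the duality stated in the remarks.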

Slide 56

3.3. Tests of joint hypotheses

Now:
- Testing hypotheses on two or more regression coefficients (joint hypotheses)

Example:
  $H_0: \beta_1 = 0$ and $\beta_2 = 0$ vs. $H_1: \beta_1 \neq 0$ and/or $\beta_2 \neq 0$ (two restrictions)

Slide 57

General form of joint hypothesis:
- Consider the $k + 1$ regression coefficients $\beta_0, \beta_1, \ldots, \beta_k$ and $k + 1$ prespecified real numbers $\beta_{0,0}, \beta_{1,0}, \ldots, \beta_{k,0}$
- For $q$ out of the $k + 1$ coefficients we consider joint null and alternative hypotheses of the form
  $H_0: \beta_j = \beta_{j,0},\ \beta_m = \beta_{m,0}, \ldots$, for a total of $q$ restrictions,
  $H_1$: one or more of the $q$ restrictions under $H_0$ do not hold

Slide 58

Remarks:
- A special case is given by considering the $k$ restrictions with $\beta_{1,0} = 0, \beta_{2,0} = 0, \ldots, \beta_{k,0} = 0$, that is
  $H_0: \beta_1 = 0, \ldots, \beta_k = 0$
  $H_1$: at least one of the $\beta_m$ is nonzero for $m = 1, \ldots, k$
  (overall-significance test of the regression model)
- It is tempting to conduct the test by using the usual t-statistics to test the $k$ restrictions one at a time:
  Test #1: $H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \neq 0$
  Test #2: $H_0: \beta_2 = 0$ vs. $H_1: \beta_2 \neq 0$
  ...
  Test #k: $H_0: \beta_k = 0$ vs. $H_1: \beta_k \neq 0$

Slide 59

Remarks: [continued]
- This approach is unreliable:
  "... testing a series of single hypotheses is not equivalent to testing those same hypotheses jointly. The intuitive reason for this is that in a joint test of several hypotheses any single hypothesis is affected by the information in the other hypotheses." (Gujarati and Porter, 2009, p. 238)

Solution:
- Joint-hypotheses testing on the basis of the F-statistic
- EViews example: overall-significance test for the house-prices dataset
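
One way to see why the one-at-a-time approach is unreliable is a back-of-the-envelope calculation of its overall rejection rate. The sketch below rests on a strong simplifying assumption that is not made in the slides: the $k$ individual t-statistics are independent under $H_0$. In that case, rejecting the joint null whenever at least one individual 5% test rejects leads to an overall rejection probability well above 5%. The function name is hypothetical.

```python
def one_at_a_time_rejection_rate(k, alpha=0.05):
    """P(at least one of k independent alpha-level tests rejects) under H0."""
    return 1.0 - (1.0 - alpha) ** k

for k in (1, 2, 4, 10):
    rate = one_at_a_time_rejection_rate(k)
    print(f"k = {k:2d}: overall rejection probability under H0 = {rate:.4f}")
```

With correlated t-statistics the exact number changes, but the one-at-a-time procedure still does not have the intended 5% size in general.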

Slide 60

F-test (overall-significance test) for the house-prices dataset

[EViews regression output; apart from the entries listed below, the numerical values are not preserved in this transcription]

- Dependent Variable: SALEPRICE; Method: Least Squares; Date: 28/02/12, Time: 09:43; Included observations: 546
- White heteroskedasticity-consistent standard errors & covariance
- Columns: Variable | Coefficient | Std. Error | t-statistic | Prob.
- Rows: C, LOTSIZE, BEDROOMS, BATHROOMS, STOREYS
- Summary statistics: R-squared, Adjusted R-squared, S.E. of regression, Sum squared resid (1.80E+11), Log likelihood, F-statistic, Prob(F-statistic), Mean dependent var, S.D. dependent var, Akaike info criterion, Schwarz criterion, Hannan-Quinn criter., Durbin-Watson stat

Slide 61

Form and null-distribution of the F-statistic:
- The exact formula of the F-statistic used for testing the (general) problem of $q$ restrictions given on Slide 57 depends on the specific assumptions imposed on the multiple regression model
- The exact null-distribution of the F-statistic (F-distribution with exactly specified degrees-of-freedom parameters $n_1$ and $n_2$) also depends on these assumptions
- In EViews, the F-statistic and its null-distribution are computed under the classical assumptions of (1) normally distributed and (2) homoskedastic error terms $u_i$
- In contrast, Stock and Watson (2011) do not assume normally distributed errors $u_i$ and consider a heteroskedasticity-robust F-statistic (see Stock and Watson, 2011)
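
A heteroskedasticity-robust F-statistic can be written in Wald form, $F = (R\hat\beta - r)'\,[R\hat V R']^{-1}\,(R\hat\beta - r)/q$, where $R\beta = r$ collects the $q$ restrictions and $\hat V$ is a robust covariance estimate. The following is a minimal sketch under assumptions that go beyond the slides: it uses NumPy, the HC0 covariance matrix, simulated data, and the large-sample null distribution $\chi^2_q / q$. For $q = 2$ the large-sample p-value has the closed form $e^{-F}$, which is why the example is restricted to two restrictions. All names are hypothetical.

```python
import math
import numpy as np

def robust_f_statistic(X, y, R, r):
    """Wald-form F-statistic for H0: R beta = r with an HC0 covariance matrix."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    meat = (X * resid[:, None] ** 2).T @ X
    V = XtX_inv @ meat @ XtX_inv              # robust covariance of beta_hat
    diff = R @ beta_hat - r
    q = R.shape[0]                            # number of restrictions
    return float(diff @ np.linalg.solve(R @ V @ R.T, diff)) / q

# Simulated (hypothetical) data with two regressors and heteroskedastic errors:
rng = np.random.default_rng(2)
n = 1000
x1, x2 = rng.normal(size=n), rng.normal(size=n)
u = rng.normal(size=n) * (1.0 + 0.5 * np.abs(x1))
y = 1.0 + 0.8 * x1 - 0.5 * x2 + u
X = np.column_stack([np.ones(n), x1, x2])

# Overall-significance test, H0: beta_1 = 0 and beta_2 = 0 (q = 2)
R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
r = np.zeros(2)
F = robust_f_statistic(X, y, R, r)
p_value = math.exp(-F)                        # valid for q = 2 in large samples
critical_value = -math.log(0.05)              # 5% critical value of chi2_2 / 2
print(f"F = {F:.2f}, 5% critical value = {critical_value:.3f}, p-value = {p_value:.3g}")
```

For other values of $q$ the p-value requires the $\chi^2_q$ distribution function, which is available in statistical libraries but is left out here to keep the sketch dependency-light.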

Slide 62

3.4. Testing single restrictions involving multiple coefficients

General F-testing:
- Consider again the multiple regression model:
  $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \ldots + \beta_k X_{ki} + u_i$   (3.4)
- We aim at testing hypotheses involving some linear restrictions on the parameters of the k-variable model such as
  $H_0: \beta_2 = \beta_3$ vs. $H_1: \beta_2 \neq \beta_3$
  $H_0: \beta_3 + \beta_4 + \beta_5 = 3$ vs. $H_1: \beta_3 + \beta_4 + \beta_5 \neq 3$
  $H_0: \beta_3 = \beta_4 = \beta_5 = 0$ vs. $H_1: \beta_3 \neq 0$ and/or $\beta_4 \neq 0$ and/or $\beta_5 \neq 0$
  (under the last null hypothesis the regressors $X_3, X_4, X_5$ are absent from the model)

Slide 63

General F-testing: [continued]
- All these hypotheses can be tested using a general F-statistic
- This general testing strategy distinguishes sharply between the so-called unrestricted regression model (3.4) and the restricted regression obtained from plugging the restriction specified under $H_0$ into the unrestricted regression (3.4)
- The general F-statistic then compares the sum of squared residuals obtained from the unrestricted regression (3.4), denoted by $SSR_{UR}$, with the sum of squared residuals obtained from the restricted regression, denoted by $SSR_R$ (see Gujarati and Porter, 2009)
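
Under the classical assumptions (normally distributed, homoskedastic errors) the comparison of the two sums of squared residuals is commonly written as $F = \dfrac{(SSR_R - SSR_{UR})/q}{SSR_{UR}/(n - k - 1)}$. The sketch below assumes this homoskedasticity-only form, uses NumPy and simulated data, and tests the exclusion restriction $H_0: \beta_3 = \beta_4 = \beta_5 = 0$ in a model with $k = 5$ regressors. Function and variable names are hypothetical.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared OLS residuals from regressing y on the columns of X."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    return float(resid @ resid)

def homoskedastic_f(ssr_r, ssr_ur, q, n, k):
    """Homoskedasticity-only F-statistic comparing restricted and unrestricted fits."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))

# Simulated (hypothetical) data for a model with k = 5 regressors:
rng = np.random.default_rng(3)
n, k, q = 300, 5, 3
Z = rng.normal(size=(n, k))                       # columns play the role of X1..X5
y = 1.0 + 0.5 * Z[:, 0] - 0.3 * Z[:, 1] + 0.4 * Z[:, 2] + rng.normal(size=n)

X_ur = np.column_stack([np.ones(n), Z])           # unrestricted model (3.4)
X_r = X_ur[:, :3]                                 # restricted: X3, X4, X5 dropped

ssr_ur, ssr_r = ssr(X_ur, y), ssr(X_r, y)
F = homoskedastic_f(ssr_r, ssr_ur, q, n, k)
print(f"SSR_R = {ssr_r:.2f}, SSR_UR = {ssr_ur:.2f}, F = {F:.2f}")
```

For a restriction such as $\beta_2 = \beta_3$, the restricted regression would instead replace $X_2$ and $X_3$ by their sum; the F formula stays the same with $q = 1$.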

Slide 64

General F-testing: [continued]
- As in Section 3.3, the exact null-distribution of this general F-statistic (F-distribution with exactly specified degrees-of-freedom parameters) again depends on the assumptions imposed on the multiple regression model (in particular on the normality/nonnormality and the homoskedasticity/heteroskedasticity of the error terms $u_i$)
- EViews provides a fully-fledged framework for performing these general F-tests under the classical assumptions of normally distributed and homoskedastic error terms
- A thorough discussion will be given in the class

Slide 65

3.5. Confidence sets for multiple coefficients

Definition 3.3: (Confidence set)
A 95% confidence set for two or more coefficients is the set of numbers that contains the true population values of these coefficients in 95% of randomly drawn samples.

Remarks:
- A confidence set is the generalization to two or more coefficients of a confidence interval for a single coefficient
- Recall Formula (3.3) on Slide 55 for constructing a confidence interval for the single coefficient $\beta_j$

Slide 66

Remarks: [continued]
- Instead of using Formula (3.3), an equivalent way of constructing a, say, 95% confidence interval for the single coefficient $\beta_j$ consists in determining the set of all values $\beta_{j,0}$ that cannot be rejected by a two-sided hypothesis test
  $H_0: \beta_j = \beta_{j,0}$ vs. $H_1: \beta_j \neq \beta_{j,0}$
  at the 5% significance level based on the t-statistic (3.1) on Slide 51
- This approach can be extended to the case of multiple coefficients using the general F-testing approach described earlier

Slide 67

Example: [continued]
- Suppose you are interested in constructing a confidence set for the two coefficients $\beta_j$ and $\beta_m$ (for $j, m = 0, \ldots, k$, $j \neq m$)
- In line with Slide 57, consider testing a joint null hypothesis with the 2 restrictions
  $H_0: \beta_j = \beta_{j,0},\ \beta_m = \beta_{m,0}$
  at the 5% level using the appropriate F-statistic
- The set of all pairs $(\beta_{j,0}, \beta_{m,0})$ for which you cannot reject $H_0$ at the 5% level constitutes a 95% confidence set for $\beta_j$ and $\beta_m$
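
The test-inversion idea can be sketched as a membership check: a candidate pair belongs to the 95% confidence set if the F-statistic for the two restrictions does not exceed the 5% critical value. The sketch assumes NumPy, a Wald-form F-statistic built from an estimated covariance matrix of the two estimators, and the large-sample critical value of $\chi^2_2/2$, which equals $-\ln(0.05)$. The estimates and the covariance matrix below are made-up numbers, and the function name is hypothetical.

```python
import math
import numpy as np

F_CRIT_95_Q2 = -math.log(0.05)    # large-sample 5% critical value for q = 2

def in_confidence_set(estimate, cov, candidate):
    """True if the candidate pair is not rejected at the 5% level."""
    diff = np.asarray(estimate, dtype=float) - np.asarray(candidate, dtype=float)
    cov = np.asarray(cov, dtype=float)
    f_stat = float(diff @ np.linalg.solve(cov, diff)) / 2.0
    return f_stat <= F_CRIT_95_Q2

# Hypothetical estimates with strongly positively correlated estimators:
estimate = [0.0, 0.0]
cov = [[1.0, 0.9],
       [0.9, 1.0]]

print(in_confidence_set(estimate, cov, [1.5, 1.5]))    # True
print(in_confidence_set(estimate, cov, [1.5, -1.5]))   # False
```

Both candidate pairs lie within the individual 95% confidence intervals ($\pm 1.96$ for each coefficient here), yet only the first lies in the joint confidence set. This is the reason why the confidence set is an ellipse rather than the rectangle spanned by the two intervals.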

Slide 68

Remarks:
- In line with the confidence-interval formula (3.3), there are also analytical formulas for constructing confidence sets for multiple coefficients (not to be discussed here)
- EViews provides a fully-fledged framework for constructing confidence intervals and confidence sets (see class for details)

Example:
- Confidence intervals and two-dimensional confidence sets for the house-prices dataset in EViews

Slide 69

Single-coefficient confidence intervals for the house-prices dataset

[EViews table "Coefficient Confidence Intervals"; Date: 20/03/12, Time: 12:39; the numerical values are not preserved in this transcription]

- Columns: Variable | Coefficient | Low and High bounds at three confidence levels (among them 95% and 99%)
- Rows: C, LOTSIZE, BEDROOMS, BATHROOMS, STOREYS

Slide 70

95% two-dimensional confidence sets for the house-prices dataset

[Figure: matrix of confidence ellipses, one panel for each pair of coefficients; panel axes are labelled C, LOTSIZE, BEDROOMS, BATHROOMS, STOREYS]

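Slide 71

Remarks:
- The vertical and horizontal dotted lines show the corresponding 95% confidence intervals for the single coefficients $\beta_j$, $\beta_m$
- The orientation of the ellipse indicates the estimated correlation between the OLS estimators $\hat\beta_j$ and $\hat\beta_m$
- If the OLS estimators $\hat\beta_j$ and $\hat\beta_m$ were independent (and hence uncorrelated), the axes of the ellipses would be parallel to the coordinate axes; the ellipses would be exact circles only if, in addition, both estimators had the same standard error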


Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

More information

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

More information

Determinants of Stock Market Performance in Pakistan

Determinants of Stock Market Performance in Pakistan Determinants of Stock Market Performance in Pakistan Mehwish Zafar Sr. Lecturer Bahria University, Karachi campus Abstract Stock market performance, economic and political condition of a country is interrelated

More information

Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

More information

Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 13 Introduction to Linear Regression and Correlation Analysis Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

More information

Fairfield Public Schools

Fairfield Public Schools Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

More information

Example: Boats and Manatees

Example: Boats and Manatees Figure 9-6 Example: Boats and Manatees Slide 1 Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant

More information

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

More information

Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

13: Additional ANOVA Topics. Post hoc Comparisons

13: Additional ANOVA Topics. Post hoc Comparisons 13: Additional ANOVA Topics Post hoc Comparisons ANOVA Assumptions Assessing Group Variances When Distributional Assumptions are Severely Violated Kruskal-Wallis Test Post hoc Comparisons In the prior

More information

Chapter Four. Data Analyses and Presentation of the Findings

Chapter Four. Data Analyses and Presentation of the Findings Chapter Four Data Analyses and Presentation of the Findings The fourth chapter represents the focal point of the research report. Previous chapters of the report have laid the groundwork for the project.

More information

Econometric Principles and Data Analysis

Econometric Principles and Data Analysis Econometric Principles and Data Analysis product: 4339 course code: c230 c330 Econometric Principles and Data Analysis Centre for Financial and Management Studies SOAS, University of London 1999, revised

More information

Evaluating one-way and two-way cluster-robust covariance matrix estimates

Evaluating one-way and two-way cluster-robust covariance matrix estimates Evaluating one-way and two-way cluster-robust covariance matrix estimates Christopher F Baum 1 Austin Nichols 2 Mark E Schaffer 3 1 Boston College and DIW Berlin 2 Urban Institute 3 Heriot Watt University

More information

3.1 Least squares in matrix form

3.1 Least squares in matrix form 118 3 Multiple Regression 3.1 Least squares in matrix form E Uses Appendix A.2 A.4, A.6, A.7. 3.1.1 Introduction More than one explanatory variable In the foregoing chapter we considered the simple regression

More information

Hypothesis testing - Steps

Hypothesis testing - Steps Hypothesis testing - Steps Steps to do a two-tailed test of the hypothesis that β 1 0: 1. Set up the hypotheses: H 0 : β 1 = 0 H a : β 1 0. 2. Compute the test statistic: t = b 1 0 Std. error of b 1 =

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

More information

Econometrics and Data Analysis I

Econometrics and Data Analysis I Econometrics and Data Analysis I Yale University ECON S131 (ONLINE) Summer Session A, 2014 June 2 July 4 Instructor: Doug McKee (douglas.mckee@yale.edu) Teaching Fellow: Yu Liu (dav.yu.liu@yale.edu) Classroom:

More information

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could

More information

11. Analysis of Case-control Studies Logistic Regression

11. Analysis of Case-control Studies Logistic Regression Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

More information

Two-sample inference: Continuous data

Two-sample inference: Continuous data Two-sample inference: Continuous data Patrick Breheny April 5 Patrick Breheny STA 580: Biostatistics I 1/32 Introduction Our next two lectures will deal with two-sample inference for continuous data As

More information

Chapter 1. Vector autoregressions. 1.1 VARs and the identi cation problem

Chapter 1. Vector autoregressions. 1.1 VARs and the identi cation problem Chapter Vector autoregressions We begin by taking a look at the data of macroeconomics. A way to summarize the dynamics of macroeconomic data is to make use of vector autoregressions. VAR models have become

More information

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely

More information

TEMPORAL CAUSAL RELATIONSHIP BETWEEN STOCK MARKET CAPITALIZATION, TRADE OPENNESS AND REAL GDP: EVIDENCE FROM THAILAND

TEMPORAL CAUSAL RELATIONSHIP BETWEEN STOCK MARKET CAPITALIZATION, TRADE OPENNESS AND REAL GDP: EVIDENCE FROM THAILAND I J A B E R, Vol. 13, No. 4, (2015): 1525-1534 TEMPORAL CAUSAL RELATIONSHIP BETWEEN STOCK MARKET CAPITALIZATION, TRADE OPENNESS AND REAL GDP: EVIDENCE FROM THAILAND Komain Jiranyakul * Abstract: This study

More information

DETERMINANTS OF CAPITAL ADEQUACY RATIO IN SELECTED BOSNIAN BANKS

DETERMINANTS OF CAPITAL ADEQUACY RATIO IN SELECTED BOSNIAN BANKS DETERMINANTS OF CAPITAL ADEQUACY RATIO IN SELECTED BOSNIAN BANKS Nađa DRECA International University of Sarajevo nadja.dreca@students.ius.edu.ba Abstract The analysis of a data set of observation for 10

More information

Performing Unit Root Tests in EViews. Unit Root Testing

Performing Unit Root Tests in EViews. Unit Root Testing Página 1 de 12 Unit Root Testing The theory behind ARMA estimation is based on stationary time series. A series is said to be (weakly or covariance) stationary if the mean and autocovariances of the series

More information

Nonparametric statistics and model selection

Nonparametric statistics and model selection Chapter 5 Nonparametric statistics and model selection In Chapter, we learned about the t-test and its variations. These were designed to compare sample means, and relied heavily on assumptions of normality.

More information

Standard errors of marginal effects in the heteroskedastic probit model

Standard errors of marginal effects in the heteroskedastic probit model Standard errors of marginal effects in the heteroskedastic probit model Thomas Cornelißen Discussion Paper No. 320 August 2005 ISSN: 0949 9962 Abstract In non-linear regression models, such as the heteroskedastic

More information

European Journal of Business and Management ISSN 2222-1905 (Paper) ISSN 2222-2839 (Online) Vol.5, No.30, 2013

European Journal of Business and Management ISSN 2222-1905 (Paper) ISSN 2222-2839 (Online) Vol.5, No.30, 2013 The Impact of Stock Market Liquidity on Economic Growth in Jordan Shatha Abdul-Khaliq Assistant Professor,AlBlqa Applied University, Jordan * E-mail of the corresponding author: yshatha@gmail.com Abstract

More information

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

More information

Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13

Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13 Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13 Overview Missingness and impact on statistical analysis Missing data assumptions/mechanisms Conventional

More information

STATISTICA Formula Guide: Logistic Regression. Table of Contents

STATISTICA Formula Guide: Logistic Regression. Table of Contents : Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

More information

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

More information

Regression with a Binary Dependent Variable

Regression with a Binary Dependent Variable Regression with a Binary Dependent Variable Chapter 9 Michael Ash CPPA Lecture 22 Course Notes Endgame Take-home final Distributed Friday 19 May Due Tuesday 23 May (Paper or emailed PDF ok; no Word, Excel,

More information

Comparing Means in Two Populations

Comparing Means in Two Populations Comparing Means in Two Populations Overview The previous section discussed hypothesis testing when sampling from a single population (either a single mean or two means from the same population). Now we

More information

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

More information

Regression step-by-step using Microsoft Excel

Regression step-by-step using Microsoft Excel Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression

More information

Recall this chart that showed how most of our course would be organized:

Recall this chart that showed how most of our course would be organized: Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

Penalized regression: Introduction

Penalized regression: Introduction Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20th-century statistics dealt with maximum likelihood

More information

Integrating Financial Statement Modeling and Sales Forecasting

Integrating Financial Statement Modeling and Sales Forecasting Integrating Financial Statement Modeling and Sales Forecasting John T. Cuddington, Colorado School of Mines Irina Khindanova, University of Denver ABSTRACT This paper shows how to integrate financial statement

More information

Using instrumental variables techniques in economics and finance

Using instrumental variables techniques in economics and finance Using instrumental variables techniques in economics and finance Christopher F Baum 1 Boston College and DIW Berlin German Stata Users Group Meeting, Berlin, June 2008 1 Thanks to Mark Schaffer for a number

More information

From the help desk: Swamy s random-coefficients model

From the help desk: Swamy s random-coefficients model The Stata Journal (2003) 3, Number 3, pp. 302 308 From the help desk: Swamy s random-coefficients model Brian P. Poi Stata Corporation Abstract. This article discusses the Swamy (1970) random-coefficients

More information

Air passenger departures forecast models A technical note

Air passenger departures forecast models A technical note Ministry of Transport Air passenger departures forecast models A technical note By Haobo Wang Financial, Economic and Statistical Analysis Page 1 of 15 1. Introduction Sine 1999, the Ministry of Business,

More information

5.1 Identifying the Target Parameter

5.1 Identifying the Target Parameter University of California, Davis Department of Statistics Summer Session II Statistics 13 August 20, 2012 Date of latest update: August 20 Lecture 5: Estimation with Confidence intervals 5.1 Identifying

More information

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September

More information

Non-Stationary Time Series andunitroottests

Non-Stationary Time Series andunitroottests Econometrics 2 Fall 2005 Non-Stationary Time Series andunitroottests Heino Bohn Nielsen 1of25 Introduction Many economic time series are trending. Important to distinguish between two important cases:

More information

International Statistical Institute, 56th Session, 2007: Phil Everson

International Statistical Institute, 56th Session, 2007: Phil Everson Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: peverso1@swarthmore.edu 1. Introduction

More information

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Ecellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters

More information