Economics 345 Applied Econometrics


Lab 5: Hypothesis Testing in Multivariate Linear Regression II

Prof: Martin Farnham
TAs: Joffré Leroux, Rebecca Wortzman, Yiying Yang

Open EViews, and open the EViews workfile bwght.wf1. These data, along with all of the data for this course, should be on the network drive sfgclients on uvic\storage (S:). Your computer should be mapped to this drive. Browse to this drive and to the folder \social sciences\economics\econ 345\Wooldridge Eviews Files\

Introduction: This week, we're going to add to our portfolio of hypothesis tests by testing some non-zero null hypotheses, learning a bit about how to use p-values, and doing F-tests.

1) Some F-test preliminaries (your lecture notes may be useful here)

i.a) What is the formula for the F-statistic, using sums of squared residuals?

i.b) What is the formula for the F-statistic, using R-squared values?

i.c) What is the intuition for the F-test?

2) Cigarette use and child birth weight.

The following exercise is based on Example 4.9 in your text. We are going to examine the relationship between a woman's decision to smoke during pregnancy and the birth weight of her child. Low birth weight is considered an indicator of poor infant health, and can lead to health complications early in life. Load the dataset bwght.wf1 from the folder with Wooldridge datasets.

ii.a) First, estimate the following model of child birth weight:

$bwght = \beta_0 + \beta_1 cigs + \beta_2 parity + \beta_3 faminc + \beta_4 motheduc + \beta_5 fatheduc + u$

The variables are defined as follows:

bwght: birth weight in ounces

cigs: average number of cigarettes the mother smoked per day during pregnancy
parity: birth order of child
faminc: family income in thousands of dollars
motheduc: years of schooling of mother
fatheduc: years of schooling of father

What sign would you expect on each of the coefficients in this model? For each, give a brief explanation. Do the signs you obtain match your expectations? Which don't?

ii.b) Now let's perform the F-test proposed in Example 4.9. Formally, the test is set up as follows:

$H_0: \beta_4 = 0, \beta_5 = 0$
$H_1$: At least one of these coefficients is non-zero.

ii.c) Before we can proceed, we need to make sure the sample we work with will be the same for both our restricted and unrestricted models. Any time you re-estimate a model omitting one or more variables, there's some chance your sample size will change. This is because one of the variables you omit may have some missing values. Observations that EViews tossed out due to missing data when you estimated the unrestricted model may be automatically re-included once you estimate the restricted model. This causes the sample size to change. If you're trying to compare something like the SSR for two different models, you want the sample held constant for the comparison.

I bring this up because, in the particular case we're looking at, it turns out there are some missing data for fatheduc and motheduc. So the first thing we should do, to prevent the problem noted above, is to restrict our sample to observations for which values of fatheduc and motheduc are not missing. This will ensure that our sample remains constant when we compare the restricted and unrestricted models.

To help with this, I've created a dummy variable called no_educ that equals 1 for every observation that has missing data for fatheduc or motheduc. Let's limit our sample to observations for which no_educ=0. To do this, go to the Workfile window and select Sample.
In the IF condition window, type

no_educ=0

Notice that when you return to the Workfile window, the number of observations in the sample has dropped. Whew. Now we're ready to proceed with our correctly restricted sample.

ii.d) First, we'll calculate the F-stat for the above hypothesis test. Then we'll see how to generate it in EViews. Write down the restricted model that is implied by this null hypothesis.

How many exclusion restrictions are implied by this null hypothesis? This gives you your value for q in calculating your F-stat. How many degrees of freedom are there in the unrestricted model?

ii.d) Let's first calculate the F-stat for the above null hypothesis, using the formula that makes use of the SSR for each model. To do this, you need to estimate both the unrestricted and the restricted models, and find the SSR for each. This should be labeled "Sum squared resid" in your EViews output. Calculate the F-stat using this approach. To save time on the next step, record the R-squared value for each of the models you estimate.

ii.e) Now calculate the F-stat for the same null, using the formula that makes use of the R-squared for each model. Your answers from the two methods may differ slightly due to rounding error (they may also differ slightly from the value given in your text), but they should be quite close.

ii.f) How is the F-stat that you've calculated distributed? Recall that you state the distribution of the F-stat making reference to both numerator degrees of freedom and denominator degrees of freedom. Can you reject the null at the 5 percent significance level? To answer this, you will need to use Table G.3 in the back of your text, or an online tabulation of the cdf for the F-distribution, to find the critical value. Note that in an online tabulation you will need to scroll down to the F-table and make sure you pick the table that corresponds to an alpha of 0.05. What is the critical value? Is your F-stat greater or less than this critical value? If it's greater, you reject the null. If it's less, you fail to reject the null.

ii.g) Now that you know you can perform the F-test by hand, let's do it the easy way and ask EViews to do the F-test for us. To do this, first make sure to estimate the unrestricted model (so EViews has it in its memory). Then, in the Equation window, select View/Coefficient Tests/Wald-Coefficient Restrictions.
This will give you a window that allows you to place restrictions on the coefficients in your model. The coefficients we're interested in restricting are the coefficients on motheduc and fatheduc, and we want to restrict their values to zero. The syntax for placing restrictions on coefficients is quite simple. In this case, we type in

C(5)=0,C(6)=0

This just tells EViews that we want to restrict the fifth and sixth coefficients in the model to equal zero. Note that because EViews treats the intercept as C(1), the coefficients on the fourth and fifth variables in the model are denoted C(5) and C(6) respectively.
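As a cross-check on the by-hand calculations in (ii.d) and (ii.e), the two standard forms of the F-statistic can be sketched in a few lines of Python. This is only an illustration and not part of the EViews workflow: the function names are made up for this sketch, and the R-squared and total-sum-of-squares values are hypothetical placeholders, not results from bwght.wf1. Only q = 2 and the denominator degrees of freedom of 1185 come from this lab.

```python
def f_stat_from_ssr(ssr_r, ssr_ur, q, df_ur):
    """F = [(SSR_r - SSR_ur) / q] / [SSR_ur / df_ur], with df_ur = n - k - 1."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)


def f_stat_from_r2(r2_r, r2_ur, q, df_ur):
    """F = [(R2_ur - R2_r) / q] / [(1 - R2_ur) / df_ur]."""
    return ((r2_ur - r2_r) / q) / ((1.0 - r2_ur) / df_ur)


q = 2          # number of exclusion restrictions (motheduc, fatheduc)
df_ur = 1185   # denominator df of the unrestricted model, as reported in this lab

# Hypothetical inputs for illustration only (NOT the bwght.wf1 estimates).
sst = 1000.0                 # assumed total sum of squares, same for both models
r2_ur, r2_r = 0.040, 0.038   # assumed R-squared values

# With a common sample, SSR = (1 - R2) * SST, so the two formulas must agree.
ssr_ur = (1.0 - r2_ur) * sst
ssr_r = (1.0 - r2_r) * sst

f_ssr = f_stat_from_ssr(ssr_r, ssr_ur, q, df_ur)
f_r2 = f_stat_from_r2(r2_r, r2_ur, q, df_ur)
print("F from SSR:", round(f_ssr, 4))
print("F from R-squared:", round(f_r2, 4))
```

The two functions give the same number only because both models share one sample and therefore one SST, which is exactly why the sample was restricted with no_educ=0 before estimating anything.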

After issuing the above command, you will get a window with output in it. The first line of the output gives you your F-stat. Confirm that this number is close to what you previously calculated by hand. Notice that EViews gives you the df for the numerator and denominator, which allows you to quickly reference an F-table for the level of significance you're interested in. Wasn't that easy? There is much less chance of slipping up on your arithmetic when you let EViews calculate the thing for you.

ii.h) Notice also that EViews gives you a value labeled "Probability". This is the p-value associated with your F-statistic. Since F-tables are somewhat cumbersome to consult, this is a very handy thing for EViews to give you. What does the p-value really tell you? It lets you know how far out in the F distribution your F-stat lies. To be precise, if you take (1-p)*100, which in this case is 76.2, this tells you what percentile your F-stat lies at in the relevant F-distribution (in this case, the one with df of (2, 1185)).

Now, if you're choosing to perform your hypothesis test at the 5% significance level, you know this means that if you get an F-stat that lies at or above the 95th percentile, you will reject the null. So when EViews automatically tells you that the p-value is about 0.238, you can immediately think, "Ah, so this F-stat lies at the 76.2 percentile, which is below the 95th percentile, hence I fail to reject the null at the 5 percent significance level." (Note that this interpretation doesn't always hold with the t-distribution, as it depends on whether your alternative is one- or two-sided.)

Another way to interpret the p-value is that it tells you the lowest significance level at which you can reject the null. In this case, the lowest significance level at which you can reject the null is the 23.8% level. This is a very high significance level (corresponding to a very high probability of Type I error).
Generally, social scientists don't conduct hypothesis tests at anything above the 10% level. So, given such a high p-value, we have a slam-dunk case of failing to reject the null.

ii.i) What does it mean that we fail to reject the null? It means that the coefficient estimates on motheduc and fatheduc are jointly statistically insignificant. Another way of saying this is that mother's and father's education are jointly statistically insignificant determinants of child birth weight. Given that we've found these to be jointly insignificant, we will probably want to omit them from the model, as their inclusion may increase the variance of the coefficient estimate on cigs, our key RHS variable of interest (see last week's lab for more detail on this problem).

ii.j) Briefly confirm that last point in (ii.i) by checking that the standard error of the estimate of the coefficient on cigs is smaller when you omit motheduc and fatheduc. Given the joint insignificance of motheduc and fatheduc, are you concerned about the implications of their exclusion for omitted variables bias?

ii.k) Looking at your estimates from the restricted model (having dropped motheduc and fatheduc), notice the p-value associated with the t-statistic for faminc. Keeping in mind that EViews assumes a null of zero and a two-sided alternative when calculating this p-value, what does this tell you about the statistical significance of the coefficient estimate

on faminc given a two-sided alternative to the null that the coefficient on faminc equals zero? Can you reject the null that the coefficient on faminc equals zero at the 5% level? What about the 1% level? Isn't that handy? No t-table needed (for the zero null)! Note that if you find this confusing, Figure 4.6 in your text may be helpful.

3) Does education from a junior college have a different impact on wages than education from a university?

Cultural note: A junior college in the US is roughly equivalent to a 2-year college in Canada.

Close your previous workfile and load the workfile twoyear.wf1 (from the Econ 345 folder). This follows the discussion in your textbook.

iii.a) Estimate the following model:

$\log(wage) = \beta_0 + \beta_1 jc + \beta_2 univ + \beta_3 exper + u$

iii.b) Comment on the coefficient signs and magnitudes for jc and univ. Are the signs as expected? What do the magnitudes imply about the contribution of 1 extra year of junior college education to the wage? What do they imply about the contribution of 1 extra year of university to the wage? How do the relative magnitudes compare? Is this what you would expect?

iii.c) At standard significance levels, would you say all the coefficients in this model are statistically significant?

iii.d) Suppose a policymaker is interested in testing whether university education raises wages more than junior college education. They might set up the following hypothesis test:

$H_0: \beta_1 = \beta_2$
$H_1: \beta_1 < \beta_2$

Write down the formula for the t-statistic for this null hypothesis.

iii.e) Is the denominator of this t-statistic readily obtained from regular EViews output?

iii.f) Here's one way to do this problem. Recall from lecture (and the text) that you can directly calculate

$t = \dfrac{\hat\beta_1 - \hat\beta_2}{se(\hat\beta_1 - \hat\beta_2)}$

but that obtaining the denominator can be a bit tricky. The expression for the denominator is given in your text as

$se(\hat\beta_1 - \hat\beta_2) = \{[se(\hat\beta_1)]^2 + [se(\hat\beta_2)]^2 - 2 s_{12}\}^{1/2}$, where $s_{12}$ is the estimate of $Cov(\hat\beta_1, \hat\beta_2)$,

where the first two terms on the RHS are easily obtained, but $s_{12}$ is not immediately obvious in the EViews output.

It turns out that $s_{12}$ is not difficult to obtain. After estimating the model, you simply go to the Equation window and select View/Covariance Matrix. From the covariance matrix, you select the element that corresponds to jc in the rows and univ in the columns (or univ in the rows and jc in the columns). This number should be 1.93E-06 (which is scientific notation for 0.00000193).

If you open a spreadsheet in Excel you can directly calculate the standard error of $\hat\beta_1 - \hat\beta_2$. Due to rounding error, you may get a slightly different answer than I do.

Calculate your t-stat for the above null. You should obtain a t-stat in the range of -1.4 to -1.5 (I'm giving you a range because rounding error may lead you to get a different estimate from mine). Write down the t-stat you get for later reference. Given this, do you reject the null hypothesis? Use a t-table to find the relevant critical value.

iii.g) Given that this was a somewhat awkward way to obtain a t-stat for our null, let's consider respecifying the model in such a way as to make the hypothesis test easier to perform. Consider the following restatement of the null hypothesis:

$H_0: \beta_1 - \beta_2 = 0$

Given this statement of the null, we can come up with the following rearrangement:

$H_0: \theta_1 = \beta_1 - \beta_2 = 0$
$H_1: \theta_1 < 0$

We can use this to help respecify the model in a way that gives us a direct estimate of $\theta_1$, and a direct estimate of the standard error of $\hat\theta_1$ (which is the same as the standard error of $\hat\beta_1 - \hat\beta_2$, which is what we were having a hard time obtaining above). Substituting $\beta_1 = \theta_1 + \beta_2$ into the original equation, we get

$\log(wage) = \beta_0 + (\theta_1 + \beta_2) jc + \beta_2 univ + \beta_3 exper + u$
$\log(wage) = \beta_0 + \theta_1 jc + \beta_2 (jc + univ) + \beta_3 exper + u$

Let totcoll = jc + univ. Then

$\log(wage) = \beta_0 + \theta_1 jc + \beta_2 totcoll + \beta_3 exper + u$

The variable totcoll is already defined in your dataset, so go ahead and estimate this new specification.

iii.h) Do you reject the null that $\beta_1 = \beta_2$? Write down the t-stat for your null hypothesis. Is it roughly the same as what you obtained above?

iii.i) Finally, here's the simplest way to test the null that $\beta_1 = \beta_2$. Estimate your original model with jc and univ (not totcoll) on the RHS. In the Equation window, go to View/Coefficient Tests/Wald-Coefficient Restrictions and enter

C(2)=C(3)

This will give you some output similar to what you received when you did the F-test above. Here, you want to look at the bottom line of output, where it gives the value of the coefficient and the standard error for C(2)-C(3). You can calculate the t-stat for your null hypothesis using these figures. Simply divide the estimated value of $\beta_1 - \beta_2$ by its standard error, and confirm that the t-stat you generate matches the t-stat you obtained above.

iii.j) Note that there's even a fourth way to obtain your t-statistic for the above null. Recall from lecture that if you conduct an F-test with one linear restriction, the F-statistic you obtain for that restriction equals the t-stat (for that single restriction) squared. So, in looking at the output that EViews generated when you imposed the coefficient restriction, look at the top line and the F-stat that was generated. Take its square root. Voila! The t-stat for your null hypothesis above (in absolute value; the square root drops the sign, and the t-stat here is negative)!

The nice thing about using the F-stat is that the p-value for the F-stat is the same as the two-sided p-value for the t-stat. You can use this p-value to quickly assess whether you would reject the null at whatever significance level you are conducting your test. You have to be careful of one thing though.
The p-value you will have is for a two-sided alternative. In our test, our alternative is one-sided. For a one-sided test (see the lecture notes for a reminder on this) you divide the two-sided p-value by 2. This gives you a one-sided p-value of 0.0711. In other words, you can reject the null at the 10 percent level, but not at the 5 percent level. The lowest significance level at which you can reject the null is the 7.11 percent level.

Note: There was a lot covered in this lab. If your head is spinning a bit at the end, you might consider going through it a little more slowly on your own time. I think you'll find that it reinforces some material from lecture, and hopefully clarifies the implementation of some things you've only read about or seen in lecture.
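The arithmetic behind parts (iii.f) and (iii.j) can also be sketched in Python, for anyone who would rather not use a spreadsheet. This is a rough illustration under stated assumptions, not the lab's method: the function names are invented for this sketch, and the two standard errors, the F-stat, and the t-stat below are hypothetical placeholders rather than the twoyear.wf1 estimates. Only the covariance 1.93E-06 and the one-sided p-value of 0.0711 (so a two-sided value of 0.1422) are taken from this lab.

```python
import math


def se_of_difference(se1, se2, cov12):
    """se(b1 - b2) = sqrt(se(b1)^2 + se(b2)^2 - 2 * s12)."""
    return math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * cov12)


def t_from_f(f_stat, estimate):
    """For a single restriction, |t| = sqrt(F); the sign comes from the estimate."""
    return math.copysign(math.sqrt(f_stat), estimate)


def one_sided_p(two_sided_p, t_stat, alternative="less"):
    """Halve the two-sided p-value when t lies on the side of the alternative.

    If t falls on the opposite side, the one-sided p-value is 1 - p/2 instead.
    """
    half = two_sided_p / 2.0
    on_alternative_side = t_stat < 0 if alternative == "less" else t_stat > 0
    return half if on_alternative_side else 1.0 - half


# Hypothetical standard errors for jc and univ (placeholders for illustration).
se_jc, se_univ = 0.007, 0.002
s12 = 1.93e-06  # covariance of the two estimates, as given in this lab

se_diff = se_of_difference(se_jc, se_univ, s12)
print("se of the difference:", se_diff)

# Hypothetical F-stat and a negative estimate of beta1 - beta2.
t_stat = t_from_f(2.25, -0.01)
print("t from F:", t_stat)

# Two-sided p-value implied by this lab's one-sided value of 0.0711.
p_one = one_sided_p(0.1422, t_stat, alternative="less")
print("one-sided p-value:", p_one)
```

The extra branch in one_sided_p is a reminder that simply dividing by 2 is only right when the estimate has the sign the alternative predicts, as it does here.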


More information

Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

More information

13. Poisson Regression Analysis

13. Poisson Regression Analysis 136 Poisson Regression Analysis 13. Poisson Regression Analysis We have so far considered situations where the outcome variable is numeric and Normally distributed, or binary. In clinical work one often

More information

0.8 Rational Expressions and Equations

0.8 Rational Expressions and Equations 96 Prerequisites 0.8 Rational Expressions and Equations We now turn our attention to rational expressions - that is, algebraic fractions - and equations which contain them. The reader is encouraged to

More information

hp calculators HP 50g Trend Lines The STAT menu Trend Lines Practice predicting the future using trend lines

hp calculators HP 50g Trend Lines The STAT menu Trend Lines Practice predicting the future using trend lines The STAT menu Trend Lines Practice predicting the future using trend lines The STAT menu The Statistics menu is accessed from the ORANGE shifted function of the 5 key by pressing Ù. When pressed, a CHOOSE

More information

Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Section 14 Simple Linear Regression: Introduction to Least Squares Regression Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship

More information

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation Parkland College A with Honors Projects Honors Program 2014 Calculating P-Values Isela Guerra Parkland College Recommended Citation Guerra, Isela, "Calculating P-Values" (2014). A with Honors Projects.

More information

2. Simple Linear Regression

2. Simple Linear Regression Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

More information

Chapter 6: Multivariate Cointegration Analysis

Chapter 6: Multivariate Cointegration Analysis Chapter 6: Multivariate Cointegration Analysis 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie VI. Multivariate Cointegration

More information

To give it a definition, an implicit function of x and y is simply any relationship that takes the form:

To give it a definition, an implicit function of x and y is simply any relationship that takes the form: 2 Implicit function theorems and applications 21 Implicit functions The implicit function theorem is one of the most useful single tools you ll meet this year After a while, it will be second nature to

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information


LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

More information


HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

More information

Using R for Linear Regression

Using R for Linear Regression Using R for Linear Regression In the following handout words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional

More information

DDBA 8438: Introduction to Hypothesis Testing Video Podcast Transcript

DDBA 8438: Introduction to Hypothesis Testing Video Podcast Transcript DDBA 8438: Introduction to Hypothesis Testing Video Podcast Transcript JENNIFER ANN MORROW: Welcome to "Introduction to Hypothesis Testing." My name is Dr. Jennifer Ann Morrow. In today's demonstration,

More information

Two-sample t-tests. - Independent samples - Pooled standard devation - The equal variance assumption

Two-sample t-tests. - Independent samples - Pooled standard devation - The equal variance assumption Two-sample t-tests. - Independent samples - Pooled standard devation - The equal variance assumption Last time, we used the mean of one sample to test against the hypothesis that the true mean was a particular

More information



More information

CHAPTER 2. Logic. 1. Logic Definitions. Notation: Variables are used to represent propositions. The most common variables used are p, q, and r.

CHAPTER 2. Logic. 1. Logic Definitions. Notation: Variables are used to represent propositions. The most common variables used are p, q, and r. CHAPTER 2 Logic 1. Logic Definitions 1.1. Propositions. Definition 1.1.1. A proposition is a declarative sentence that is either true (denoted either T or 1) or false (denoted either F or 0). Notation:

More information

Notes on Determinant

Notes on Determinant ENGG2012B Advanced Engineering Mathematics Notes on Determinant Lecturer: Kenneth Shum Lecture 9-18/02/2013 The determinant of a system of linear equations determines whether the solution is unique, without

More information

Two-sample inference: Continuous data

Two-sample inference: Continuous data Two-sample inference: Continuous data Patrick Breheny April 5 Patrick Breheny STA 580: Biostatistics I 1/32 Introduction Our next two lectures will deal with two-sample inference for continuous data As

More information

Analysis of Variance ANOVA

Analysis of Variance ANOVA Analysis of Variance ANOVA Overview We ve used the t -test to compare the means from two independent groups. Now we ve come to the final topic of the course: how to compare means from more than two populations.

More information

Economics of Strategy (ECON 4550) Maymester 2015 Applications of Regression Analysis

Economics of Strategy (ECON 4550) Maymester 2015 Applications of Regression Analysis Economics of Strategy (ECON 4550) Maymester 015 Applications of Regression Analysis Reading: ACME Clinic (ECON 4550 Coursepak, Page 47) and Big Suzy s Snack Cakes (ECON 4550 Coursepak, Page 51) Definitions

More information

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1. MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column

More information


CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

Stepwise Regression. Chapter 311. Introduction. Variable Selection Procedures. Forward (Step-Up) Selection

Stepwise Regression. Chapter 311. Introduction. Variable Selection Procedures. Forward (Step-Up) Selection Chapter 311 Introduction Often, theory and experience give only general direction as to which of a pool of candidate variables (including transformed variables) should be included in the regression model.

More information

Pre-Algebra Lecture 6

Pre-Algebra Lecture 6 Pre-Algebra Lecture 6 Today we will discuss Decimals and Percentages. Outline: 1. Decimals 2. Ordering Decimals 3. Rounding Decimals 4. Adding and subtracting Decimals 5. Multiplying and Dividing Decimals

More information

Copyright 2007 by Laura Schultz. All rights reserved. Page 1 of 5

Copyright 2007 by Laura Schultz. All rights reserved. Page 1 of 5 Using Your TI-83/84 Calculator: Linear Correlation and Regression Elementary Statistics Dr. Laura Schultz This handout describes how to use your calculator for various linear correlation and regression

More information

Regression Analysis: A Complete Example

Regression Analysis: A Complete Example Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

More information

Traditional Conjoint Analysis with Excel

Traditional Conjoint Analysis with Excel hapter 8 Traditional onjoint nalysis with Excel traditional conjoint analysis may be thought of as a multiple regression problem. The respondent s ratings for the product concepts are observations on the

More information

Chapter 23 Inferences About Means

Chapter 23 Inferences About Means Chapter 23 Inferences About Means Chapter 23 - Inferences About Means 391 Chapter 23 Solutions to Class Examples 1. See Class Example 1. 2. We want to know if the mean battery lifespan exceeds the 300-minute

More information

1 Another method of estimation: least squares

1 Another method of estimation: least squares 1 Another method of estimation: least squares erm: -estim.tex, Dec8, 009: 6 p.m. (draft - typos/writos likely exist) Corrections, comments, suggestions welcome. 1.1 Least squares in general Assume Y i

More information

Final Exam Practice Problem Answers

Final Exam Practice Problem Answers Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

More information

Row Echelon Form and Reduced Row Echelon Form

Row Echelon Form and Reduced Row Echelon Form These notes closely follow the presentation of the material given in David C Lay s textbook Linear Algebra and its Applications (3rd edition) These notes are intended primarily for in-class presentation

More information

5.1 Identifying the Target Parameter

5.1 Identifying the Target Parameter University of California, Davis Department of Statistics Summer Session II Statistics 13 August 20, 2012 Date of latest update: August 20 Lecture 5: Estimation with Confidence intervals 5.1 Identifying

More information

Standard Deviation Estimator

Standard Deviation Estimator Chapter 905 Standard Deviation Estimator Introduction Even though it is not of primary interest, an estimate of the standard deviation (SD) is needed when calculating the power or sample size of

More information

Pigeonhole Principle Solutions

Pigeonhole Principle Solutions Pigeonhole Principle Solutions 1. Show that if we take n + 1 numbers from the set {1, 2,..., 2n}, then some pair of numbers will have no factors in common. Solution: Note that consecutive numbers (such

More information


4/1/2017. PS. Sequences and Series FROM 9.2 AND 9.3 IN THE BOOK AS WELL AS FROM OTHER SOURCES. TODAY IS NATIONAL MANATEE APPRECIATION DAY PS. Sequences and Series FROM 9.2 AND 9.3 IN THE BOOK AS WELL AS FROM OTHER SOURCES. TODAY IS NATIONAL MANATEE APPRECIATION DAY 1 Oh the things you should learn How to recognize and write arithmetic sequences

More information

Predicting Box Office Success: Do Critical Reviews Really Matter? By: Alec Kennedy Introduction: Information economics looks at the importance of

Predicting Box Office Success: Do Critical Reviews Really Matter? By: Alec Kennedy Introduction: Information economics looks at the importance of Predicting Box Office Success: Do Critical Reviews Really Matter? By: Alec Kennedy Introduction: Information economics looks at the importance of information in economic decisionmaking. Consumers that

More information

Guide to Microsoft Excel for calculations, statistics, and plotting data

Guide to Microsoft Excel for calculations, statistics, and plotting data Page 1/47 Guide to Microsoft Excel for calculations, statistics, and plotting data Topic Page A. Writing equations and text 2 1. Writing equations with mathematical operations 2 2. Writing equations with

More information

Review of basic statistics and the simplest forecasting model: the sample mean

Review of basic statistics and the simplest forecasting model: the sample mean Review of basic statistics and the simplest forecasting model: the sample mean Robert Nau Fuqua School of Business, Duke University August 2014 Most of what you need to remember about basic statistics

More information

Solution to Homework 2

Solution to Homework 2 Solution to Homework 2 Olena Bormashenko September 23, 2011 Section 1.4: 1(a)(b)(i)(k), 4, 5, 14; Section 1.5: 1(a)(b)(c)(d)(e)(n), 2(a)(c), 13, 16, 17, 18, 27 Section 1.4 1. Compute the following, if

More information

Association Between Variables

Association Between Variables Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi

More information

Chapter Four. Data Analyses and Presentation of the Findings

Chapter Four. Data Analyses and Presentation of the Findings Chapter Four Data Analyses and Presentation of the Findings The fourth chapter represents the focal point of the research report. Previous chapters of the report have laid the groundwork for the project.

More information

Common sense, and the model that we have used, suggest that an increase in p means a decrease in demand, but this is not the only possibility.

Common sense, and the model that we have used, suggest that an increase in p means a decrease in demand, but this is not the only possibility. Lecture 6: Income and Substitution E ects c 2009 Je rey A. Miron Outline 1. Introduction 2. The Substitution E ect 3. The Income E ect 4. The Sign of the Substitution E ect 5. The Total Change in Demand

More information