Chapter 11: Linear Regression - Inference in Regression Analysis - Part 2


Note: Whether we calculate confidence intervals or perform hypothesis tests, we need the distribution of the statistic we will use. Below is a quick review of hypothesis testing. Please also see Chapters 5 and 6.

Terminology

statistical hypothesis - a conjecture about a population parameter.
null hypothesis - symbolized by $H_0$. The null hypothesis states that a parameter is equal to a specific value (often zero), or possibly $\le$ or $\ge$ that value.
alternative hypothesis - symbolized by $H_A$. For our purposes, it states that a parameter is not equal to (or possibly $>$ or $<$) a specific value.
statistical test - uses sample data to make a decision about $H_0$.
test statistic (a.k.a. test value) - a value obtained from the sample data (for example, $\bar{x}$ or $s^2$).
level of significance - the maximum probability of committing a Type I error (usually denoted by $\alpha$).
critical value - separates the critical region (the range of values that would indicate a significant difference, and hence the rejection of $H_0$) from the noncritical region.

Possible outcomes of a hypothesis test

                      H_0 is true         H_0 is false
Reject H_0            Type I Error        correct decision
Do Not Reject H_0     correct decision    Type II Error

Example - Hypothesis test for the mean using a t-test.

A researcher is interested in testing the hypothesis that male freshmen gain more than 5 lbs in their first academic year. Twenty-five freshmen participated in the study. Their beginning and ending weights are obtained, and the difference between the ending and beginning weights is computed for each participant. The following statistics have been calculated: $\bar{x} = 7.5$, $s^2 = 56.25$, $n = 25$.

$H_0: \mu \le 5$
$H_A: \mu > 5$

test statistic:

$$ t = \frac{\bar{x} - \mu_0}{\sqrt{s^2/n}} \sim t\text{-dist with d.f.} = n - 1 $$
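The test statistic above can be evaluated directly from the summary statistics. The following is a minimal Python sketch, not part of the original notes: the function name `one_sample_t` is made up for illustration, and the critical value 1.711 is assumed to be read from a standard t table (upper tail, $\alpha = .05$, 24 d.f.) rather than computed.

```python
import math

# Summary statistics from the freshman weight-gain example.
x_bar = 7.5   # sample mean of the weight differences (lbs)
s2 = 56.25    # sample variance of the differences
n = 25        # number of participants
mu_0 = 5.0    # boundary value under H0: mu <= 5


def one_sample_t(x_bar, s2, n, mu_0):
    """Return the one-sample t statistic and its degrees of freedom."""
    standard_error = math.sqrt(s2 / n)
    return (x_bar - mu_0) / standard_error, n - 1


t_stat, df = one_sample_t(x_bar, s2, n, mu_0)

# Assumption: tabulated upper-tail critical value for alpha = .05, 24 d.f.
T_CRIT_05_DF24 = 1.711

# One-sided test: reject H0 only when t exceeds the critical value.
reject_h0 = t_stat > T_CRIT_05_DF24
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

Because the alternative is one-sided ($\mu > 5$), only the upper tail is used for the comparison.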

For this example,

$$ t = \frac{7.5 - 5}{\sqrt{56.25/25}} = \frac{2.5}{1.5} = 1.667. $$

The critical value for $\alpha = .05$ with 24 degrees of freedom is 1.711. Since $t <$ c.v., we cannot reject $H_0$.

Conclusion: The evidence does not suggest a mean weight gain of more than 5 lbs.

[Fig. 1]

Inference about $\beta_1$

The distribution of the estimate of $\beta_1$ is given by

$$ \hat\beta_1 \sim N\left(\beta_1, \frac{\sigma^2}{\sum_i (x_i - \bar{x})^2}\right). $$

We will use an estimate of the variance of $\hat\beta_1$. An estimate of $\sigma^2$ is given by the mean square error (MSE). Thus,

$$ s^2(\hat\beta_1) = \frac{MSE}{\sum_i (x_i - \bar{x})^2}. $$

Using the distribution of our test statistic,

$$ \frac{\hat\beta_1 - \beta_1}{\sqrt{s^2(\hat\beta_1)}} \sim t_{(n-2)}, $$

it follows that a $(1 - \alpha)100\%$ confidence interval for $\beta_1$ is given by

$$ \hat\beta_1 \pm t_{(1 - \alpha/2;\ d.f. = n-2)} \sqrt{s^2(\hat\beta_1)}, $$

where $s^2(\hat\beta_1)$ is as defined above.

Example - We will use the SAS program reg example1.sas. The data represent 20 college entrance exam scores (our predictor variable) and the respective GPA (response) at the end of the freshman year. From the output we note the following:

$\hat\beta_1 \approx .84$ (the midpoint of the interval reported below) and $\sqrt{s^2(\hat\beta_1)} = .14405$, so $s^2(\hat\beta_1) = (.14405)^2 \approx .0208$.

We can construct a 95% confidence interval for $\beta_1$. For $\alpha = .05$, the correct percentiles come from a t-dist with $20 - 2 = 18$ degrees of freedom. Because of the symmetry of the t-dist, the 2.5th and 97.5th percentiles are given by $-2.101$ and $2.101$, respectively. Our 95% confidence interval is calculated as

$$ \hat\beta_1 \pm (2.101)(.14405), $$

which yields an interval of approximately (.54, 1.14).

Interpretation: With 95% confidence, we estimate the mean increase in GPA to be between .54 and 1.14 (per unit increase in the entrance exam score).

Does the 95% confidence interval for $\beta_1$ include 0? What does this mean?

Testing for a linear relationship:

The statistical hypothesis to test for a linear relationship (in the simple linear regression case) is given by

$H_0: \beta_1 = 0$ vs. $H_A: \beta_1 \ne 0$

The test statistic is given by

$$ t = \frac{\hat\beta_1}{\sqrt{s^2(\hat\beta_1)}} \sim t_{(n-2)}. $$

Recall from the SAS program reg example1.sas that $\hat\beta_1 \approx .84$ and $\sqrt{s^2(\hat\beta_1)} = .14405$. Our test statistic is therefore

$$ t \approx \frac{.84}{.14405} \approx 5.8. $$

Our critical value for the test (based on $\alpha = .05$), from a t-dist with 18 degrees of freedom, equals 2.101.

[Fig. 2]

Conclusion: Reject $H_0$. There appears to be a linear relationship between entrance exam scores and grade point average.

Inference about $\beta_0$

If the scope of the model includes $x = 0$, we may want to make inference about $\beta_0$. We will use the following to construct a confidence interval for $\beta_0$:

$$ \hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}. $$

An estimate of the variance of $\hat\beta_0$ is given by

$$ s^2(\hat\beta_0) = MSE\left(\frac{1}{n} + \frac{\bar{x}^2}{\sum_i (x_i - \bar{x})^2}\right), \quad \text{where } MSE = \frac{SSE}{n - 2}. $$

Recall that SSE is given by

$$ SSE = \sum_{i=1}^{n} e_i^2 = \sum_i (Y_i - \hat{Y}_i)^2. $$

Using the above we have the following:

$$ \frac{\hat\beta_0 - \beta_0}{\sqrt{s^2(\hat\beta_0)}} \sim t_{(n-2)}. $$

A $(1 - \alpha)100$ percent confidence interval for $\beta_0$ is given by

$$ \hat\beta_0 \pm t_{(1 - \alpha/2;\ df = n-2)} \sqrt{s^2(\hat\beta_0)}. $$
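The formulas for $\hat\beta_0$, $\hat\beta_1$, their estimated variances, and the resulting intervals can be checked numerically. The sketch below is an illustration only: the five data points are made up (they are not the SAS example data), the helper names `fit_simple_regression` and `t_interval` are hypothetical, and the percentile 3.182 is assumed to come from a t table for 3 degrees of freedom.

```python
import math


def fit_simple_regression(x, y):
    """Least-squares fit of y = b0 + b1*x plus the variance estimates above."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)
    return {
        "b0": b0,
        "b1": b1,
        "mse": mse,
        "s2_b1": mse / sxx,                         # s^2(beta1-hat)
        "s2_b0": mse * (1 / n + x_bar ** 2 / sxx),  # s^2(beta0-hat)
    }


def t_interval(estimate, s2, t_crit):
    """Return estimate +/- t_crit * sqrt(s2) as a (lower, upper) pair."""
    half_width = t_crit * math.sqrt(s2)
    return estimate - half_width, estimate + half_width


# Hypothetical data, chosen only to make the arithmetic easy to follow.
x = [1, 2, 3, 4, 5]
y = [2.2, 4.1, 6.2, 7.9, 10.1]
fit = fit_simple_regression(x, y)

# Assumption: 97.5th percentile of the t-dist with n - 2 = 3 d.f. (from a table).
T_CRIT = 3.182

ci_b1 = t_interval(fit["b1"], fit["s2_b1"], T_CRIT)
ci_b0 = t_interval(fit["b0"], fit["s2_b0"], T_CRIT)

# Two-sided test of H0: beta1 = 0.
t_stat = fit["b1"] / math.sqrt(fit["s2_b1"])
reject_h0 = abs(t_stat) > T_CRIT

print(f"b1 = {fit['b1']:.3f}, 95% CI = ({ci_b1[0]:.3f}, {ci_b1[1]:.3f})")
print(f"b0 = {fit['b0']:.3f}, 95% CI = ({ci_b0[0]:.3f}, {ci_b0[1]:.3f})")
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```

In this toy data set the interval for the slope excludes 0 while the interval for the intercept includes it, which matches the two-sided test decisions at the same $\alpha$.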

Notes on inference regarding $\beta_0$ and $\beta_1$

The sampling distributions of $\hat\beta_0$ and $\hat\beta_1$ will still be approximately normal for minor departures from normality.
If departures from normality are serious, large sample sizes will still provide estimates of $\beta_0$ and $\beta_1$ that are asymptotically normal.
Inference about $\beta_0$ and $\beta_1$ is made under the assumption of repeated sampling from the same scope of X.

Interval Estimation of $E[Y_h]$

Let $x_h$ be a value in the sample or within the scope of the model. The estimated mean response is

$$ \hat{Y}_h = \hat\beta_0 + \hat\beta_1 x_h. $$

An estimate of the variance of $\hat{Y}_h$ is given by

$$ s^2(\hat{Y}_h) = MSE\left(\frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}\right), \quad \text{where } MSE = \frac{SSE}{n - 2}. $$

Using the above we have the following:

$$ \frac{\hat{Y}_h - E[Y_h]}{\sqrt{s^2(\hat{Y}_h)}} \sim t_{(n-2)}. $$

A $(1 - \alpha)100$ percent confidence interval for $E[Y_h]$ is given by

$$ \hat{Y}_h \pm t_{(1 - \alpha/2;\ df = n-2)} \sqrt{s^2(\hat{Y}_h)}. $$

Comments:

Recall that $\hat{Y}_i = \hat\beta_0 + \hat\beta_1 x_i$ passes through $(\bar{x}, \bar{y})$.
The variance of $\hat{Y}_h$ increases as the distance between $x_h$ and $\bar{x}$ increases.
The confidence interval formula for $E[Y_h]$ applies to a single mean response. We will use an alternate formula when we want simultaneous confidence intervals for several mean responses.

Prediction of a New Observation for a given level of X

Recall that $E[Y_h]$ represented the mean of Y at a given level of X, and our interest was in estimating the mean of Y.

$Y_{h(new)}$ represents a prediction of a draw from the distribution of Y for a specific $X_h$. Why can we not just use the confidence interval for $E[Y_h]$? We need to account for two sources of variation.

[Fig. 3]

1. Y is a random variable - we must account for variation in the location of the distribution of Y. (Think in terms of estimating the model parameters - the estimated mean $\hat{Y}_h$ varies from sample to sample.)
2. Once the location of the distribution of Y is fixed, we need to account for the variation within the distribution of Y. (Think in terms of the model - the only source of variation is the model error variance.)

Let $s^2(pred)$ denote an estimate of the variance of predicting $Y_{h(new)}$. We can calculate $s^2(pred)$ using the following:

$$ s^2(pred) = MSE\left(1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}\right). $$

A $(1 - \alpha)100\%$ prediction interval for a future value of Y at $X_h$ is given by

$$ \hat{Y}_{h(new)} \pm t_{(1 - \alpha/2;\ df = n-2)} \sqrt{s^2(pred)}, $$

where the point prediction is $\hat{Y}_{h(new)} = \hat\beta_0 + \hat\beta_1 X_h$.

Note the two sources of variation accounted for in $s^2(pred)$:

$$ s^2(pred) = \underbrace{MSE}_{\text{w/i dist}} + \underbrace{s^2(\hat{Y}_h)}_{\text{location}} = MSE + MSE\left(\frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}\right). $$
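To see how the extra MSE term widens the interval, the confidence interval for the mean response and the prediction interval for a new observation can be computed side by side. This is a sketch under stated assumptions: the data are made up, the function name `intervals_at` is hypothetical, and the percentile 3.182 (97.5th percentile, 3 d.f.) is taken from a t table.

```python
import math


def intervals_at(x, y, x_h, t_crit):
    """Mean-response CI and new-observation prediction interval at x_h."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    mse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)

    y_hat = b0 + b1 * x_h
    s2_mean = mse * (1 / n + (x_h - x_bar) ** 2 / sxx)  # s^2(Y-hat_h)
    s2_pred = mse + s2_mean                             # s^2(pred)

    def limits(s2):
        half_width = t_crit * math.sqrt(s2)
        return y_hat - half_width, y_hat + half_width

    return {
        "y_hat": y_hat,
        "s2_mean": s2_mean,
        "s2_pred": s2_pred,
        "ci_mean": limits(s2_mean),
        "pi_new": limits(s2_pred),
    }


# Hypothetical data; T_CRIT is a tabulated value for n - 2 = 3 d.f.
x = [1, 2, 3, 4, 5]
y = [2.2, 4.1, 6.2, 7.9, 10.1]
T_CRIT = 3.182

res = intervals_at(x, y, x_h=4, t_crit=T_CRIT)
print(f"Y-hat at x_h = 4: {res['y_hat']:.2f}")
print(f"CI for E[Y_h]:    ({res['ci_mean'][0]:.3f}, {res['ci_mean'][1]:.3f})")
print(f"PI for Y_h(new):  ({res['pi_new'][0]:.3f}, {res['pi_new'][1]:.3f})")
```

Both intervals are centered at the same fitted value; the prediction interval is always the wider of the two because $s^2(pred)$ exceeds $s^2(\hat{Y}_h)$ by exactly MSE.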

Confidence Band for the Regression Line

A confidence band encompasses the entire regression line, whereas the confidence limits for $E[Y_h]$ apply to a single value $X_h$. The formula is similar to that of a confidence interval for $E[Y_h]$, except that the multiplier is W rather than a percentile from a t-dist. Let

$$ W^2 = 2F_{(1 - \alpha;\ 2,\ n-2)}. $$

The boundary limits at $X_h$ for a given level of $\alpha$ are given by

$$ \hat{Y}_h \pm W \sqrt{s^2(\hat{Y}_h)}. $$

Using W rather than the percentile from a t-distribution gives a wider interval (we must encompass the entire regression line rather than a single point).

Analysis of Variance (ANOVA) approach to regression

We are only looking at ANOVA for a different perspective. This will make more sense in multiple regression.

Breakdown of the Sums of Squares:

$$ \underbrace{Y_i - \bar{Y}}_{\text{total deviation}} = \underbrace{(\hat{Y}_i - \bar{Y})}_{\text{deviation of fit vs. mean}} + \underbrace{(Y_i - \hat{Y}_i)}_{\text{deviation of obs. vs. fit}} $$

After some algebra we have

$$ \sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 $$

Total Sums of Squares (SST) = SS Reg (SSR) + SSE

The breakdown of degrees of freedom is as follows:

                   SST    =  SSR    +  SSE
General case       n - 1  =  p - 1  +  n - p
Simple Lin Reg     n - 1  =  1      +  n - 2
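The "some algebra" behind the partition of the sums of squares can be sketched as follows. The only facts assumed beyond the text are the two least-squares normal equations, $\sum_i e_i = 0$ and $\sum_i x_i e_i = 0$, where $e_i = Y_i - \hat{Y}_i$.

```latex
\begin{align*}
\sum_{i=1}^{n} (Y_i - \bar{Y})^2
  &= \sum_{i=1}^{n} \bigl[(\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i)\bigr]^2 \\
  &= \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2
   + \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2
   + 2 \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})\, e_i , \\[4pt]
\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})\, e_i
  &= \sum_{i=1}^{n} (\hat\beta_0 + \hat\beta_1 x_i - \bar{Y})\, e_i \\
  &= (\hat\beta_0 - \bar{Y}) \sum_{i=1}^{n} e_i
   + \hat\beta_1 \sum_{i=1}^{n} x_i e_i = 0 .
\end{align*}
```

Since the cross-product term vanishes, only the two squared sums remain, which gives SST = SSR + SSE.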

ANOVA Table

Source        Sums of Squares   df      Mean Square (MS)   F
Regression    SSR               p - 1   SSR/(p - 1)        MSR/MSE
Error         SSE               n - p   SSE/(n - p)
Total         SST               n - 1

Comments:

For the simple linear regression model we can use the F-statistic to test the following:

$H_0: \beta_1 = 0$
$H_A: \beta_1 \ne 0$

Under $H_0$, the test statistic has the following distribution:

$$ t.s. = \frac{MSR}{MSE} \sim F_{(df_1 = 1;\ df_2 = n-2)} $$

See Table 8 of the text for tables of the F-distribution.

We will reject $H_0$ if the following condition holds:

$$ t.s. > F_{(1 - \alpha;\ df_1 = 1;\ df_2 = n-2)} $$

Descriptive Measures of X and Y in the Regression Model

Coefficient of Determination, denoted by $r^2$

$$ r^2 = \frac{SSR}{SST} $$

$r^2$ measures the proportion of variation explained by the model (recall what SSR and SST represent).
The range of $r^2$: $0 \le r^2 \le 1$.
The closer $r^2$ is to 1, the stronger the linear relationship between X and Y.

Coefficient of Correlation (a.k.a. correlation coefficient)

$$ r = \pm\sqrt{r^2} $$

Use a + if the sign of $\hat\beta_1$ is +.
Use a - if the sign of $\hat\beta_1$ is -.
The range of r: $-1 \le r \le 1$.
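The ANOVA quantities and the descriptive measures above can be assembled in a few lines. The sketch below is illustrative only: the data are made up, the function name `anova_simple_regression` is hypothetical, and the critical value 10.13 is assumed to be the tabulated 95th percentile of the F-distribution with (1, 3) degrees of freedom.

```python
import math


def anova_simple_regression(x, y):
    """Sums of squares, F statistic, r^2 and r for simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]

    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

    msr = ssr / 1        # regression d.f. = p - 1 = 1
    mse = sse / (n - 2)  # error d.f. = n - p = n - 2
    r2 = ssr / sst
    return {
        "sst": sst, "ssr": ssr, "sse": sse,
        "f_stat": msr / mse,
        "r2": r2,
        "r": math.copysign(math.sqrt(r2), b1),  # r takes the sign of b1
    }


# Hypothetical data with n = 5, so the error d.f. is 3.
x = [1, 2, 3, 4, 5]
y = [2.2, 4.1, 6.2, 7.9, 10.1]
table = anova_simple_regression(x, y)

# Assumption: tabulated F(.95; 1, 3) critical value.
F_CRIT = 10.13
reject_h0 = table["f_stat"] > F_CRIT

print(f"SST = {table['sst']:.3f} = SSR {table['ssr']:.3f} + SSE {table['sse']:.3f}")
print(f"F = {table['f_stat']:.1f}, reject H0: {reject_h0}")
print(f"r^2 = {table['r2']:.4f}, r = {table['r']:.4f}")
```

In simple linear regression the F statistic equals the square of the t statistic for $\hat\beta_1$, so the F-test and the two-sided t-test of $H_0: \beta_1 = 0$ lead to the same decision.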

r is just a measure of association with no clear-cut interpretation. One can use it to make relative comparisons.

Example - Suppose we have corr(Y, $X_1$) = .7 and corr(Y, $X_2$) = .2. Thus, it seems that $X_1$ is more strongly correlated with Y than $X_2$ is.

Limitations of $r^2$ and r

A high r does not imply the model is useful for predictions. Why? Due to variability, the confidence intervals could be too wide to be useful.
A high r does not imply the estimated regression line has a good fit. One may calculate a high r value even though a curve explains the relationship between X and Y.
[Figure: scatterplot with a high r in which a curve describes the relationship between X and Y]
r = 0 does not imply X and Y are not related. One may have r = 0 even though X and Y are quadratically related.
[Figure: scatterplot with r = 0 in which X and Y are quadratically related]

Notes on applying regression analysis:

Rejecting $H_0: \beta_1 = 0$ does not imply a cause-and-effect relationship.
Predictions using the estimated regression line are only valid within the scope of the data used in estimation.
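The two limitations of r illustrated by the figures can also be reproduced numerically. The sketch below uses made-up data in which Y is exactly $X^2$; the helper `pearson_r` is a hypothetical name for the usual sample correlation formula.

```python
import math


def pearson_r(x, y):
    """Sample correlation coefficient between x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Case 1: X symmetric about 0 and Y = X^2 exactly, yet r is 0.
x_sym = [-2, -1, 0, 1, 2]
y_sym = [xi ** 2 for xi in x_sym]
r_zero = pearson_r(x_sym, y_sym)

# Case 2: X positive and Y = X^2 exactly; r is high although the
# true relationship is a curve, not a straight line.
x_pos = list(range(1, 11))
y_pos = [xi ** 2 for xi in x_pos]
r_high = pearson_r(x_pos, y_pos)

print(f"symmetric X: r = {r_zero:.3f}")
print(f"positive X:  r = {r_high:.3f}")
```

In both cases Y is a deterministic function of X, so r alone says little about whether a straight line is the right model; a scatterplot or residual plot is needed to see the curvature.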

Statistics 112 Regression Cheatsheet Section 1B - Ryan Rosario

Statistics 112 Regression Cheatsheet Section 1B - Ryan Rosario Statistics 112 Regression Cheatsheet Section 1B - Ryan Rosario I have found that the best way to practice regression is by brute force That is, given nothing but a dataset and your mind, compute everything

More information

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

More information

Lecture 5 Hypothesis Testing in Multiple Linear Regression

Lecture 5 Hypothesis Testing in Multiple Linear Regression Lecture 5 Hypothesis Testing in Multiple Linear Regression BIOST 515 January 20, 2004 Types of tests 1 Overall test Test for addition of a single variable Test for addition of a group of variables Overall

More information

e = random error, assumed to be normally distributed with mean 0 and standard deviation σ

e = random error, assumed to be normally distributed with mean 0 and standard deviation σ 1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.

More information

Chapter 9. Section Correlation

Chapter 9. Section Correlation Chapter 9 Section 9.1 - Correlation Objectives: Introduce linear correlation, independent and dependent variables, and the types of correlation Find a correlation coefficient Test a population correlation

More information

One-Way Analysis of Variance (ANOVA) Example Problem

One-Way Analysis of Variance (ANOVA) Example Problem One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means

More information

Regression Analysis: A Complete Example

Regression Analysis: A Complete Example Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

More information

Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

More information

1 Simple Linear Regression I Least Squares Estimation

1 Simple Linear Regression I Least Squares Estimation Simple Linear Regression I Least Squares Estimation Textbook Sections: 8. 8.3 Previously, we have worked with a random variable x that comes from a population that is normally distributed with mean µ and

More information

Fixed vs. Random Effects

Fixed vs. Random Effects Statistics 203: Introduction to Regression and Analysis of Variance Fixed vs. Random Effects Jonathan Taylor - p. 1/19 Today s class Implications for Random effects. One-way random effects ANOVA. Two-way

More information

SIMPLE REGRESSION ANALYSIS

SIMPLE REGRESSION ANALYSIS SIMPLE REGRESSION ANALYSIS Introduction. Regression analysis is used when two or more variables are thought to be systematically connected by a linear relationship. In simple regression, we have only two

More information

Inferential Statistics

Inferential Statistics Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

More information

Regression. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.

Regression. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question. Class: Date: Regression Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Given the least squares regression line y8 = 5 2x: a. the relationship between

More information

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

More information

Regression, least squares

Regression, least squares Regression, least squares Joe Felsenstein Department of Genome Sciences and Department of Biology Regression, least squares p.1/24 Fitting a straight line X Two distinct cases: The X values are chosen

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

t-tests and F-tests in regression

t-tests and F-tests in regression t-tests and F-tests in regression Johan A. Elkink University College Dublin 5 April 2012 Johan A. Elkink (UCD) t and F-tests 5 April 2012 1 / 25 Outline 1 Simple linear regression Model Variance and R

More information

Multiple Hypothesis Testing: The F-test

Multiple Hypothesis Testing: The F-test Multiple Hypothesis Testing: The F-test Matt Blackwell December 3, 2008 1 A bit of review When moving into the matrix version of linear regression, it is easy to lose sight of the big picture and get lost

More information

Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 13 Introduction to Linear Regression and Correlation Analysis Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

More information

STA 4163 Lecture 10: Practice Problems

STA 4163 Lecture 10: Practice Problems STA 463 Lecture 0: Practice Problems Problem.0: A study was conducted to determine whether a student's final grade in STA406 is linearly related to his or her performance on the MATH ability test before

More information

12: Analysis of Variance. Introduction

12: Analysis of Variance. Introduction 1: Analysis of Variance Introduction EDA Hypothesis Test Introduction In Chapter 8 and again in Chapter 11 we compared means from two independent groups. In this chapter we extend the procedure to consider

More information

Statistical Inference

Statistical Inference Statistical Inference Idea: Estimate parameters of the population distribution using data. How: Use the sampling distribution of sample statistics and methods based on what would happen if we used this

More information

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups In analysis of variance, the main research question is whether the sample means are from different populations. The

More information

Outline. Correlation & Regression, III. Review. Relationship between r and regression

Outline. Correlation & Regression, III. Review. Relationship between r and regression Outline Correlation & Regression, III 9.07 4/6/004 Relationship between correlation and regression, along with notes on the correlation coefficient Effect size, and the meaning of r Other kinds of correlation

More information

Hypothesis Testing Level I Quantitative Methods. IFT Notes for the CFA exam

Hypothesis Testing Level I Quantitative Methods. IFT Notes for the CFA exam Hypothesis Testing 2014 Level I Quantitative Methods IFT Notes for the CFA exam Contents 1. Introduction... 3 2. Hypothesis Testing... 3 3. Hypothesis Tests Concerning the Mean... 10 4. Hypothesis Tests

More information

Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA. Analysis Of Variance Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability

More information

A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING CHAPTER 5. A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING 5.1 Concepts When a number of animals or plots are exposed to a certain treatment, we usually estimate the effect of the treatment

More information

Inferences About Differences Between Means Edpsy 580

Inferences About Differences Between Means Edpsy 580 Inferences About Differences Between Means Edpsy 580 Carolyn J. Anderson Department of Educational Psychology University of Illinois at Urbana-Champaign Inferences About Differences Between Means Slide

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information

Power and Sample Size Determination

Power and Sample Size Determination Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 Power 1 / 31 Experimental Design To this point in the semester,

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is

More information

Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science j.nerbonne@rug.nl

Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science j.nerbonne@rug.nl Dept of Information Science j.nerbonne@rug.nl October 1, 2010 Course outline 1 One-way ANOVA. 2 Factorial ANOVA. 3 Repeated measures ANOVA. 4 Correlation and regression. 5 Multiple regression. 6 Logistic

More information

2. Simple Linear Regression

2. Simple Linear Regression Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

More information

Final Exam Practice Problem Answers

Final Exam Practice Problem Answers Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

More information

Using R for Linear Regression

Using R for Linear Regression Using R for Linear Regression In the following handout words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

Instrumental Variables & 2SLS

Instrumental Variables & 2SLS Instrumental Variables & 2SLS y 1 = β 0 + β 1 y 2 + β 2 z 1 +... β k z k + u y 2 = π 0 + π 1 z k+1 + π 2 z 1 +... π k z k + v Economics 20 - Prof. Schuetze 1 Why Use Instrumental Variables? Instrumental

More information

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

More information

Introduction to Stata

Introduction to Stata Introduction to Stata September 23, 2014 Stata is one of a few statistical analysis programs that social scientists use. Stata is in the mid-range of how easy it is to use. Other options include SPSS,

More information

Estimation of σ 2, the variance of ɛ

Estimation of σ 2, the variance of ɛ Estimation of σ 2, the variance of ɛ The variance of the errors σ 2 indicates how much observations deviate from the fitted surface. If σ 2 is small, parameters β 0, β 1,..., β k will be reliably estimated

More information

Notes on Applied Linear Regression

Notes on Applied Linear Regression Notes on Applied Linear Regression Jamie DeCoster Department of Social Psychology Free University Amsterdam Van der Boechorststraat 1 1081 BT Amsterdam The Netherlands phone: +31 (0)20 444-8935 email:

More information

Study Guide for the Final Exam

Study Guide for the Final Exam Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

More information

CHAPTER 13. Experimental Design and Analysis of Variance

CHAPTER 13. Experimental Design and Analysis of Variance CHAPTER 13 Experimental Design and Analysis of Variance CONTENTS STATISTICS IN PRACTICE: BURKE MARKETING SERVICES, INC. 13.1 AN INTRODUCTION TO EXPERIMENTAL DESIGN AND ANALYSIS OF VARIANCE Data Collection

More information

Module 5: Multiple Regression Analysis

Module 5: Multiple Regression Analysis Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

More information

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

More information

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics. Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

More information

Hypothesis Testing. Bluman Chapter 8

Hypothesis Testing. Bluman Chapter 8 CHAPTER 8 Learning Objectives C H A P T E R E I G H T Hypothesis Testing 1 Outline 8-1 Steps in Traditional Method 8-2 z Test for a Mean 8-3 t Test for a Mean 8-4 z Test for a Proportion 8-5 2 Test for

More information

One-Way Analysis of Variance

One-Way Analysis of Variance One-Way Analysis of Variance Note: Much of the math here is tedious but straightforward. We ll skim over it in class but you should be sure to ask questions if you don t understand it. I. Overview A. We

More information

AP Statistics 2001 Solutions and Scoring Guidelines

AP Statistics 2001 Solutions and Scoring Guidelines AP Statistics 2001 Solutions and Scoring Guidelines The materials included in these files are intended for non-commercial use by AP teachers for course and exam preparation; permission for any other use

More information

August 2012 EXAMINATIONS Solution Part I

August 2012 EXAMINATIONS Solution Part I August 01 EXAMINATIONS Solution Part I (1) In a random sample of 600 eligible voters, the probability that less than 38% will be in favour of this policy is closest to (B) () In a large random sample,

More information

International Statistical Institute, 56th Session, 2007: Phil Everson

International Statistical Institute, 56th Session, 2007: Phil Everson Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: peverso1@swarthmore.edu 1. Introduction

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

More information

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of

More information

Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

More information

2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or

2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or Simple and Multiple Regression Analysis Example: Explore the relationships among Month, Adv.$ and Sales $: 1. Prepare a scatter plot of these data. The scatter plots for Adv.$ versus Sales, and Month versus

More information

Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p.

Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p. Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1 Table of Contents 1. Introduction p. 2 2. Statistical Methods Used p. 5 3. 10 and under Males p. 8 4. 11 and up Males p. 10 5. 10 and under

More information

Coefficient of Determination

Coefficient of Determination Coefficient of Determination The coefficient of determination R 2 (or sometimes r 2 ) is another measure of how well the least squares equation ŷ = b 0 + b 1 x performs as a predictor of y. R 2 is computed

More information

Statistical Significance and Bivariate Tests

Statistical Significance and Bivariate Tests Statistical Significance and Bivariate Tests BUS 735: Business Decision Making and Research 1 1.1 Goals Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions,

More information

Univariate Regression

Univariate Regression Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

More information
