Regression, least squares


1 Regression, least squares
Joe Felsenstein
Department of Genome Sciences and Department of Biology
Regression, least squares p.1/24

2 Fitting a straight line
Two distinct cases:
The X values are chosen arbitrarily by you, and then Y values are measured for each.
The (X, Y) pairs have a joint distribution and are sampled by you.
These are different. We will assume the former.

3 Least squares
Find the intercept (a) and slope (b) by minimizing the sum of squares of departures of the points from the line:

Q = \sum_{i=1}^{n} \left( Y_i - (a + b X_i) \right)^2

7 The least squares solution
The line passes through the point given by the means of both variables: (\bar{X}, \bar{Y}). Its slope is

b_{Y.X} = \frac{\sum_i (Y_i - \bar{Y})(X_i - \bar{X})}{\sum_i (X_i - \bar{X})^2}
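The slope and intercept formulas above can be checked numerically. A minimal pure-Python sketch (the slides themselves use R; the data and names here are made up for illustration):

```python
def least_squares(x, y):
    """Return (a, b): intercept and slope minimizing sum((y - (a + b*x))**2)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    # b = sum((Y_i - Ybar)(X_i - Xbar)) / sum((X_i - Xbar)^2)
    b = sum((yi - ybar) * (xi - xbar) for xi, yi in zip(x, y)) / \
        sum((xi - xbar) ** 2 for xi in x)
    a = ybar - b * xbar  # so the line passes through (Xbar, Ybar)
    return a, b

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]
a, b = least_squares(x, y)
```

Note that a is obtained from b precisely because the fitted line must pass through the point of means.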

8 This assumes:
The X values are known precisely, without error.
Each Y is drawn from a normal distribution whose mean is on the true line.
The standard deviations of these normal distributions are all equal.
The residuals of the points are independent.
How could we check these in R?

9 What if variances around the line are unequal?
When the standard deviations around the line are not equal, but are a known function of X (up to a constant, anyway), such as \sigma_i \propto X_i^{1/2}, we can cope with that by giving the points unequal weights. The quantities we square are the residuals expressed as proportions of the (local) standard deviation:

Q = \sum_i \left( \frac{Y_i - (a + b X_i)}{\sigma_i} \right)^2 = \sum_i \frac{1}{\sigma_i^2} \left( Y_i - (a + b X_i) \right)^2

so that the natural weight is the reciprocal of the local variance. This leads to formulas for the slope that weight each term.
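With weights w_i = 1/\sigma_i^2, the weighted solution has the same shape as the unweighted one, with weighted sums and weighted means. A hedged pure-Python illustration (data and names are mine, not from the slides):

```python
def weighted_least_squares(x, y, sigma):
    """Fit a + b*x minimizing sum(w_i * (y_i - (a + b*x_i))^2), w_i = 1/sigma_i^2."""
    w = [1.0 / s ** 2 for s in sigma]  # weight = reciprocal of local variance
    sw = sum(w)
    xw = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted means
    yw = sum(wi * yi for wi, yi in zip(w, y)) / sw
    b = sum(wi * (xi - xw) * (yi - yw) for wi, xi, yi in zip(w, x, y)) / \
        sum(wi * (xi - xw) ** 2 for wi, xi in zip(w, x))
    a = yw - b * xw
    return a, b

# points lying exactly on y = 1 + 2x, with unequal local standard deviations
a, b = weighted_least_squares([1, 2, 3, 4], [3, 5, 7, 9], sigma=[1, 2, 1, 2])
```

When all sigmas are equal, this reduces to the ordinary unweighted formulas.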

10 Estimating the standard deviation around the line
The variance \sigma^2 is estimated simply by s^2, the mean square of the deviations from the estimated (unweighted) regression line. But the number of degrees of freedom in the denominator should be n - 2, as both a and b are being estimated from these data:

s^2 = \frac{\sum_i \left( Y_i - (a + b X_i) \right)^2}{n - 2}
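As a small arithmetic sketch of this estimator (the data, and the fitted a and b, are made up for illustration; in practice a and b come from the least-squares fit):

```python
def residual_variance(x, y, a, b):
    """s^2 = RSS / (n - 2) for the fitted line a + b*x."""
    n = len(x)
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return rss / (n - 2)  # n - 2 degrees of freedom: both a and b were estimated

# four points with residuals 0, 1, 0, 0 about the line y = x: RSS = 1, s^2 = 0.5
s2 = residual_variance([0, 1, 2, 3], [0, 2, 2, 3], a=0.0, b=1.0)
```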

11 How to do it in R
You can fit a least-squares regression using the function mm <- lsfit(X, Y), where X and Y are vectors of the same length, so the points we fit are (X[1], Y[1]), (X[2], Y[2]), ... The coefficients of the fit are then given by mm$coef, the residuals by mm$residuals, and to print out the tests for zero slope just do ls.print(mm) and you will get something like this:

Residual Standard Error =    R-Square =    F-statistic (df = 1, 15) =    p-value =
            Estimate  Std.Err  t-value  Pr(>|t|)
Intercept
X

12 Plotting residuals can be important
[figure: data plotted against X]


14 Plotting residuals can be important
[figure: residuals of Y plotted against X]
In this case, there is some sign that the variances around the line are not the same for all values of X.

15 Plotting residuals can be important, part 2
[figure: residuals of Y plotted against X]
In this case, the relationship may not be a straight line.

16 Plotting residuals can be important, part 3
[figure: residuals of Y plotted against X]
In this case, there is some indication that the residuals are correlated.

17 Putting confidence limits on the slope
We often want confidence limits on the slope, and to do tests of hypotheses such as that the slope is zero. The departure of the estimate b_{Y.X} from its true value \beta can be shown to be distributed as

b_{Y.X} - \beta = \frac{t_{n-2}\, s}{\sqrt{\sum_i (X_i - \bar{X})^2}}

where t_{n-2} is drawn from a t-distribution with n - 2 degrees of freedom. By plugging in the upper and lower (say) 2.5% points of the t-distribution we can get the confidence limits from this, and test hypotheses such as that \beta = 0. Note that the denominator of the right-hand side makes the sensible point that choosing X's that are far apart helps.
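The standard error of the slope implied by this distribution is s / \sqrt{\sum_i (X_i - \bar{X})^2}. A pure-Python sketch on made-up data (in R the t point would come from qt(); the value below is taken from t tables):

```python
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.9, 4.1, 5.9, 8.1, 9.9]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b = sum((yi - ybar) * (xi - xbar) for xi, yi in zip(x, y)) / sxx
a = ybar - b * xbar
s2 = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
se_b = math.sqrt(s2 / sxx)  # standard error of the slope
t_975 = 3.182  # upper 2.5% point of t with n - 2 = 3 d.f., from tables
ci = (b - t_975 * se_b, b + t_975 * se_b)  # 95% confidence interval for beta
```

Spreading the X's further apart increases sxx and so shrinks se_b, as the slide notes.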

18 Confidence interval on the line, new points
You can also use the same machinery (in ways not shown here) to put confidence limits on where the line is at any given X (dark curve below) and confidence limits on what a new data point will be at that point (wider, lighter curves). These are hyperbolas. Note the increasing uncertainty as you extrapolate.
[figure: regression line with confidence bands plotted against X]

19 Regression through the origin
If the regression line must pass through (0, 0), this just means that we replace \bar{X} and \bar{Y} by zero. The result is that the slope is simply

b_{Y.X} = \frac{\sum_i X_i Y_i}{\sum_i X_i^2}
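This case is a one-liner. A pure-Python check on made-up data lying exactly on y = 2x:

```python
def slope_through_origin(x, y):
    """Slope of the least-squares line forced through (0, 0)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

b = slope_through_origin([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```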

20 Multiple regression
One can do linear regression with more than one X variable:

Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon

[figure: regression plane over the (X_1, X_2) plane, with intercept \beta_0 at (0, 0) and slopes \beta_1 and \beta_2 along X_1 and X_2]

21 Least squares fit for multiple regression
The fit for multiple regression is a least squares fit, to minimize the sum of squared errors of prediction of Y:

Q = \sum_i \left( Y_i - (\beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i}) \right)^2

There are weighted versions of this too, when the error variance is different for different values of X_1 and X_2. When we solve for the \beta_i by differentiating Q and equating the derivatives to zero, we get...

22 The equations for multiple regression
You can do regression with several different X variables. This is most easily done with matrices (sorry):

\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix} =
\begin{pmatrix}
1 & X_{11} & X_{12} & X_{13} \\
1 & X_{21} & X_{22} & X_{23} \\
\vdots & \vdots & \vdots & \vdots \\
1 & X_{n1} & X_{n2} & X_{n3}
\end{pmatrix}
\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix} + \text{errors}

or, in matrix notation,

Y = X\beta + \varepsilon

which can be shown to lead to the estimate of the regression coefficients

\hat{\beta} = \left( X^t X \right)^{-1} \left( X^t Y \right)
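The closed form (X^t X)^{-1} (X^t Y) can be computed by solving the normal equations (X^t X)\beta = X^t Y directly. A pure-Python sketch with Gaussian elimination, on made-up data (a real analysis would use R's lm() or a linear-algebra library):

```python
def normal_equations(X, Y):
    """Solve (X^t X) beta = (X^t Y) by Gaussian elimination with partial pivoting."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    XtY = [sum(row[i] * yi for row, yi in zip(X, Y)) for i in range(k)]
    A = [XtX[i][:] + [XtY[i]] for i in range(k)]  # augmented matrix
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k + 1):
                A[r][c] -= f * A[col][c]
    beta = [0.0] * k
    for i in range(k - 1, -1, -1):  # back-substitution
        beta[i] = (A[i][k] - sum(A[i][j] * beta[j] for j in range(i + 1, k))) / A[i][i]
    return beta

# design matrix with an intercept column; Y = 1 + 2*x1 + 3*x2 exactly,
# so the fit should recover (1, 2, 3)
X = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 2, 1]]
Y = [1, 3, 4, 6, 8]
beta = normal_equations(X, Y)
```

Solving the normal equations is numerically less stable than QR-based methods (which lm uses internally), but it matches the formula on the slide term for term.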

23 Fitting a (multivariate) linear model
In R there is a function lm that uses a formula, leaving out the coefficients and with an implied intercept. So to fit

Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon

you simply have vectors Y, X1 and X2 and, leaving out the term for the intercept and the coefficients, just do:

> lm(Y ~ X1 + X2)

Call:
lm(formula = Y ~ X1 + X2)

Coefficients:
(Intercept)           X1           X2

24 Fitting a polynomial curve
The machinery for fitting multiple regression can be used to fit polynomials: all we have to do is consider X, X^2, X^3, ... to be separate variables. The least-squares equation is, for example:

Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 + \varepsilon

This can be done simply with the R linear model function lm this way:

> lm(Y ~ poly(X1, degree = 3, raw = TRUE))

Call:
lm(formula = Y ~ poly(X1, degree = 3, raw = TRUE))

Coefficients:
(Intercept)  poly(X1, degree = 3, raw = TRUE)1  poly(X1, degree = 3, raw = TRUE)2  poly(X1, degree = 3, raw = TRUE)3
