
1 Author(s): Kerby Shedden, Ph.D., 2010

License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License. We have reviewed this material in accordance with U.S. Copyright Law and have tried to maximize your ability to use, share, and adapt it. The citation key on the following slide provides information about how you may share and adapt this material. Copyright holders of content included in this material should contact with any questions, corrections, or clarification regarding the use of content. For more information about how to cite these materials visit

Any medical information in this material is intended to inform and educate and is not a tool for self-diagnosis or a replacement for medical evaluation, advice, diagnosis or treatment by a healthcare professional. Please speak to your physician if you have questions about your medical condition. Viewer discretion is advised: Some medical content is graphic and may not be suitable for all viewers.

2 Decomposing Variance

Kerby Shedden
Department of Statistics, University of Michigan
October 5, 2010

3 Law of total variation

For any regression model involving a response $Y$ and a covariate vector $X$, we have

$$\mathrm{var}(Y) = \mathrm{var}_X E(Y|X) + E_X \mathrm{var}(Y|X).$$

Note that this only makes sense if we treat $X$ as being random. We often wish to distinguish these two situations:

The population is homoscedastic: $\mathrm{var}(Y|X)$ does not depend on $X$, so we can simply write $\mathrm{var}(Y|X) = \sigma^2$, and we get $\mathrm{var}(Y) = \mathrm{var}_X E(Y|X) + \sigma^2$.

The population is heteroscedastic: $\mathrm{var}(Y|X)$ is a function $\sigma^2(X)$ with expected value $\sigma^2 = E_X \sigma^2(X)$, and again we get $\mathrm{var}(Y) = \mathrm{var}_X E(Y|X) + \sigma^2$.

If we write $Y = f(X) + \epsilon$ with $E(\epsilon|X) = 0$, then $E(Y|X) = f(X)$, and $\mathrm{var}_X E(Y|X)$ summarizes the variation of $f(X)$ over the marginal distribution of $X$.
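
As a concrete check, here is a minimal Monte Carlo sketch (the heteroscedastic model and all numbers are illustrative assumptions, not from the slides) comparing $\mathrm{var}(Y)$ with the sum of the two terms:

```python
import numpy as np

# Illustrative model: X ~ N(0, 1) and Y | X ~ N(2 + 3X, 1 + X^2), so
# var_X E(Y|X) = 9 and E_X var(Y|X) = E(1 + X^2) = 2, giving var(Y) = 11.
rng = np.random.default_rng(0)
n = 1_000_000
X = rng.normal(size=n)
Y = 2 + 3 * X + np.sqrt(1 + X**2) * rng.normal(size=n)

var_total = Y.var()                    # estimates var(Y)
var_of_mean = (2 + 3 * X).var()        # estimates var_X E(Y|X)
mean_of_var = (1 + X**2).mean()        # estimates E_X var(Y|X)
print(var_total, var_of_mean + mean_of_var)   # both approximately 11
```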

4 Law of total variation

[Figure: conditional and marginal distributions of $Y$ over a range of $X$, with the conditional mean curve $E(Y|X)$.]
Orange curves: conditional distributions of $Y$ given $X$.
Purple curve: marginal distribution of $Y$.
Black dots: conditional means of $Y$ given $X$.

5 Pearson correlation

The population Pearson correlation coefficient of two jointly distributed scalar-valued random variables $X$ and $Y$ is

$$\rho_{XY} \equiv \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}.$$

Given data $Y = (Y_1, \ldots, Y_n)'$ and $X = (X_1, \ldots, X_n)'$, the Pearson correlation coefficient is estimated by

$$\hat\rho_{XY} = \frac{\widehat{\mathrm{cov}}(X, Y)}{\hat\sigma_X \hat\sigma_Y} = \frac{\sum_i (X_i - \bar X)(Y_i - \bar Y)}{\sqrt{\sum_i (X_i - \bar X)^2}\sqrt{\sum_i (Y_i - \bar Y)^2}} = \frac{(X - \bar X)'(Y - \bar Y)}{\|X - \bar X\| \cdot \|Y - \bar Y\|}.$$

When we write $Y - \bar Y$ here, this means $Y - \bar Y \mathbf{1}$, where $\mathbf{1}$ is a vector of 1's, and $\bar Y$ is a scalar.
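
The vector form is easy to verify numerically; the following sketch (simulated data, not from the slides) compares it with numpy's built-in correlation:

```python
import numpy as np

# Compute rho_hat = (X - Xbar)'(Y - Ybar) / (||X - Xbar|| ||Y - Ybar||)
# and check it against np.corrcoef on simulated data.
rng = np.random.default_rng(1)
X = rng.normal(size=100)
Y = 0.5 * X + rng.normal(size=100)

Xc, Yc = X - X.mean(), Y - Y.mean()
rho_hat = (Xc @ Yc) / (np.linalg.norm(Xc) * np.linalg.norm(Yc))
print(rho_hat, np.corrcoef(X, Y)[0, 1])   # identical up to rounding
```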

6 Pearson correlation

By the Cauchy-Schwarz inequality,

$$-1 \le \rho_{XY} \le 1, \qquad -1 \le \hat\rho_{XY} \le 1.$$

The sample correlation coefficient is slightly biased, but the bias is so small that it is usually ignored.

7 Pearson correlation and simple linear regression slopes

For the simple linear regression model $Y = \alpha + \beta X + \epsilon$, if we view $X$ as a random variable that is uncorrelated with $\epsilon$, then

$$\mathrm{cov}(X, Y) = \beta \sigma_X^2,$$

and the correlation is

$$\rho_{XY} \equiv \mathrm{cor}(X, Y) = \frac{\beta}{\sqrt{\beta^2 + \sigma^2/\sigma_X^2}}.$$

The sample correlation coefficient is related to the least squares slope estimate:

$$\hat\beta = \frac{\widehat{\mathrm{cov}}(X, Y)}{\hat\sigma_X^2} = \hat\rho_{XY}\frac{\hat\sigma_Y}{\hat\sigma_X}.$$
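
A quick numerical illustration of the slope/correlation identity (the data generating values are arbitrary assumptions for the sketch):

```python
import numpy as np

# Check beta_hat = rho_hat * sigma_hat_Y / sigma_hat_X on simulated data.
rng = np.random.default_rng(2)
X = rng.normal(size=500)
Y = 1.0 + 2.0 * X + rng.normal(size=500)

beta_hat = np.cov(X, Y)[0, 1] / X.var(ddof=1)   # cov-hat(X, Y) / var-hat(X)
rho_hat = np.corrcoef(X, Y)[0, 1]
print(beta_hat, rho_hat * Y.std(ddof=1) / X.std(ddof=1))   # agree exactly
```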

8 Orthogonality between fitted values and residuals

Recall that the fitted values are

$$\hat Y = X\hat\beta = PY,$$

and the residuals are

$$R = Y - \hat Y = (I - P)Y.$$

Since $P(I - P) = 0$, it follows that $\hat Y'R = 0$. Since $\bar R = 0$ (assuming the model contains an intercept), it is equivalent to state that the sample correlation between $R$ and $\hat Y$ is zero, i.e. $\widehat{\mathrm{cor}}(R, \hat Y) = 0$.

9 Coefficient of determination

A descriptive summary of the explanatory power of $X$ for $Y$ is given by the coefficient of determination, also known as the proportion of explained variance, or multiple $R^2$. This is the quantity

$$R^2 \equiv 1 - \frac{\|Y - \hat Y\|^2}{\|Y - \bar Y\|^2} = \frac{\|\hat Y - \bar Y\|^2}{\|Y - \bar Y\|^2} = \frac{\widehat{\mathrm{var}}(\hat Y)}{\widehat{\mathrm{var}}(Y)}.$$

The equivalence between the two expressions follows from the identity

$$\|Y - \bar Y\|^2 = \|Y - \hat Y + \hat Y - \bar Y\|^2 = \|Y - \hat Y\|^2 + \|\hat Y - \bar Y\|^2 + 2(Y - \hat Y)'(\hat Y - \bar Y) = \|Y - \hat Y\|^2 + \|\hat Y - \bar Y\|^2,$$

since the residuals $Y - \hat Y$ are orthogonal to both $\hat Y$ and $\bar Y \mathbf{1}$. It should be clear that $R^2 = 0$ iff $\hat Y = \bar Y$ and $R^2 = 1$ iff $\hat Y = Y$.
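
The following sketch (simulated design and coefficients are illustrative) fits least squares with numpy, confirms the orthogonality used in the identity, and computes $R^2$ from both expressions:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
Y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
Yhat = X @ beta_hat
R = Y - Yhat
print(Yhat @ R)   # ~0: residuals are orthogonal to the fitted values

ss_tot = np.sum((Y - Y.mean()) ** 2)
r2_from_resid = 1 - np.sum(R ** 2) / ss_tot
r2_from_fit = np.sum((Yhat - Y.mean()) ** 2) / ss_tot
print(r2_from_resid, r2_from_fit)   # equal, by the decomposition above
```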

10 Coefficient of determination

The coefficient of determination is equal to $\widehat{\mathrm{cor}}(\hat Y, Y)^2$. To see this, note that

$$\widehat{\mathrm{cor}}(\hat Y, Y) = \frac{(\hat Y - \bar Y)'(Y - \bar Y)}{\|\hat Y - \bar Y\| \cdot \|Y - \bar Y\|} = \frac{(\hat Y - \bar Y)'(Y - \hat Y + \hat Y - \bar Y)}{\|\hat Y - \bar Y\| \cdot \|Y - \bar Y\|} = \frac{(\hat Y - \bar Y)'(Y - \hat Y) + (\hat Y - \bar Y)'(\hat Y - \bar Y)}{\|\hat Y - \bar Y\| \cdot \|Y - \bar Y\|} = \frac{\|\hat Y - \bar Y\|}{\|Y - \bar Y\|},$$

where the first term in the numerator vanishes because the residuals are orthogonal to the fitted values and sum to zero. Squaring both sides gives $\widehat{\mathrm{cor}}(\hat Y, Y)^2 = \|\hat Y - \bar Y\|^2/\|Y - \bar Y\|^2 = R^2$.

11 Coefficient of determination in simple linear regression

In general,

$$R^2 = \widehat{\mathrm{cor}}(Y, \hat Y)^2 = \frac{\widehat{\mathrm{cov}}(Y, \hat Y)^2}{\widehat{\mathrm{var}}(Y)\,\widehat{\mathrm{var}}(\hat Y)}.$$

In the case of simple linear regression,

$$\widehat{\mathrm{cov}}(Y, \hat Y) = \widehat{\mathrm{cov}}(Y, \hat\alpha + \hat\beta X) = \hat\beta\,\widehat{\mathrm{cov}}(Y, X),$$

and

$$\widehat{\mathrm{var}}(\hat Y) = \widehat{\mathrm{var}}(\hat\alpha + \hat\beta X) = \hat\beta^2\,\widehat{\mathrm{var}}(X).$$

Thus for simple linear regression, $R^2 = \widehat{\mathrm{cor}}(Y, X)^2 = \widehat{\mathrm{cor}}(Y, \hat Y)^2$.

12 Relationship to the F statistic

The F-statistic for the null hypothesis $\beta_1 = \cdots = \beta_p = 0$ is

$$\frac{\|\hat Y - \bar Y\|^2/p}{\|Y - \hat Y\|^2/(n - p - 1)} = \frac{R^2}{1 - R^2} \cdot \frac{n - p - 1}{p},$$

which is an increasing function of $R^2$.
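
A short sketch checking the two forms of the F statistic against each other (design and coefficients are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
Y = X @ np.array([1.0, 0.5, 0.0, -0.5]) + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
Yhat = X @ beta_hat

# F from sums of squares:
f_ss = (np.sum((Yhat - Y.mean()) ** 2) / p) / \
       (np.sum((Y - Yhat) ** 2) / (n - p - 1))
# F from R^2:
r2 = 1 - np.sum((Y - Yhat) ** 2) / np.sum((Y - Y.mean()) ** 2)
f_r2 = (r2 / (1 - r2)) * (n - p - 1) / p
print(f_ss, f_r2)   # same value
```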

13 Adjusted R^2

The sample $R^2$ is an estimate of the population $R^2$:

$$1 - \frac{\mathrm{var}(Y|X)}{\mathrm{var}(Y)}.$$

Since it is a ratio, the plug-in estimate $R^2$ is biased, although the bias is not large unless the sample size is small or the number of covariates is large. The adjusted $R^2$ is an approximately unbiased estimate of the population $R^2$:

$$1 - (1 - R^2)\frac{n - 1}{n - p - 1}.$$

The adjusted $R^2$ is always less than the unadjusted $R^2$. The adjusted $R^2$ is always less than or equal to one, but can be negative.
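
The adjustment is a one-line formula; a direct transcription (with made-up numbers for illustration):

```python
def adjusted_r2(r2, n, p):
    """Approximately unbiased estimate of the population R^2."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# A modest R^2 with many covariates is adjusted down substantially:
print(adjusted_r2(0.30, n=50, p=10))   # about 0.12, well below the raw 0.30
```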

14 The unique variation in one covariate

How much information about $Y$ is present in a covariate $X_k$? This question is not straightforward when the covariates are non-orthogonal, since several covariates may contain overlapping information about $Y$.

Let $X_k^\perp$ be the residual of $X_k$ after regressing it against all other covariates (including the intercept). If $P_{-k}$ is the projection onto $\mathrm{span}(\{X_j, j \ne k\})$, then

$$X_k^\perp = (I - P_{-k})X_k.$$

We could use $\widehat{\mathrm{var}}(X_k^\perp)/\widehat{\mathrm{var}}(X_k)$ to assess how much of the variation in $X_k$ is unique, in that it is not also captured by other predictors. But this measure doesn't involve $Y$, so it can't tell us whether the unique variation in $X_k$ is useful in the regression analysis.

15 The unique regression information in one covariate

To learn how $X_k$ contributes uniquely to the regression, we can consider how introducing $X_k$ to a working regression model affects the $R^2$.

Let $\hat Y_{-k} = P_{-k}Y$ be the fitted values in the model omitting covariate $k$. Let $R^2$ denote the multiple $R^2$ for the full model, and let $R_{-k}^2$ be the multiple $R^2$ for the regression omitting covariate $X_k$.

The value of $R^2 - R_{-k}^2$ is a way to quantify how much unique information about $Y$ in $X_k$ is not captured by the other covariates. This is called the semi-partial $R^2$.

16 Identity involving norms of fitted values and residuals

Before we continue, we will need a simple identity that is often useful. In general, if $A$ and $B$ are orthogonal, then $\|A + B\|^2 = \|A\|^2 + \|B\|^2$. If $A$ and $B - A$ are orthogonal, then

$$\|B\|^2 = \|B - A + A\|^2 = \|B - A\|^2 + \|A\|^2.$$

Thus we have $\|B\|^2 - \|A\|^2 = \|B - A\|^2$.

Applying this fact to regression, we know that the fitted values and residuals are orthogonal. Thus for the regression omitting variable $k$, $\hat Y_{-k}$ and $Y - \hat Y_{-k}$ are orthogonal, so

$$\|Y - \hat Y_{-k}\|^2 = \|Y\|^2 - \|\hat Y_{-k}\|^2.$$

By the same argument, $\|Y - \hat Y\|^2 = \|Y\|^2 - \|\hat Y\|^2$.

17 Improvement in R^2 due to one covariate

Now we can obtain a simple, direct expression for the semi-partial $R^2$. Since $X_k^\perp$ is orthogonal to the other covariates,

$$\hat Y = \hat Y_{-k} + \frac{\langle Y, X_k^\perp \rangle}{\langle X_k^\perp, X_k^\perp \rangle} X_k^\perp,$$

so

$$\|\hat Y\|^2 = \|\hat Y_{-k}\|^2 + \langle Y, X_k^\perp \rangle^2/\|X_k^\perp\|^2.$$

18 Improvement in R^2 due to one covariate

Thus we have

$$R^2 = 1 - \frac{\|Y - \hat Y\|^2}{\|Y - \bar Y\|^2} = 1 - \frac{\|Y\|^2 - \|\hat Y\|^2}{\|Y - \bar Y\|^2} = 1 - \frac{\|Y\|^2 - \|\hat Y_{-k}\|^2 - \langle Y, X_k^\perp \rangle^2/\|X_k^\perp\|^2}{\|Y - \bar Y\|^2} = 1 - \frac{\|Y - \hat Y_{-k}\|^2}{\|Y - \bar Y\|^2} + \frac{\langle Y, X_k^\perp \rangle^2/\|X_k^\perp\|^2}{\|Y - \bar Y\|^2} = R_{-k}^2 + \frac{\langle Y, X_k^\perp \rangle^2/\|X_k^\perp\|^2}{\|Y - \bar Y\|^2}.$$

19 Semi-partial R^2

Thus the semi-partial $R^2$ is

$$R^2 - R_{-k}^2 = \frac{\langle Y, X_k^\perp \rangle^2/\|X_k^\perp\|^2}{\|Y - \bar Y\|^2}.$$

Since $X_k^\perp/\|X_k^\perp\|$ is centered and has length 1, it follows that

$$R^2 - R_{-k}^2 = \widehat{\mathrm{cor}}(Y, X_k^\perp)^2 = \widehat{\mathrm{cor}}(Y, \hat Y_k^\perp)^2,$$

where $\hat Y_k^\perp$ is the fitted value for regressing $Y$ on $X_k^\perp$.

Thus the semi-partial $R^2$ for covariate $k$ has two equivalent interpretations:

It is the improvement in $R^2$ resulting from including covariate $k$ in a working regression model that already contains the other covariates.

It is the $R^2$ for a simple linear regression of $Y$ on $X_k^\perp = (I - P_{-k})X_k$.

20 Partial R^2

The partial $R^2$ is

$$\frac{R^2 - R_{-k}^2}{1 - R_{-k}^2} = \frac{\langle Y, X_k^\perp \rangle^2/\|X_k^\perp\|^2}{\|Y - \hat Y_{-k}\|^2}.$$

The partial $R^2$ for covariate $k$ is the fraction of the maximum possible improvement in $R^2$ that is contributed by covariate $k$.

Let $\hat Y_{-k}$ be the fitted values for regressing $Y$ on all covariates except $X_k$. Since $\hat Y_{-k}'X_k^\perp = 0$,

$$\frac{\langle Y, X_k^\perp \rangle^2}{\|Y - \hat Y_{-k}\|^2 \|X_k^\perp\|^2} = \frac{\langle Y - \hat Y_{-k}, X_k^\perp \rangle^2}{\|Y - \hat Y_{-k}\|^2 \|X_k^\perp\|^2}.$$

The expression on the left is the usual $R^2$ that would be obtained when regressing $Y - \hat Y_{-k}$ on $X_k^\perp$. Thus the partial $R^2$ is the same as the usual $R^2$ for $(I - P_{-k})Y$ regressed on $(I - P_{-k})X_k$.
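
The sketch below (simulated correlated covariates; all names are illustrative) computes the semi-partial and partial $R^2$ for one covariate both from differences of $R^2$'s and from the residualized regressions, confirming the equivalences on the last two slides:

```python
import numpy as np

def r2(X, Y):
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ b
    return 1 - resid @ resid / np.sum((Y - Y.mean()) ** 2)

rng = np.random.default_rng(5)
n = 500
Z = rng.normal(size=(n, 3))
Z[:, 2] += 0.7 * Z[:, 1]                   # make the covariates correlated
X = np.column_stack([np.ones(n), Z])
Y = X @ np.array([1.0, 1.0, 1.0, 1.0]) + rng.normal(size=n)

k = 3                                      # covariate of interest
X_mk = np.delete(X, k, axis=1)             # intercept and other covariates
r2_full, r2_mk = r2(X, Y), r2(X_mk, Y)

# X_k_perp = (I - P_{-k}) X_k: residual of X_k on the other covariates.
g, *_ = np.linalg.lstsq(X_mk, X[:, k], rcond=None)
Xk_perp = X[:, k] - X_mk @ g

semi_partial = r2_full - r2_mk
print(semi_partial, np.corrcoef(Y, Xk_perp)[0, 1] ** 2)        # equal

h, *_ = np.linalg.lstsq(X_mk, Y, rcond=None)
Y_perp = Y - X_mk @ h                      # (I - P_{-k}) Y
partial = (r2_full - r2_mk) / (1 - r2_mk)
print(partial, np.corrcoef(Y_perp, Xk_perp)[0, 1] ** 2)        # equal
```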

21 Decomposition of projection matrices

Suppose $P \in \mathbb{R}^{n \times n}$ is a rank-$d$ projection matrix, and $U$ is an $n \times d$ orthogonal matrix whose columns span $\mathrm{col}(P)$. If we partition $U$ by columns as $U = (U_1 | U_2 | \cdots | U_d)$, then $P = UU'$, so we can write

$$P = \sum_{j=1}^d U_j U_j'.$$

Note that this representation is not unique, since there are different orthogonal bases for $\mathrm{col}(P)$. Each summand $U_j U_j' \in \mathbb{R}^{n \times n}$ is a rank-1 projection matrix onto $U_j$.

22 Decomposition of R^2

Question: In a multiple regression model, how much of the variance in $Y$ is explained by a particular covariate?

Orthogonal case: If the design matrix $X$ is orthogonal ($X'X = I$), the projection $P$ onto $\mathrm{col}(X)$ can be decomposed as

$$P = \sum_{j=0}^p P_j = \frac{\mathbf{1}\mathbf{1}'}{n} + \sum_{j=1}^p X_j X_j',$$

where $X_j$ is the $j^{th}$ column of the design matrix (assuming here that the first column of $X$ is an intercept).

23 Decomposition of R^2 (orthogonal case)

The $n \times n$ rank-1 matrix $P_j = X_j X_j'$ is the projection onto $\mathrm{span}(X_j)$ (and $P_0$ is the projection onto the span of the vector of 1's). Furthermore, by orthogonality, $P_j P_k = 0$ unless $j = k$.

Since by orthogonality

$$\hat Y - \bar Y = \sum_{j=1}^p P_j Y,$$

it follows that

$$\|\hat Y - \bar Y\|^2 = \sum_{j=1}^p \|P_j Y\|^2.$$

Here we are using the fact that if $U_1, \ldots, U_m$ are orthogonal, then $\|U_1 + \cdots + U_m\|^2 = \|U_1\|^2 + \cdots + \|U_m\|^2$.

24 Decomposition of R^2 (orthogonal case)

The $R^2$ for simple linear regression of $Y$ on $X_j$ is

$$R_j^2 \equiv \|\hat Y_j - \bar Y\|^2/\|Y - \bar Y\|^2 = \|P_j Y\|^2/\|Y - \bar Y\|^2,$$

so we see that for orthogonal design matrices,

$$R^2 = \sum_{j=1}^p R_j^2.$$

That is, the overall coefficient of determination is the sum of univariate coefficients of determination for all the explanatory variables.
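
This additivity is easy to demonstrate numerically; the sketch below builds an orthogonal design by centering and orthonormalizing random columns (an illustrative construction, not from the slides):

```python
import numpy as np

def r2(X, Y):
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ b
    return 1 - resid @ resid / np.sum((Y - Y.mean()) ** 2)

rng = np.random.default_rng(6)
n, p = 400, 4
Z = rng.normal(size=(n, p))
Z -= Z.mean(axis=0)                  # make columns orthogonal to the intercept
Q, _ = np.linalg.qr(Z)               # orthonormal (and still centered) columns
X = np.column_stack([np.ones(n), Q])
Y = X @ rng.normal(size=p + 1) + rng.normal(size=n)

r2_full = r2(X, Y)
r2_each = [r2(np.column_stack([np.ones(n), Q[:, j]]), Y) for j in range(p)]
print(r2_full, sum(r2_each))         # equal for an orthogonal design
```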

25 Decomposition of R^2

Non-orthogonal case: If $X$ is not orthogonal, the overall $R^2$ will not be the sum of single covariate $R^2$'s. If we let $R_j^2$ be as above (the $R^2$ values for regressing $Y$ on each $X_j$), then there are two different situations: $\sum_j R_j^2 > R^2$, and $\sum_j R_j^2 < R^2$.

26 Decomposition of R^2

Case 1: $\sum_j R_j^2 > R^2$

It's not surprising that $\sum_j R_j^2$ can be bigger than $R^2$. For example, suppose that

$$Y = X_1 + \epsilon$$

is the data generating model, and $X_2$ is highly correlated with $X_1$ (but is not part of the data generating model).

For the regression of $Y$ on both $X_1$ and $X_2$, the multiple $R^2$ will be $1 - \sigma^2/\mathrm{var}(Y)$ (since $E(Y|X_1, X_2) = E(Y|X_1) = X_1$). The $R^2$ values for $Y$ regressed on either $X_1$ or $X_2$ separately will also be approximately $1 - \sigma^2/\mathrm{var}(Y)$. Thus $R_1^2 + R_2^2 \approx 2R^2$.

27 Decomposition of R^2

Case 2: $\sum_j R_j^2 < R^2$

This is more surprising, and is sometimes called enhancement. As an example, suppose the data generating model is $Y = Z + \epsilon$, but we don't observe $Z$ (for simplicity assume $EZ = 0$). Instead, we observe a value $X_1$ that satisfies $X_1 = Z + X_2$, where $X_2$ has mean 0 and is independent of $Z$ and $\epsilon$.

Since $X_2$ is independent of $Z$ and $\epsilon$, it is also independent of $Y$, thus $R_2^2 \approx 0$ for large $n$.

28 Decomposition of R^2 (enhancement example)

The multiple $R^2$ of $Y$ on $X_1$ and $X_2$ is approximately $\sigma_Z^2/(\sigma_Z^2 + \sigma^2)$ for large $n$, since the fitted values will converge to $\hat Y = X_1 - X_2 = Z$.

To calculate $R_1^2$, first note that for the regression of $Y$ on $X_1$,

$$\hat\beta = \frac{\widehat{\mathrm{cov}}(Y, X_1)}{\widehat{\mathrm{var}}(X_1)} \to \frac{\sigma_Z^2}{\sigma_Z^2 + \sigma_{X_2}^2},$$

and $\hat\alpha \to 0$.

29 Decomposition of R^2 (enhancement example)

Therefore for large $n$,

$$n^{-1}\|Y - \hat Y\|^2 \approx n^{-1}\|Z + \epsilon - \sigma_Z^2 X_1/(\sigma_Z^2 + \sigma_{X_2}^2)\|^2 = n^{-1}\|\sigma_{X_2}^2 Z/(\sigma_Z^2 + \sigma_{X_2}^2) + \epsilon - \sigma_Z^2 X_2/(\sigma_Z^2 + \sigma_{X_2}^2)\|^2 \approx \sigma_{X_2}^4\sigma_Z^2/(\sigma_Z^2 + \sigma_{X_2}^2)^2 + \sigma^2 + \sigma_Z^4\sigma_{X_2}^2/(\sigma_Z^2 + \sigma_{X_2}^2)^2 = \sigma_{X_2}^2\sigma_Z^2/(\sigma_Z^2 + \sigma_{X_2}^2) + \sigma^2.$$

Therefore

$$R_1^2 = 1 - \frac{n^{-1}\|Y - \hat Y\|^2}{n^{-1}\|Y - \bar Y\|^2} \approx 1 - \frac{\sigma_{X_2}^2\sigma_Z^2/(\sigma_Z^2 + \sigma_{X_2}^2) + \sigma^2}{\sigma_Z^2 + \sigma^2} = \frac{\sigma_Z^2}{(\sigma_Z^2 + \sigma^2)(1 + \sigma_{X_2}^2/\sigma_Z^2)}.$$

30 Decomposition of R^2 (enhancement example)

Thus $R_1^2/R^2 \approx 1/(1 + \sigma_{X_2}^2/\sigma_Z^2)$, which is strictly less than one if $\sigma_{X_2}^2 > 0$. Since $R_2^2 \approx 0$, it follows that $R^2 > R_1^2 + R_2^2$.

The reason for this is that while $X_2$ contains no directly useful information about $Y$ (hence $R_2^2 \approx 0$), it can remove the measurement error in $X_1$, making $X_1$ a better predictor of $Z$.
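
A simulation of the enhancement example (the variance values are arbitrary choices for illustration) shows the effect directly, and matches the limits derived above:

```python
import numpy as np

def r2(design, Y):
    b, *_ = np.linalg.lstsq(design, Y, rcond=None)
    resid = Y - design @ b
    return 1 - resid @ resid / np.sum((Y - Y.mean()) ** 2)

rng = np.random.default_rng(7)
n = 200_000
sigma_Z, sigma_X2, sigma = 1.0, 1.0, 0.5
Z = sigma_Z * rng.normal(size=n)
X2 = sigma_X2 * rng.normal(size=n)
X1 = Z + X2                          # observed proxy for Z
Y = Z + sigma * rng.normal(size=n)

one = np.ones(n)
r2_both = r2(np.column_stack([one, X1, X2]), Y)
r2_1 = r2(np.column_stack([one, X1]), Y)
r2_2 = r2(np.column_stack([one, X2]), Y)
print(r2_both, r2_1 + r2_2)          # R^2 clearly exceeds R_1^2 + R_2^2
# With these variances the limits are R^2 -> 0.8, R_1^2 -> 0.4, R_2^2 -> 0.
```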

31 Decomposition of R^2 (enhancement example)

We can also calculate the limiting partial $R^2$ for adding $X_2$ to a model that already contains $X_1$:

$$\frac{\sigma_{X_2}^2}{\sigma_{X_2}^2 + \sigma^2(1 + \sigma_{X_2}^2/\sigma_Z^2)}.$$

32 Partial R^2 example 2

Suppose the design matrix satisfies

$$X'X/n = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & r \\ 0 & r & 1 \end{pmatrix},$$

and the data generating model is

$$Y = X_1 + X_2 + \epsilon$$

with $\mathrm{var}\,\epsilon = \sigma^2$.

33 Partial R^2 example 2

We will calculate the partial $R^2$ for $X_1$, using the fact that the partial $R^2$ is the regular $R^2$ for regressing $(I - P_1)Y$ on $(I - P_1)X_1$, where $P_1$ is the projection onto $\mathrm{span}(\{\mathbf{1}, X_2\})$.

Since this is a simple linear regression, the partial $R^2$ can be expressed

$$\widehat{\mathrm{cor}}((I - P_1)Y, (I - P_1)X_1)^2.$$

34 Partial R^2 example 2

The numerator of the partial $R^2$ is the square of

$$\widehat{\mathrm{cov}}((I - P_1)Y, (I - P_1)X_1) = Y'(I - P_1)X_1/n = (X_1 + X_2 + \epsilon)'(X_1 - rX_2)/n \approx 1 - r^2.$$

The denominator contains two factors. The first is

$$\|(I - P_1)X_1\|^2/n = X_1'(I - P_1)X_1/n = X_1'(X_1 - rX_2)/n \approx 1 - r^2.$$

35 Partial R^2 example 2

The other factor in the denominator is $Y'(I - P_1)Y/n$:

$$Y'(I - P_1)Y/n = (X_1 + X_2)'(I - P_1)(X_1 + X_2)/n + \epsilon'(I - P_1)\epsilon/n + 2\epsilon'(I - P_1)(X_1 + X_2)/n \approx (X_1 + X_2)'(X_1 - rX_2)/n + \sigma^2 \approx 1 - r^2 + \sigma^2.$$

Thus we get that the partial $R^2$ is approximately equal to

$$\frac{1 - r^2}{1 - r^2 + \sigma^2}.$$

If $r = 1$ then the result is zero ($X_1$ has no unique explanatory power), and if $r = 0$, the result is $1/(1 + \sigma^2)$, indicating that after controlling for $X_2$, around a $1/(1 + \sigma^2)$ fraction of the remaining variance is explained by $X_1$ (the rest is due to $\epsilon$).
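
A simulation check of this limit (the values of $r$ and $\sigma$ are arbitrary choices for the sketch):

```python
import numpy as np

rng = np.random.default_rng(8)
n, r, sigma = 200_000, 0.6, 1.0
X2 = rng.normal(size=n)
X1 = r * X2 + np.sqrt(1 - r**2) * rng.normal(size=n)   # cor(X1, X2) = r
Y = X1 + X2 + sigma * rng.normal(size=n)

def residualize(v, X2):
    """Residual of v after regressing on {1, X2}, i.e. (I - P_1) v."""
    A = np.column_stack([np.ones(len(v)), X2])
    b, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ b

partial_r2 = np.corrcoef(residualize(Y, X2), residualize(X1, X2))[0, 1] ** 2
print(partial_r2, (1 - r**2) / (1 - r**2 + sigma**2))   # close agreement
```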

36 Summary

Each of the three $R^2$ values can be expressed either in terms of variance ratios (VR), or as a squared correlation coefficient:

Multiple $R^2$: VR form $\|\hat Y - \bar Y\|^2/\|Y - \bar Y\|^2$; correlation form $\widehat{\mathrm{cor}}(\hat Y, Y)^2$.
Semi-partial $R^2$: VR form $R^2 - R_{-k}^2$; correlation form $\widehat{\mathrm{cor}}(Y, X_k^\perp)^2$.
Partial $R^2$: VR form $(R^2 - R_{-k}^2)/(1 - R_{-k}^2)$; correlation form $\widehat{\mathrm{cor}}((I - P_{-k})Y, X_k^\perp)^2$.
