# EC327: Advanced Econometrics, Spring 2007


## Appendix D: Summary of matrix algebra

Wooldridge, *Introductory Econometrics* (3rd ed., 2006)

### Basic definitions

A matrix is a rectangular array of numbers, with $m$ rows and $n$ columns, which are the row dimension and column dimension, respectively. The matrix $A$ has typical element $a_{ij}$. A vector is a matrix with a single row or a single column; thus a scalar can be considered a $1 \times 1$ matrix. An $m$-element row vector has one row and $m$ columns. An $n$-element column vector has $n$ rows and one column.

A square matrix has $m = n$. A diagonal matrix is a square matrix whose off-diagonal elements all equal 0; it may have $m$ distinct diagonal elements. If those $m$ elements are equal, it is a scalar matrix. If they all equal 1, it is an identity matrix of order $m$, customarily written as $I$ or $I_m$. A symmetric matrix is a square matrix for which $a_{ij} = a_{ji} \; \forall i, j$: the elements above and below its main diagonal are equal. It is often stored in upper triangular or lower triangular form, since there is no need to report both the elements above and the elements below the main diagonal. A symmetric matrix we often compute in econometrics is the correlation matrix of a set of variables. The correlation matrix has 1s on its main diagonal (since every variable is perfectly correlated with itself) and off-diagonal values in $(-1, +1)$.

Another very important matrix is what Stata calls the VCE, or estimated variance-covariance matrix of the estimated parameters of a regression equation. It is by construction a symmetric matrix with positive elements on the main diagonal (the estimated variances of the parameters $b$, whose positive square roots are the reported standard errors of those parameters). The off-diagonal elements are the estimated covariances of the estimated parameters, used in computing hypothesis tests or confidence intervals involving more than one parameter. This matrix may be examined after a `regress` command in Stata with the command `estat vce`.

The transpose of a matrix reflects the matrix around its main diagonal. We write the transpose of $A$ as $A'$ (or, less commonly, $A^T$). If $B = A'$, then $b_{ij} = a_{ji} \; \forall i, j$. If $A$ is $m \times n$, $B$ is of order $n \times m$. The rows of $A$ become the columns of $A'$, and vice versa.

Several relations involving the transpose of a matrix: (1) the transpose of a row vector is a column vector; (2) the transpose of a transpose is the matrix itself: $(A')' = A$; (3) the transpose of a symmetric matrix is the matrix itself; (4) the transpose of the sum (difference) is the sum (difference) of the transposes: $(A \pm B)' = A' \pm B'$.

A matrix of any row and column dimension with all elements equal to zero is a zero matrix or null matrix. It plays the role in matrix algebra (or linear algebra) that the scalar 0 plays in ordinary algebra, in the sense that adding or subtracting a null matrix has no effect, and multiplying by a null matrix returns a null matrix. The identity matrix $I$ plays the role of the number 1 in ordinary algebra: multiplying by $I$ has no effect, and we can always insert an $I$ of appropriate order in a matrix product without changing the product.

### Matrix operations

In matrix algebra, algebraic operations are only defined for matrices or vectors of appropriate order. Since vectors are special cases of matrices, we speak only of matrices. Two matrices $A$ and $B$ are equal if they have the same order (the same number of rows and columns) and if $a_{ij} = b_{ij} \; \forall i, j$. Addition or subtraction can be performed if and only if the matrices are of the same order. The sum (difference) $C = A \pm B$ is defined as $c_{ij} = a_{ij} \pm b_{ij} \; \forall i, j$. That is, just as we compare matrices for equality element-by-element, we add (subtract) matrices element-by-element. Scalar multiplication is defined for any matrix as $C = kA$, and involves multiplying each element by that scalar: $c_{ij} = k\,a_{ij} \; \forall i, j$.
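These element-by-element rules translate directly into array operations. A minimal sketch in Python with numpy (the text itself works in Stata; numpy here is just an illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])

C = A + B      # element-by-element sum; requires the same order (shape)
D = 3 * A      # scalar multiplication: every element times 3
equal = np.array_equal(A, A.copy())   # element-by-element equality test

print(C)   # [[ 6.  8.] [10. 12.]]
print(D)   # [[ 3.  6.] [ 9. 12.]]
```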

Matrix multiplication is defined for matrices $A_{m \times n}$ and $B_{n \times q}$ as $C_{m \times q} = AB$. This product is defined since the number of columns in the first matrix equals the number of rows in the second matrix. To define matrix multiplication, we must first define the dot product of vectors $u_{n \times 1}$ and $v_{n \times 1}$. The product $d = u'v$ is a scalar, defined as $d = \sum_{i=1}^{n} u_i v_i$. This implies that if $u = v$, the scalar $d$ is the sum of squares of the elements of $u$. We may also compute the outer product of these vectors, $uv'$, which for vectors of the same length creates an $n \times n$ matrix. In matrix multiplication, each element of the result matrix is a dot product. If $C = AB$, then $c_{ij}$ is the dot product of the $i$th row of $A$ and the $j$th column of $B$: $c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$. These dot products are only defined for vectors of the same length, which gives rise to the constraint relating the number of columns of $A$ and the number of rows of $B$. When this constraint is satisfied, the matrices are conformable for multiplication.

In ordinary algebra, multiplication is commutative: we can write $xy$ or $yx$ and receive the same result. In matrix algebra, the product of arbitrary matrices will not exist unless they are conformable, and if one of the products $AB$, $BA$ exists, the other will not, except in special cases. If $A$ and $B$ are square matrices of the same order, then both $AB$ and $BA$ always exist, but they do not yield the same result matrix except under special circumstances. (A natural exception: we can always write $IA$ and $AI$, both of which equal $A$. Likewise for the null matrix.)

We can also multiply a matrix by a vector. If we write $c' = u'A$, we are multiplying the $m$-element row vector $u'$ by matrix $A$, which must have $m$ rows; it may have any number of columns. We are premultiplying $A$ by $u'$. We may also postmultiply $A$ by a vector $v$: if $A$ is $m \times n$, $v$ must be an $n$-element column vector.

Some properties of matrix multiplication: (1) multiplication is distributive with respect to scalars and matrices, as long as they are conformable: $k(A + B) = kA + kB$, $C(A + B) = CA + CB$, $(A + B)C = AC + BC$. (2) The transpose of a product is the product of the transposes, in reverse order: $(AB)' = B'A'$. (3) For any matrix $X$, the products $X'X$ and $XX'$ exist and are symmetric.

We cannot speak of dividing one matrix by another (unless both are scalars, in which case the rules of ordinary algebra apply). Instead, we define the concept of the inverse matrix. A square matrix $A_{m \times m}$ may possess a unique inverse, written $A^{-1}$, under certain conditions. If it exists, the inverse is defined as the matrix which satisfies $AA^{-1} = A^{-1}A = I_m$, and in that sense the inverse operates like the division operator in ordinary algebra, where $x^{-1}x = 1 \; \forall x \neq 0$. A matrix which possesses an inverse is said to be nonsingular, or invertible, and it has a nonzero determinant. A singular matrix has a zero determinant. We will not go into the computation of inverse matrices or determinants, which is better left to a computer, but we must understand their importance to econometrics and the linear regression model in particular.

Properties of inverses: (1) the inverse of a product is the product of the inverses, in reverse order: $(AB)^{-1} = B^{-1}A^{-1}$; (2) the transpose of the inverse is the inverse of the transpose: $(A')^{-1} = (A^{-1})'$.
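The dot product, the outer product, the non-commutativity of the matrix product, and the defining property of the inverse can all be checked numerically. A small numpy illustration (not from the text; the matrices are made up):

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

d = u @ v                  # dot product: sum_i u_i v_i
outer = np.outer(u, v)     # outer product: a 3 x 3 matrix
assert d == np.sum(u * v)
assert u @ u == np.sum(u ** 2)   # u'u is the sum of squares of u

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])
# Square matrices of the same order: both AB and BA exist,
# but they are generally not equal.
print(A @ B)
print(B @ A)

Ainv = np.linalg.inv(A)    # exists because det(A) = 5, nonzero
assert np.allclose(A @ Ainv, np.eye(2))
assert np.allclose(Ainv @ A, np.eye(2))
```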

The trace of a square matrix, denoted $\mathrm{tr}(A)$, is the scalar sum of its diagonal elements. So, for instance, the trace of the identity matrix $I_n$ is $n$. The trace of the sum (difference) of matrices is the sum (difference) of the traces, and the trace of a product yielding a square matrix does not depend on the order of multiplication: if $A$ is $m \times n$ and $B$ is $n \times m$, both products $AB$ and $BA$ exist and have the same trace.

### Linear independence and rank

The notion of a matrix possessing an inverse is related to the concept of linear independence. A set of $n$-vectors $\{x_1, x_2, \ldots, x_r\}$ is said to be linearly independent if and only if $\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_r x_r = 0$ implies $\alpha_1 = \alpha_2 = \cdots = \alpha_r = 0$. If this equation holds for a set of scalars $\alpha_1, \alpha_2, \ldots, \alpha_r$ that are not all zero, the set of $x$-vectors is linearly dependent. This implies that at least one vector in the set can be written as a linear combination of the others. In econometric terms, this is the problem of perfect collinearity: one or more of the regressors can be expressed as an exact linear combination of the others.

If $A$ is an $n \times k$ matrix, the column rank of $A$ is the maximum number of linearly independent columns of $A$. If $\mathrm{rank}(A) = k$, $A$ has full column rank. Row rank is defined similarly. The rank of a matrix is the minimum of its row and column ranks. When we consider a data matrix $X_{n \times k}$ with $n > k$, its rank cannot exceed $k$. If a square matrix $A$ of order $k$ is of full rank ($\mathrm{rank} = k$), then $A^{-1}$ exists: the matrix is invertible. The rank of a product of matrices cannot exceed either of their ranks, and may be zero.

### Quadratic forms and positive definite matrices

If $A$ is an $n \times n$ symmetric matrix, then the quadratic form associated with $A$ is the scalar function

$$Q = x'Ax = \sum_{i=1}^{n} a_{ii} x_i^2 + 2 \sum_{i=1}^{n} \sum_{j>i} a_{ij} x_i x_j$$

where $x$ is any $n$-vector. If $x'Ax > 0$ for all $n$-vectors $x$ except $x = 0$, then $A$ is said to be positive definite (p.d.). If $x'Ax \geq 0$ for all $x$, then $A$ is said to be positive semi-definite (p.s.d.). A p.d. matrix $A$ has all positive diagonal elements, a positive determinant, and an inverse $A^{-1}$ which is also p.d. For any $X_{n \times k}$, the products $X'X$ and $XX'$ are p.s.d. If $X_{n \times k}$ with $n > k$ has rank $k$, then $X'X$ is p.d. and nonsingular, implying that $(X'X)^{-1}$ exists. This is the relevant concern for regression, where we have a data matrix $X$ of $n$ observations on $k$ variables or regressors, with $n > k$. If those regressors are linearly independent, so that $X$ is of full rank $k$, we can invert the matrix $X'X$: a key step in computing linear regression estimates.
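Full column rank, the positive definiteness of $X'X$, and the effect of perfect collinearity can all be demonstrated numerically. A numpy sketch on simulated data (the eigenvalue and Cholesky checks are standard numerical criteria for p.d., not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 50, 3
X = rng.normal(size=(n, k))
assert np.linalg.matrix_rank(X) == k       # full column rank

A = X.T @ X                                # X'X: p.d. when rank(X) = k
for _ in range(5):
    x = rng.normal(size=k)
    assert x @ A @ x > 0                   # quadratic form is positive
assert np.all(np.linalg.eigvalsh(A) > 0)   # all eigenvalues positive
np.linalg.cholesky(A)                      # succeeds only for p.d. matrices

# Perfect collinearity: append a column that is an exact linear
# combination of the others. The rank does not increase, and X'X
# for the augmented matrix is singular.
X_bad = np.column_stack([X, X[:, 0] + 2 * X[:, 1]])
assert np.linalg.matrix_rank(X_bad) == k   # still k, not k + 1
```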

### Idempotent matrices

If $A$ is an $n \times n$ symmetric matrix, it is said to be idempotent if and only if $AA = A$. The identity matrix, playing the role of the number 1 in ordinary algebra, is idempotent. An idempotent matrix has rank equal to its trace, and it is p.s.d. If we have a data matrix $X_{n \times k}$ with $\mathrm{rank}(X) = k$, then

$$P = X(X'X)^{-1}X' \tag{1}$$

$$M = I_n - X(X'X)^{-1}X' = I_n - P \tag{2}$$

are both symmetric, idempotent matrices. $P_{n \times n}$ has rank $k$, while $M_{n \times n}$ has rank $n - k$, since the trace of $P$ equals that of $I_k$, and an idempotent matrix's trace equals its rank.
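The claimed properties of $P$ and $M$ in equations (1)-(2) can be verified directly. An illustrative numpy check on a random data matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 20, 4
X = rng.normal(size=(n, k))

P = X @ np.linalg.inv(X.T @ X) @ X.T   # projection onto the columns of X
M = np.eye(n) - P                      # the "residual maker"

assert np.allclose(P @ P, P)           # idempotent
assert np.allclose(M @ M, M)
assert np.allclose(P, P.T)             # symmetric
assert np.isclose(np.trace(P), k)      # rank(P) = tr(P) = k
assert np.isclose(np.trace(M), n - k)  # rank(M) = tr(M) = n - k
```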

### Matrix differentiation

For $n$-vectors $a$ and $x$, define $f(x) = a'x$. Then the derivative of the function with respect to its (vector) argument is $\partial f / \partial x = a'$, a $1 \times n$ vector. For an $n \times n$ symmetric matrix $A$ with quadratic form $Q = x'Ax$, the derivative of $Q$ with respect to its vector argument is also a $1 \times n$ vector:

$$\partial Q / \partial x = 2x'A$$

### Moments and distributions of random vectors

Operations on random variables can be expressed in terms of vectors of random variables. If $y$ is an $n$-element random vector, then its expected value $E[y]$ is merely the $n$-vector of the expected values of its elements. This generalizes to a random matrix. A linear transformation of $y$ with nonrandom matrices yields

$$E[Ay + b] = A\,E[y] + b$$

If $y$ is an $n$-element random vector, then its variance-covariance matrix or VCE is the symmetric matrix

$$\mathrm{Var}(y) = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2n} \\ \vdots & & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \sigma_n^2 \end{pmatrix}$$

where $\sigma_j^2$ is the variance of $y_j$ and $\sigma_{ij}$ is the covariance of $y_i$ and $y_j$. Just as we can perform algebra on scalar variances and covariances, we can operate on the VCE of $y$. Some properties:

(1) If $a$ is an $n$-element nonrandom vector, $\mathrm{Var}(a'y) = a'\,\mathrm{Var}(y)\,a \geq 0$. (2) If $\mathrm{Var}(a'y) > 0 \; \forall a \neq 0$, $\mathrm{Var}(y)$ is p.d. and possesses an inverse. (3) If $\mu = E[y]$, then $\mathrm{Var}(y) = E[(y - \mu)(y - \mu)']$. (4) If the elements of $y$ are uncorrelated, $\mathrm{Var}(y)$ is a diagonal matrix. This is the assumption of independence of the elements of the random vector $y$: for instance, of the error terms of a regression equation. (5) If in addition $\mathrm{Var}(y_j) = \sigma^2 \; \forall j$, then $\mathrm{Var}(y) = \sigma^2 I_n$. This is the assumption of homoskedasticity of the elements of the random vector $y$: for instance, of the error terms of a regression equation. (6) For nonrandom $A_{m \times n}$ and $b_{m \times 1}$, $\mathrm{Var}(Ay + b) = A\,\mathrm{Var}(y)\,A'$.

If $y$ is an $n$-element multivariate Normal random vector with mean vector $\mu$ and VCE $\Sigma$, we write $y \sim N(\mu, \Sigma)$. Properties of the multivariate Normal distribution:

(1) If $y \sim N(\mu, \Sigma)$, each element of $y$ is Normally distributed. (2) If $y \sim N(\mu, \Sigma)$, any two elements of $y$ are independent if and only if they are uncorrelated ($\sigma_{ij} = 0$). (3) If $y \sim N(\mu, \Sigma)$ and $A$, $b$ are nonrandom, then $Ay + b \sim N(A\mu + b, \, A\Sigma A')$.

A $\chi^2_n$ random variable is the sum of $n$ squared independent standard Normal variables. If $u \sim N(0, I_n)$, then $u'u \sim \chi^2_n$.

A $t$-distributed random variable is the ratio of a standard Normal variable $Z$ to the square root of a $\chi^2_n$ random variable $X$ divided by its degrees of freedom, where $Z$ and $X$ are independent. Let $u \sim N(0, I_n)$, let $c$ be a nonrandom $n$-vector, and let $A$ be a nonrandom $n \times n$ symmetric, idempotent matrix with rank $q$, with $Ac = 0$. Then

$$\frac{c'u / \sqrt{c'c}}{\sqrt{u'Au / q}} \sim t_q$$
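Property (6) above, and property (3) of the multivariate Normal, can be illustrated by simulation: the sample VCE of $Ay + b$ should match $A \Sigma A'$. A numpy sketch with a made-up covariance matrix $\Sigma$ (all numbers here are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate a 3-element random vector y with known VCE Sigma.
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
L = np.linalg.cholesky(Sigma)
y = (L @ rng.normal(size=(3, 200_000))).T   # rows are draws of y

A = np.array([[1.0, -1.0, 0.0],
              [0.0,  2.0, 1.0]])
b = np.array([5.0, -2.0])

z = y @ A.T + b                       # the transformed vector Ay + b
sample_vce = np.cov(z, rowvar=False)  # sample variance-covariance matrix

# Check Var(Ay + b) = A Sigma A' up to sampling error.
assert np.allclose(sample_vce, A @ Sigma @ A.T, atol=0.1)
```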

An $F$-distributed random variable is the ratio of two independent $\chi^2$ random variables, each divided by its respective degrees of freedom. If $u \sim N(0, I_n)$ and $A$, $B$ are $n \times n$ nonrandom, symmetric, idempotent matrices with $\mathrm{rank}(A) = k_1$, $\mathrm{rank}(B) = k_2$, and $AB = 0$ (so that the two quadratic forms are independent), then

$$\frac{u'Au / k_1}{u'Bu / k_2} \sim F_{k_1, k_2}$$

## Appendix E: The linear regression model in matrix form

### OLS estimation

The multiple linear regression model with $k$ regressors can be written as

$$y_t = \beta_0 + \beta_1 x_{t1} + \beta_2 x_{t2} + \cdots + \beta_k x_{tk} + u_t, \quad t = 1, 2, \ldots, n$$

where $y_t$ is the dependent variable for observation $t$ and $x_1, \ldots, x_k$ are the regressors. $\beta_0$ is the intercept term (constant) and $\beta_1, \ldots, \beta_k$ are the slope parameters. We define $x_t$ as the $1 \times (k+1)$ row vector $(1, x_{t1}, \ldots, x_{tk})$ and $\beta$ as the $(k+1) \times 1$ column vector $(\beta_0, \beta_1, \ldots, \beta_k)'$. Then the model can be written as

$$y_t = x_t \beta + u_t, \quad t = 1, 2, \ldots, n$$

and the entire regression problem as

$$y = X\beta + u \tag{3}$$

where $y$ is the $n \times 1$ vector of observations on the dependent variable, $X_{n \times (k+1)}$ is the matrix of observations on the regressors, including an initial column of 1s, and $u$ is the $n \times 1$ vector of unobservable errors. OLS estimation involves minimizing the sum of squared residuals $e_t = y_t - x_t b$, where $b$ is the vector of estimated parameters corresponding to the population parameters $\beta$. Minimizing the sum of squared $e_t$ is equivalent to minimizing $SSR = e'e$ with respect to the $(k+1)$ elements of $b$. A solution to this optimization problem involves computing the $(k+1)$ derivatives $\partial SSR(b)/\partial b$ and setting them equal to zero: the first order conditions (FOCs). The FOCs give rise to a set of $(k+1)$ simultaneous equations in the $(k+1)$ unknowns $b$. In matrix form, the residuals may be written as $e = y - Xb$, and the FOCs then become

$$X'(y - Xb) = 0 \;\implies\; X'y = (X'X)b \;\implies\; b = (X'X)^{-1}X'y \tag{4}$$

if the inverse exists: that is, if and only if $X$ is of full (column) rank $(k+1)$. If that condition is satisfied, the cross-products matrix $X'X$ is a positive definite matrix. For that to be so, the columns of $X$ must be linearly independent. This rules out the case of perfect collinearity, which arises if one or more of the regressors can be written as a linear combination of the others. Recall that the first column of $X$ is a vector of 1s. This implies that no other column of $X$ can take on a constant value, for it would be a multiple of the first column. Likewise, a situation where the sum of some of the regressors equals a column of ones violates this assumption. This occurs in the dummy variable trap, where a complete set of mutually exclusive and exhaustive (MEE) indicator variables is included in a regression containing a constant term.

Given a solution for $b$, the OLS predicted (fitted) values are $\hat{y} = Xb$, and the calculated residuals are $e = y - \hat{y}$. From Equation (4), the first order conditions may be written as $X'e = 0$. Since the first column of $X$ is a vector of 1s, the FOCs imply that the residuals sum to zero and have an average value of zero. The remaining FOCs imply that each column of $X$ (and any linear combination of those columns) has zero covariance with $e$. This is algebraically implied by OLS, not an assumption. Since $\hat{y}$ is such a linear combination, it is also true that $\hat{y}'e = 0$: the residuals have zero covariance with the predicted values.
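Equation (4) and the residual properties it implies can be reproduced on simulated data. A numpy sketch (the coefficient values and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 200, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])  # column of 1s first
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(size=n)

b = np.linalg.inv(X.T @ X) @ X.T @ y   # b = (X'X)^{-1} X'y, Equation (4)
e = y - X @ b                          # calculated residuals
yhat = X @ b                           # fitted values

assert np.allclose(X.T @ e, 0, atol=1e-8)  # FOC: X'e = 0
assert np.isclose(e.sum(), 0)              # residuals sum to zero (1s column)
assert np.isclose(yhat @ e, 0)             # residuals orthogonal to fitted values

# In numerical practice one solves the least-squares problem directly
# rather than forming the inverse explicitly; the answers agree.
assert np.allclose(b, np.linalg.lstsq(X, y, rcond=None)[0])
```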

### Finite sample properties of OLS

The assumptions underlying the OLS model:

1. Linear in parameters: the model can be written as in Equation (3), with the observed $n$-vector $y$, observed $n \times (k+1)$ matrix $X$, and $n$-vector $u$ of unobserved errors.

2. No perfect collinearity: the matrix $X$ has rank $(k+1)$.

3. Zero conditional mean: conditional on the matrix $X$, each element of $u$ has zero mean: $E[u_t \mid X] = 0 \; \forall t$.

4. Spherical disturbances: $\mathrm{Var}(u \mid X) = \sigma^2 I_n$. This combines the two assumptions of homoskedasticity, $\mathrm{Var}(u_t \mid X) = \sigma^2 \; \forall t$, and independence, $\mathrm{Cov}(u_t, u_s \mid X) = 0 \; \forall t \neq s$. The first assumption implies that the $u_t$ are identically distributed, while the second implies that they are independently distributed or, in a time series context, free of serial correlation. Taken together, they allow us to say that the elements of $u$ are i.i.d. random variables, so that $u$ has a scalar variance-covariance matrix.

Given these four assumptions, we may prove several theorems related to the OLS estimator.

1. Unbiasedness of OLS: under assumptions 1, 2, and 3, the OLS estimator $b$ is unbiased for $\beta$:

$$\begin{aligned} b &= (X'X)^{-1}X'y \\ &= (X'X)^{-1}X'(X\beta + u) \\ &= (X'X)^{-1}(X'X)\beta + (X'X)^{-1}X'u \\ &= \beta + (X'X)^{-1}X'u \end{aligned} \tag{5}$$

Taking the conditional expectation,

$$E[b \mid X] = \beta + (X'X)^{-1}X'\,E[u \mid X] = \beta + (X'X)^{-1}X' \cdot 0 = \beta \tag{6}$$

so that $b$ is unbiased.

2. VCE of the OLS estimator: under assumptions 1, 2, 3, and 4,

$$\mathrm{Var}(b \mid X) = \sigma^2 (X'X)^{-1}. \tag{7}$$

From the last line of Equation (5), $b - \beta = (X'X)^{-1}X'u$, so

$$\mathrm{Var}(b \mid X) = \mathrm{Var}[(X'X)^{-1}X'u \mid X] = (X'X)^{-1}X'\,[\mathrm{Var}(u \mid X)]\,X(X'X)^{-1} \tag{8}$$

Crucially depending on assumption 4, we can then write

$$\mathrm{Var}(b \mid X) = (X'X)^{-1}X'(\sigma^2 I_n)X(X'X)^{-1} = \sigma^2 (X'X)^{-1}(X'X)(X'X)^{-1} = \sigma^2 (X'X)^{-1} \tag{9}$$

This conditional VCE depends on the unknown parameter $\sigma^2$, but replacing that with its unbiased estimate $s^2$ makes it an operational formula.

3. Gauss-Markov: under assumptions 1, 2, 3, and 4, $b$ is the Best Linear Unbiased Estimator of $\beta$ ($b$ is BLUE). Any linear estimator of $\beta$ can be written as

$$\hat{\beta} = A'y \tag{10}$$

where $A$ is an $n \times (k+1)$ matrix whose elements are not functions of $y$ but may be functions of $X$. Given the model of Equation (3), we may write $\hat{\beta}$ as

$$\hat{\beta} = A'(X\beta + u) = (A'X)\beta + A'u. \tag{11}$$

We may then write the conditional expectation of $\hat{\beta}$ as

$$E[\hat{\beta} \mid X] = A'X\beta + E[A'u \mid X] = A'X\beta + A'\,E[u \mid X] = A'X\beta \tag{12}$$

the last line following from assumption 3. For $\hat{\beta}$ to be an unbiased estimator, it must be that $E[\hat{\beta} \mid X] = \beta \; \forall \beta$. Because $A'X$ is a $(k+1) \times (k+1)$ matrix, unbiasedness requires that $A'X = I_{k+1}$. From Equation (11) we have

$$\mathrm{Var}(\hat{\beta} \mid X) = A'[\mathrm{Var}(u \mid X)]A = \sigma^2 A'A \tag{13}$$

invoking assumption 4 (i.i.d. disturbances). Therefore

$$\begin{aligned} \mathrm{Var}(\hat{\beta} \mid X) - \mathrm{Var}(b \mid X) &= \sigma^2 [A'A - (X'X)^{-1}] \\ &= \sigma^2 [A'A - A'X(X'X)^{-1}X'A] \\ &= \sigma^2 A'[I_n - X(X'X)^{-1}X']A \\ &= \sigma^2 A'MA \end{aligned} \tag{14}$$

where $M = I_n - X(X'X)^{-1}X'$, a symmetric and idempotent matrix, so that $A'MA$ is positive semi-definite (p.s.d.) for any matrix $A$. Equation (14) represents the difference between the VCE of an arbitrary linear unbiased estimator of $\beta$ and the VCE of the OLS estimator. That difference is the null matrix if and only if $A'A = (X'X)^{-1}$: that is, if we choose $A$ to reproduce the OLS estimator. For any other choice of $A$, the difference is a positive semi-definite matrix, which implies that at least one of the diagonal elements is larger in the first matrix than in the second. That is, the maximum-precision estimator of each element of $\beta$ can only be achieved by OLS. Any other linear, unbiased estimator of $\beta$ has a larger variance for at least one element of $\beta$. Thus OLS is BLUE: the Best (minimum-variance, or most efficient) Linear Unbiased Estimator of $\beta$.

4. Unbiasedness of $s^2$: the unbiased estimator of the error variance $\sigma^2$ can be calculated as $s^2 = e'e/(n - k - 1)$, with $E[s^2 \mid X] = \sigma^2 \; \forall \sigma^2 > 0$. First,

$$e = y - Xb = y - X[(X'X)^{-1}X'y] = My = Mu \tag{15}$$

where $M$ is defined as above (the last step uses $MX = 0$, so $My = M(X\beta + u) = Mu$). Since $M$ is symmetric and idempotent,

$$e'e = u'M'Mu = u'Mu \tag{16}$$

which is a scalar, equal to its own trace. So

$$\begin{aligned} E[u'Mu \mid X] &= E[\mathrm{tr}(u'Mu) \mid X] \\ &= E[\mathrm{tr}(Muu') \mid X] \\ &= \mathrm{tr}(M\,E[uu' \mid X]) \\ &= \mathrm{tr}(M\sigma^2 I_n) \\ &= \sigma^2\,\mathrm{tr}(M) \\ &= \sigma^2 (n - k - 1) \end{aligned} \tag{17}$$

because $\mathrm{tr}(M) = \mathrm{tr}(I_n) - \mathrm{tr}[X(X'X)^{-1}X'] = n - \mathrm{tr}[(X'X)^{-1}X'X] = n - \mathrm{tr}(I_{k+1}) = n - (k+1)$. Therefore,

$$E[s^2 \mid X] = E[u'Mu \mid X]/(n - k - 1) = \sigma^2 \tag{18}$$
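Both $\mathrm{tr}(M) = n - k - 1$ and the unbiasedness of $s^2$ can be checked numerically, the latter by Monte Carlo. A numpy sketch holding $X$ fixed across replications (the sample size, replication count, and $\sigma^2$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 30, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])  # k + 1 columns
M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T
assert np.isclose(np.trace(M), n - k - 1)   # tr(M) = n - (k + 1)

# Monte Carlo: the average of s^2 across replications should be
# close to the true sigma^2 (here sigma^2 = 4), since E[s^2|X] = sigma^2.
sigma2 = 4.0
s2_draws = []
for _ in range(5000):
    u = np.sqrt(sigma2) * rng.normal(size=n)
    e = M @ u                                # e = Mu, Equation (15)
    s2_draws.append((e @ e) / (n - k - 1))   # s^2 = e'e/(n - k - 1)

print(np.mean(s2_draws))   # close to 4.0
```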

### Statistical inference

With an additional assumption,

5. Normality: conditional on $X$, the $u_t$ are independently and identically distributed (i.i.d.) as Normal$(0, \sigma^2)$; that is, the vector of errors $u$, conditional on $X$, is distributed multivariate Normal$(0, \sigma^2 I_n)$,

we may present the theorem:

Normality of $b$: under assumptions 1, 2, 3, 4, and 5, $b$ conditional on $X$ is distributed as multivariate Normal$(\beta, \sigma^2 (X'X)^{-1})$,

with the corollary that under the null hypothesis, $t$-statistics have a $t_{n-k-1}$ distribution.

Under normality, $b$ is the minimum variance unbiased estimator of $\beta$, conditional on $X$, in the sense that it attains the Cramér-Rao lower bound (CRLB) for the VCE of unbiased estimators of $\beta$. The CRLB defines the minimum variance possible for any unbiased estimator, linear or nonlinear. Since OLS attains that bound, it is the most precise unbiased estimator available.
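Pulling the pieces of Appendix E together, the full OLS computation, coefficients, standard errors from $s^2(X'X)^{-1}$, and $t$-statistics, fits in a few lines. A numpy sketch on simulated data (in Stata this is simply `regress`; the true parameter values below are arbitrary):

```python
import numpy as np

# Simulated data; true beta = (1, 2, -0.5), sigma = 1.5 (arbitrary choices).
rng = np.random.default_rng(7)
n, k = 500, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -0.5]) + 1.5 * rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y          # OLS estimates, Equation (4)
e = y - X @ b                  # residuals
s2 = (e @ e) / (n - k - 1)     # unbiased estimate of sigma^2
vce = s2 * XtX_inv             # estimated Var(b|X), Equation (9)
se = np.sqrt(np.diag(vce))     # reported standard errors
t_stats = b / se               # t-statistics for H0: beta_j = 0

assert np.allclose(vce, vce.T) and np.all(np.diag(vce) > 0)
print(np.column_stack([b, se, t_stats]))   # coefficient table
```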


### DETERMINANTS. b 2. x 2

DETERMINANTS 1 Systems of two equations in two unknowns A system of two equations in two unknowns has the form a 11 x 1 + a 12 x 2 = b 1 a 21 x 1 + a 22 x 2 = b 2 This can be written more concisely in

### Math 312 Homework 1 Solutions

Math 31 Homework 1 Solutions Last modified: July 15, 01 This homework is due on Thursday, July 1th, 01 at 1:10pm Please turn it in during class, or in my mailbox in the main math office (next to 4W1) Please

### Matrix Algebra, Class Notes (part 1) by Hrishikesh D. Vinod Copyright 1998 by Prof. H. D. Vinod, Fordham University, New York. All rights reserved.

Matrix Algebra, Class Notes (part 1) by Hrishikesh D. Vinod Copyright 1998 by Prof. H. D. Vinod, Fordham University, New York. All rights reserved. 1 Sum, Product and Transpose of Matrices. If a ij with

### LINEAR ALGEBRA. September 23, 2010

LINEAR ALGEBRA September 3, 00 Contents 0. LU-decomposition.................................... 0. Inverses and Transposes................................. 0.3 Column Spaces and NullSpaces.............................

### Similarity and Diagonalization. Similar Matrices

MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that

### A linear combination is a sum of scalars times quantities. Such expressions arise quite frequently and have the form

Section 1.3 Matrix Products A linear combination is a sum of scalars times quantities. Such expressions arise quite frequently and have the form (scalar #1)(quantity #1) + (scalar #2)(quantity #2) +...

### Introduction to Matrix Algebra

4 Introduction to Matrix Algebra In the previous chapter, we learned the algebraic results that form the foundation for the study of factor analysis and structural equation modeling. These results, powerful

### NOTES on LINEAR ALGEBRA 1

School of Economics, Management and Statistics University of Bologna Academic Year 205/6 NOTES on LINEAR ALGEBRA for the students of Stats and Maths This is a modified version of the notes by Prof Laura

### Lecture 6. Inverse of Matrix

Lecture 6 Inverse of Matrix Recall that any linear system can be written as a matrix equation In one dimension case, ie, A is 1 1, then can be easily solved as A x b Ax b x b A 1 A b A 1 b provided that

### INTRODUCTORY LINEAR ALGEBRA WITH APPLICATIONS B. KOLMAN, D. R. HILL

SOLUTIONS OF THEORETICAL EXERCISES selected from INTRODUCTORY LINEAR ALGEBRA WITH APPLICATIONS B. KOLMAN, D. R. HILL Eighth Edition, Prentice Hall, 2005. Dr. Grigore CĂLUGĂREANU Department of Mathematics

### Similar matrices and Jordan form

Similar matrices and Jordan form We ve nearly covered the entire heart of linear algebra once we ve finished singular value decompositions we ll have seen all the most central topics. A T A is positive

### Estimation and Inference in Cointegration Models Economics 582

Estimation and Inference in Cointegration Models Economics 582 Eric Zivot May 17, 2012 Tests for Cointegration Let the ( 1) vector Y be (1). Recall, Y is cointegrated with 0 cointegrating vectors if there

### Using row reduction to calculate the inverse and the determinant of a square matrix

Using row reduction to calculate the inverse and the determinant of a square matrix Notes for MATH 0290 Honors by Prof. Anna Vainchtein 1 Inverse of a square matrix An n n square matrix A is called invertible

### Linear Algebra Review. Vectors

Linear Algebra Review By Tim K. Marks UCSD Borrows heavily from: Jana Kosecka kosecka@cs.gmu.edu http://cs.gmu.edu/~kosecka/cs682.html Virginia de Sa Cogsci 8F Linear Algebra review UCSD Vectors The length

### SOLVING LINEAR SYSTEMS

SOLVING LINEAR SYSTEMS Linear systems Ax = b occur widely in applied mathematics They occur as direct formulations of real world problems; but more often, they occur as a part of the numerical analysis

### CS3220 Lecture Notes: QR factorization and orthogonal transformations

CS3220 Lecture Notes: QR factorization and orthogonal transformations Steve Marschner Cornell University 11 March 2009 In this lecture I ll talk about orthogonal matrices and their properties, discuss

### 2. Linear regression with multiple regressors

2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions

### Chapter 3: The Multiple Linear Regression Model

Chapter 3: The Multiple Linear Regression Model Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans November 23, 2013 Christophe Hurlin (University of Orléans) Advanced Econometrics

### Notes on Applied Linear Regression

Notes on Applied Linear Regression Jamie DeCoster Department of Social Psychology Free University Amsterdam Van der Boechorststraat 1 1081 BT Amsterdam The Netherlands phone: +31 (0)20 444-8935 email:

### The Delta Method and Applications

Chapter 5 The Delta Method and Applications 5.1 Linear approximations of functions In the simplest form of the central limit theorem, Theorem 4.18, we consider a sequence X 1, X,... of independent and

### We know a formula for and some properties of the determinant. Now we see how the determinant can be used.

Cramer s rule, inverse matrix, and volume We know a formula for and some properties of the determinant. Now we see how the determinant can be used. Formula for A We know: a b d b =. c d ad bc c a Can we

### Multivariate normal distribution and testing for means (see MKB Ch 3)

Multivariate normal distribution and testing for means (see MKB Ch 3) Where are we going? 2 One-sample t-test (univariate).................................................. 3 Two-sample t-test (univariate).................................................

### Hypothesis Testing Level I Quantitative Methods. IFT Notes for the CFA exam

Hypothesis Testing 2014 Level I Quantitative Methods IFT Notes for the CFA exam Contents 1. Introduction... 3 2. Hypothesis Testing... 3 3. Hypothesis Tests Concerning the Mean... 10 4. Hypothesis Tests

### Vectors. Philippe B. Laval. Spring 2012 KSU. Philippe B. Laval (KSU) Vectors Spring /

Vectors Philippe B Laval KSU Spring 2012 Philippe B Laval (KSU) Vectors Spring 2012 1 / 18 Introduction - Definition Many quantities we use in the sciences such as mass, volume, distance, can be expressed

### Elementary Matrices and The LU Factorization

lementary Matrices and The LU Factorization Definition: ny matrix obtained by performing a single elementary row operation (RO) on the identity (unit) matrix is called an elementary matrix. There are three

### Continued Fractions and the Euclidean Algorithm

Continued Fractions and the Euclidean Algorithm Lecture notes prepared for MATH 326, Spring 997 Department of Mathematics and Statistics University at Albany William F Hammond Table of Contents Introduction

### T ( a i x i ) = a i T (x i ).

Chapter 2 Defn 1. (p. 65) Let V and W be vector spaces (over F ). We call a function T : V W a linear transformation form V to W if, for all x, y V and c F, we have (a) T (x + y) = T (x) + T (y) and (b)

### Cofactor Expansion: Cramer s Rule

Cofactor Expansion: Cramer s Rule MATH 322, Linear Algebra I J. Robert Buchanan Department of Mathematics Spring 2015 Introduction Today we will focus on developing: an efficient method for calculating

### Orthogonal Diagonalization of Symmetric Matrices

MATH10212 Linear Algebra Brief lecture notes 57 Gram Schmidt Process enables us to find an orthogonal basis of a subspace. Let u 1,..., u k be a basis of a subspace V of R n. We begin the process of finding

### Clustering in the Linear Model

Short Guides to Microeconometrics Fall 2014 Kurt Schmidheiny Universität Basel Clustering in the Linear Model 2 1 Introduction Clustering in the Linear Model This handout extends the handout on The Multiple

### The Characteristic Polynomial

Physics 116A Winter 2011 The Characteristic Polynomial 1 Coefficients of the characteristic polynomial Consider the eigenvalue problem for an n n matrix A, A v = λ v, v 0 (1) The solution to this problem

### Vector algebra Christian Miller CS Fall 2011

Vector algebra Christian Miller CS 354 - Fall 2011 Vector algebra A system commonly used to describe space Vectors, linear operators, tensors, etc. Used to build classical physics and the vast majority

### SYSTEMS OF EQUATIONS

SYSTEMS OF EQUATIONS 1. Examples of systems of equations Here are some examples of systems of equations. Each system has a number of equations and a number (not necessarily the same) of variables for which

### F Matrix Calculus F 1

F Matrix Calculus F 1 Appendix F: MATRIX CALCULUS TABLE OF CONTENTS Page F1 Introduction F 3 F2 The Derivatives of Vector Functions F 3 F21 Derivative of Vector with Respect to Vector F 3 F22 Derivative

### Linear Maps. Isaiah Lankham, Bruno Nachtergaele, Anne Schilling (February 5, 2007)

MAT067 University of California, Davis Winter 2007 Linear Maps Isaiah Lankham, Bruno Nachtergaele, Anne Schilling (February 5, 2007) As we have discussed in the lecture on What is Linear Algebra? one of

### NOTES ON LINEAR TRANSFORMATIONS

NOTES ON LINEAR TRANSFORMATIONS Definition 1. Let V and W be vector spaces. A function T : V W is a linear transformation from V to W if the following two properties hold. i T v + v = T v + T v for all

### Review Jeopardy. Blue vs. Orange. Review Jeopardy

Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round \$200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?

### 1 Short Introduction to Time Series

ECONOMICS 7344, Spring 202 Bent E. Sørensen January 24, 202 Short Introduction to Time Series A time series is a collection of stochastic variables x,.., x t,.., x T indexed by an integer value t. The

### Notes on Determinant

ENGG2012B Advanced Engineering Mathematics Notes on Determinant Lecturer: Kenneth Shum Lecture 9-18/02/2013 The determinant of a system of linear equations determines whether the solution is unique, without

### [1] Diagonal factorization

8.03 LA.6: Diagonalization and Orthogonal Matrices [ Diagonal factorization [2 Solving systems of first order differential equations [3 Symmetric and Orthonormal Matrices [ Diagonal factorization Recall:

### Mean squared error matrix comparison of least aquares and Stein-rule estimators for regression coefficients under non-normal disturbances

METRON - International Journal of Statistics 2008, vol. LXVI, n. 3, pp. 285-298 SHALABH HELGE TOUTENBURG CHRISTIAN HEUMANN Mean squared error matrix comparison of least aquares and Stein-rule estimators

### Recall that two vectors in are perpendicular or orthogonal provided that their dot

Orthogonal Complements and Projections Recall that two vectors in are perpendicular or orthogonal provided that their dot product vanishes That is, if and only if Example 1 The vectors in are orthogonal

### Econometrics Simple Linear Regression

Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight

### Tensors on a vector space

APPENDIX B Tensors on a vector space In this Appendix, we gather mathematical definitions and results pertaining to tensors. The purpose is mostly to introduce the modern, geometrical view on tensors,

### 1 VECTOR SPACES AND SUBSPACES

1 VECTOR SPACES AND SUBSPACES What is a vector? Many are familiar with the concept of a vector as: Something which has magnitude and direction. an ordered pair or triple. a description for quantities such

### Matrix Solution of Equations

Contents 8 Matrix Solution of Equations 8.1 Solution by Cramer s Rule 2 8.2 Solution by Inverse Matrix Method 13 8.3 Solution by Gauss Elimination 22 Learning outcomes In this Workbook you will learn to

### Algebraic and Combinatorial Circuits

C H A P T E R Algebraic and Combinatorial Circuits Algebraic circuits combine operations drawn from an algebraic system. In this chapter we develop algebraic and combinatorial circuits for a variety of

### Section 5.3. Section 5.3. u m ] l jj. = l jj u j + + l mj u m. v j = [ u 1 u j. l mj

Section 5. l j v j = [ u u j u m ] l jj = l jj u j + + l mj u m. l mj Section 5. 5.. Not orthogonal, the column vectors fail to be perpendicular to each other. 5..2 his matrix is orthogonal. Check that

### Brief Introduction to Vectors and Matrices

CHAPTER 1 Brief Introduction to Vectors and Matrices In this chapter, we will discuss some needed concepts found in introductory course in linear algebra. We will introduce matrix, vector, vector-valued

### Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby