# Journal of Statistical Software

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 JSS Journal of Statistical Software August 26, Volume 16, Issue 8. Some Algorithms for the Conditional Mean Vector and Covariance Matrix John F. Monahan North Carolina State University Abstract We consider here the problem of computing the mean vector and covariance matrix for a conditional normal distribution, considering especially a sequence of problems where the conditioning variables are changing. The sweep operator provides one simple general approach that is easy to implement and update. A second, more goal-oriented general method avoids explicit computation of the vector and matrix, while enabling easy evaluation of the conditional density for likelihood computation or easy generation from the conditional distribution. The covariance structure that arises from the special case of an ARMA(p, q) time series can be exploited for substantial improvements in computational efficiency. Keywords: conditional distribution, sweep operator, ARMA process. 1. Introduction Consider Z Normal n (µ, A) and partition Z = (Z 1, Z 2 ) T, similarly, µ = (µ 1, µ 2 ) T, and the covariance matrix A, to write ( ) Z1 µ1 A11 A N Z n, 12 s 2 µ 2 A 21 A 22 n s. (1) The conditional distribution of Z 1 given Z 2 = is also multivariate normal, given by where and (Z 1 Z 2 = ) N s (µ 1.2, A 1.2 ), (2) µ 1.2 = µ 1 + A 12 A 1 22 ( ),

2 2 Some Algorithms for the Conditional Mean Vector and Covariance Matrix A 1.2 = A 11 A 12 A 1 22 A 21. Note that the conditional density of (Z 1 Z 2 = ) equals the ratio of the joint normal density of (Z 1, Z 2 ) and the marginal normal density of Z 2 evaluated at Z 2 =. Taking µ = to simplify, cancelling constants, and taking the log of both sides, we get the useful result (z 1 A 12 A 1 22 ) T (A 1.2 ) 1 (z 1 A 12 A 1 22 ) = z T A 1 z z T 2 A 1 22 (3) which can be rewritten as z T A 1 z = (z 1 A 12 A 1 22 ) T (A 1.2 ) 1 (z 1 A 12 A 1 22 ) + z T 2 A (4) Amidst the cancellation of the constants is a useful relationship of the determinants: A 1.2 = A / A 22. (5) We consider here the problem of computing the conditional mean vector µ 1.2 and covariance matrix A 1.2 where n is large, s is modest in size, and where the partitioning is arbitrary. The applications envisioned include both generation from the conditional distribution and inference in cases of censored or missing values. Especially interesting are applications where the partitioning may be repeatedly changing, as may occur in some MCMC techniques. In Sections 2 and 3, we consider the case of the general multivariate normal distribution, and the special case where Z arises from a Gaussian autoregressive-moving average (ARMA(p, q)) process is considered in Section 4. FORTRAN 95 code implementing some of these results is included with this manuscript as gsubex1.f95 to gsubex3.f A general approach with the sweep operator Following standard computational techniques (see, for example, Golub and van Loan 1996; Monahan 21; Stewart 1973) such as Gaussian elimination or Cholesky factorization, the direct computation of µ 1.2 and A 1.2 will take O(n 3 ) floating point operations, owing from computing the inverse (solving a system of equations) in A 22 which is O(n) in size. However, if the partitioning were to be changed slightly, little of these results could be used to address the modified problem. The sweep operator is more appropriate here, being easy to update while taking nearly the same computations to form µ 1.2 and A 1.2. The sweep operator is widely used in regression problems (see Stiefel 1963; Goodnight 1979; Monahan 21, Section 5.12) computing the following tableau after sweeping the first p rows/columns of the (p + 1) (p + 1) matrix: X T X X T y y T X y T sweep y (X T X) 1 (X T X) 1 X T y y T X(X T X) 1 y T y y T X(X T X) 1 X T y The sweep operator can be viewed as a compact scheme for full column elimination using diagonal pivots v jj. It is usually implemented as a single step, operating on one row/column at a time, so that the inverse is a reciprocal. When an identity matrix is augmented, as below, full elimination pivoting on the first p columns produces the matrix X T X X T y I p Ip (X T X) y T X y T 1 X T y (X T X) 1 y y T y y T X(X T X) 1 X T y y T X(X T X) 1..

3 Journal of Statistical Software 3 Merely overwriting the inverse on top of the matrix, and avoiding storing the known elements of the identity matrix produces the sweep step. The sweep operator can be used to compute the regression coefficients (X T X) 1 X T y and error sum of squares y T y y T X(X T X) 1 X T y. Applying the sweep operator to the partitioned covariance matrix A, the result of sweeping the LAST n s rows/columns produced the following tableau A11 A 12 A 21 A 22 sweep A 1.2 A 12 A 1 22 A 1 22 A 21 A 1 22 which obviously includes the necessary information for the conditional distribution. The ease of computing updates follows from the commutativity and reversibility properties of the sweep operator. The commutativity property means that the order of the sweeping does not matter. Reversibility means that sweeping a particular row/column a second time is the same as not sweeping it in the first place. As a consequence, sweeping a particular row/column merely moves a variable from one subset to the other. For the conditional distribution, the rows/columns corresponding to the variables that are to be conditioned upon are swept. Conditioning on a new variable means sweeping that row/column of the covariance matrix. Removing a variable from the list to be conditioned upon means sweeping that row/column a second time. Each sweep of a row/column takes O(n 2 ) operations. As in computing the determinant from full column pivoting, the determinant of A 22 can be computed from the product of the pivot elements v jj, which can be seen in the following Example 1. Example 1: Let n = 3, and A given below and the following sweep sequence: A = sweep 2 ( µ1 (Z 1, Z 3 Z 2 = ) N 2 µ sweep (Z 1 Z 2 =, Z 3 = z 3 ) N 1 (µ sweep 2 ( µ1 (Z 1, Z 2 Z 3 = z 3 ) N 2 + µ sweep 1 (Z 2 Z 1 = z 1, Z 3 = z 3 ) N 1 (µ /4 + 1/ ( ), /3 v 22 = 4, = 4 ) 4 2 v 33 = 5, 2 6 = 2, 2.7 ) z 3 µ 3 v 22 =.3, 6 = 6 (z 3 µ 3 ), /3 1/ v 11 = 3, z 1 µ 1 z 3 µ 3 ) 3 6 = 18, 3 )

4 4 Some Algorithms for the Conditional Mean Vector and Covariance Matrix Note that the repeating decimals.333 and.166 in the example above are rounded versions of 1/3 and 1/6. In practice, rounding error will accumulate with each sweep step, and the process may need to be restarted when the accumulated error becomes noticeable. 3. A second general method This second general approach aims to solve the two specific problems without explicitly computing the conditional mean vector and covariance matrix. One task is computing the likelihood of the conditional normal distribution, namely the quadratic form (z 1 µ 1.2 ) T (A 1.2 ) 1 (z 1 µ 1.2 ) and the determinant A 1.2. The second task would be to generate a random vector from the conditional distribution. Consider the reverse-cholesky factorization of the covariance matrix A, producing the upper triangular factor U: T A11 A A = 12 = UU T U = 1 U 2 U 1 U 2 U = 1 U T 1 + U 2 U T 2 U 2 U T 3 A 21 A 22 U 3 U 3 U 3 U T 2 U 3 U T 3 The usual Cholesky factorization produces a lower triangular matrix and the induction order goes from upper left to lower right; here the order is reversed and an upper triangular matrix is produced. Now consider the problem of generating from the joint distribution using a random vector v Normal n (, I n ). Partitioning v in the same way, the algebra would follow the simple form z 1 = µ 1 + U 1 v 1 + U 2 v 2 = µ 2 + U 3 v 2. Conditioning on Z 2 = would really mean knowing. If that were so, we could then solve for v 2, solving the triangular system to get v 2 = U 1 3 ( ). Then z 1 could then be generated conditional on the value of, following leading to the result A 1.2 = U 1 U T 1. z 1 = µ 1 + U 1 v 1 + U 2 U 1 3 ( ), For computing the likelihood, the determinant part is easy: A 1.2 = U 1 2, following from (5), since U 3 2 = A 22. The quadratic form is nearly as easy, since from (4) we have (z 1 µ 1.2 ) T (A 1.2 ) 1 (z 1 µ 1.2 ) = (z µ) T A 1 (z µ) ( ) T (A 22 ) 1 ( ). The quadratic form in A 1 can be written in partitioned form as the squared length of U 1 U 2 U 3 1 z1 µ 1 = U 1 1 (z 1 µ 1 ) U 2 U 1 3 ( ) U 1 3 ( ),

5 Journal of Statistical Software 5 while the quadratic form in (A 22 ) 1 is just the squared length of the second partitioned vector above ( ) T (A 22 ) 1 ( ) = U 1 3 ( ) 2 so that the difference is the squared length of the first component (z 1 µ 1.2 ) T (A 1.2 ) 1 (z 1 µ 1.2 ) = U 1 1 (z 1 µ 1 ) U 2 U 1 3 ( ) 2. As presented, this approach offers no particular advantage over any other direct method it merely uses a form of Cholesky decomposition. If the partitioning were changed, however, this Cholesky factor can be updated. The updating procedure due to Clarke (1981) was devised for subset regression computations and follows the same spirit as the sweep operator (see also Daniel, Gragg, Kaufman, and Stewart 1976). The efficiency of the update scheme will depend on the pattern of changes of partitioning. The change in partitioning can be viewed in terms of a permutation matrix P. Conditioning on an additional component (or one fewer) means permuting that component to the partition boundary and then moving the boundary. Here we want to construct the reverse-cholesky (upper triangular) factor of the changed matrix P AP T using P U. The problem is that P U is no longer upper triangular but needs to be returned to that form. This can be done by postmultiplying by sufficient Givens transformations G 1, G 2,..., G m, each an orthogonal matrix, as many as may be needed to return P UG 1 G m to upper triangular form, and restoring the factorization P AP T = (P UG 1 G m )(P UG 1 G m ) T. Example 2: For the covariance matrix A below, the decomposition A = = UUT = provides U that could be easily employed for (Z 1, Z 2 Z 3 = z 3, Z 4 = z 4 ). Now suppose we now want (Z 1, Z 4 Z 2 =, Z 3 = z 3 ), so that A 1.2 =, A = A / A 22 = 596/46 = , and P U = To return this matrix to upper triangular form, first postmultiply by a Givens transformation G 1 defined by elements (4,3) and (4,4), changing everything in columns 3 and 4, and placing a zero in (4,3). Here G 1 is an identity matrix except for (G 1 ) 33 = (G 1 ) 44 = 1 5 and (G 1 ) 34 = (G 1 ) 43 = 2 5. Next construct G 2 using (3,2) and (3,3), changing columns 2 and 3 and placing a zero in (3,2), and return the upper triangular form. The zeros in (2,2) and (2,3) will change with these two transformations, but this will not affect the triangular form.

6 6 Some Algorithms for the Conditional Mean Vector and Covariance Matrix 4. ARMA covariance model The case where Z t, t = 1,..., n follows the ARMA(p, q) time series model Z t φ 1 Z t 1... φ p Z t p = e t θ 1 e t 1... θ q e t q (6) brings some special structure that can be exploited. If the partitioning followed the same simple structure as in (1), then the conditional distribution can be easily constructed. However, we are interested here in the case where the partitioned vectors are Z 1 = (Z S(1), Z S(2),..., Z S(s) ) T and Z 2 = (Z T (1), Z T (2),..., Z T (n s) ) T and where subset list of indices S(j) may be arbitrary and S and T are complementary. A device due to Ansley can be exploited to compute the conditional distribution efficiently. Let Z t, t = 1,..., n follow the ARMA(p, q) process (6) and denote the covariance matrix by A. Now let the n n matrix B have its first p rows all zeros except for 1 on the diagonal, and the last n p rows of the form φ p φ p 1... φ 2 φ 1 1 aligned so that the 1 appears on the diagonal. Ansley (1979) shows that the covariance matrix of BZ, namely BAB T, is a banded positive definite matrix. The bandwidth of BAB T is max(p, q) = m for the first p rows, and q thereafter. Its Cholesky factor L, such that BAB T = LL T, is lower triangular with the same banding structure, and can be computed in O(nm 2 ). Solving a system of equations in L, say to compute L 1 u, can done in O(nm). The space required is only O(m 2 ). As a result, a bilinear form in the inverse of the covariance matrix u T A 1 v = u T (B T L T L 1 B)v = (L 1 Bu) T (L 1 Bv) can be computed in O(nm 2 ) time and O(m 2 ) space. Before applying this result to the arbitrary partitioning problem, and so following the original partitioning in (1), first notice that Is A 1 I s = (A 1.2 ) 1 (7) which can be established through the usual results for the inverse of a partitioned matrix. The remaining results are more complicated. Defining Q(z) = z T A 1 z, and employing result (4), we have the derivative results 2 Q(z)/ z i z j = 2 (A 1.2 ) 1 ij for 1 i, j s, and Q(z)/ z i z1 = = 2 (A 1.2 ) 1 A 12 A 1 22 i

7 Journal of Statistical Software 7 for 1 i s. For computation, we replace the derivative with a difference expression. In the case of a quadratic function Q(z), the second difference will be exact for the second partial derivative, so that Q(z + δe i + δe j ) Q(z + δe i ) Q(z + δe j ) + Q(z) /δ 2 = 2 Q(z)/ z i z j = 2e T i A 1 e j or e T i A 1 e j = (A 1.2 ) 1 ij where e i is the i th elementary vector, verifying (7) above. A similar result using the first difference, at z 1 =, gives Q( + δe i ) Q( leading to the computational route of for 1 i s. e T i A 1 )/δ = Q(z)/ z i z1 = = 2 (A 1.2 ) 1 A 12 A 1 22 i = (A 1.2 ) 1 A 12 A 1 22 i Now to transform these results to the case of arbitrary parititioning, construct the n s matrix P, which is zero everywhere except for elements (P ) S(i),i = 1 for 1 i s, and, similarly, the n (n s) matrix Q, which is zero everywhere except (Q) T (i),i = 1 for 1 i n s. Putting P and Q together as P Then we have the computational formulas Q forms an n n permutation matrix. P T A 1 P = (L 1 BP ) T (L 1 BP ) = (A 1.2 ) 1 and P T A 1 P Q = (L 1 BP ) T (L 1 B P Q ) = (A 1.2 ) 1 A 12 A The key computational advantage of this method is that, as bilinear forms in A 1, the computational burden is only O(nm 2 + nms + s 2 ) for ARMA(p, q) time series models. One remaining detail is that (A 1.2 ) 1 and (A 1.2 ) 1 A 12 A 1 22 (or with ( )) are not the most convenient forms for most applications, which would involve A 1.2 or (A 1.2 ) 1, and

8 8 Some Algorithms for the Conditional Mean Vector and Covariance Matrix A 12 A 1 22 ( ). Equally useful would be a Cholesky factor of (A 1.2 ) 1 and A 12 A 1 22 ( µ 2 ). To compute these in a stable and efficient manner, construct the matrix L 1 BP L 1 BP P Q = L 1 B P P Q and perform Householder transformations H 1,,..., H s H s H 2 H 1 L 1 BP L 1 B P Q = R u to form the upper triangular matrix R. Then R T R = (A 1.2 ) 1 shows that R is a factor, and R 1 u = A 12 A 1 22 ( ) finishes the computations. Computation of the likelihood from the conditional distribution would then take the computational route of (z 1 µ 1.2 ) T (A 1.2 ) 1 (z 1 µ 1.2 ) = R(z 1 µ 1.2 ) T R(z 1 µ 1.2 ) and generation from the conditional distribution would follow z 1 = µ R 1 v, where, as before, v Normal n (, I n ). The steps using Householder transformations are the same orthogonalization steps for a QR factorization in regression; see, for example, Golub and van Loan (1996, Section 6.2), Monahan (21, Section 5.6), or Stewart (1973, Section 5.3). Finally, if the conditioning variables (or its mean) were changed, the resulting vector u would have to be recomputed, taking only O(n). If a variable were moved from S to T, the mean vector would change and u must be updated. If a variable were moved from T to S, then a new column would be added to P, and the mean vector would also have to be updated. Acknowledgements The author thanks the referee for careful reading and checking the code and Sujit Ghosh for suggesting the problem. References Ansley CF (1979). An Algorithm for the Exact Likelihood of a Mixed Autoregressive Moving Average Process. Biometrika, 66, Clarke MRB (1981). AS 163: A Givens Algorithm for Moving from One Linear Model to Another Without Going Back to the Data. Applied Statistics, 3, Daniel JW, Gragg WB, Kaufman L, Stewart GW (1976). Reorthogonalization and Stable Algorithms for Updating the Gram-Schmidt QR Factorization. Mathematics of Computation, 3, Golub GH, van Loan C (1996). Matrix Computations. Johns Hopkins University Press, Baltimore, 3rd edition.

9 Journal of Statistical Software 9 Goodnight JH (1979). A Tutorial on the Sweep Operator. The American Statistician, 33, Monahan JF (21). Numerical Methods of Statistics. Cambridge University Press, New York. Stewart GW (1973). Introduction to Matrix Computations. Academic Press, New York. Stiefel EL (1963). An Introduction to Numerical Mathematics. Academic Press, New York. Affiliation: John F. Monahan Department of Statistics North Carolina State University Raleigh, NC , United States of America URL: Journal of Statistical Software published by the American Statistical Association Volume 16, Issue 8 Submitted: August 26 Accepted:

### 6. Cholesky factorization

6. Cholesky factorization EE103 (Fall 2011-12) triangular matrices forward and backward substitution the Cholesky factorization solving Ax = b with A positive definite inverse of a positive definite matrix

### MAT Solving Linear Systems Using Matrices and Row Operations

MAT 171 8.5 Solving Linear Systems Using Matrices and Row Operations A. Introduction to Matrices Identifying the Size and Entries of a Matrix B. The Augmented Matrix of a System of Equations Forming Augmented

### Scientific Computing: An Introductory Survey

Scientific Computing: An Introductory Survey Chapter 3 Linear Least Squares Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction

### 7. LU factorization. factor-solve method. LU factorization. solving Ax = b with A nonsingular. the inverse of a nonsingular matrix

7. LU factorization EE103 (Fall 2011-12) factor-solve method LU factorization solving Ax = b with A nonsingular the inverse of a nonsingular matrix LU factorization algorithm effect of rounding error sparse

### Linear Systems. Singular and Nonsingular Matrices. Find x 1, x 2, x 3 such that the following three equations hold:

Linear Systems Example: Find x, x, x such that the following three equations hold: x + x + x = 4x + x + x = x + x + x = 6 We can write this using matrix-vector notation as 4 {{ A x x x {{ x = 6 {{ b General

### The basic unit in matrix algebra is a matrix, generally expressed as: a 11 a 12. a 13 A = a 21 a 22 a 23

(copyright by Scott M Lynch, February 2003) Brief Matrix Algebra Review (Soc 504) Matrix algebra is a form of mathematics that allows compact notation for, and mathematical manipulation of, high-dimensional

### CS3220 Lecture Notes: QR factorization and orthogonal transformations

CS3220 Lecture Notes: QR factorization and orthogonal transformations Steve Marschner Cornell University 11 March 2009 In this lecture I ll talk about orthogonal matrices and their properties, discuss

### MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS Systems of Equations and Matrices Representation of a linear system The general system of m equations in n unknowns can be written a x + a 2 x 2 + + a n x n b a

### Linear Algebra Methods for Data Mining

Linear Algebra Methods for Data Mining Saara Hyvönen, Saara.Hyvonen@cs.helsinki.fi Spring 2007 Lecture 3: QR, least squares, linear regression Linear Algebra Methods for Data Mining, Spring 2007, University

### Lecture 5 - Triangular Factorizations & Operation Counts

LU Factorization Lecture 5 - Triangular Factorizations & Operation Counts We have seen that the process of GE essentially factors a matrix A into LU Now we want to see how this factorization allows us

### MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +

### Solution of Linear Systems

Chapter 3 Solution of Linear Systems In this chapter we study algorithms for possibly the most commonly occurring problem in scientific computing, the solution of linear systems of equations. We start

### DETERMINANTS. b 2. x 2

DETERMINANTS 1 Systems of two equations in two unknowns A system of two equations in two unknowns has the form a 11 x 1 + a 12 x 2 = b 1 a 21 x 1 + a 22 x 2 = b 2 This can be written more concisely in

### Introduction to Matrix Algebra

Psychology 7291: Multivariate Statistics (Carey) 8/27/98 Matrix Algebra - 1 Introduction to Matrix Algebra Definitions: A matrix is a collection of numbers ordered by rows and columns. It is customary

### LINEAR ALGEBRA. September 23, 2010

LINEAR ALGEBRA September 3, 00 Contents 0. LU-decomposition.................................... 0. Inverses and Transposes................................. 0.3 Column Spaces and NullSpaces.............................

### 7 Gaussian Elimination and LU Factorization

7 Gaussian Elimination and LU Factorization In this final section on matrix factorization methods for solving Ax = b we want to take a closer look at Gaussian elimination (probably the best known method

### 1 Gaussian Elimination

Contents 1 Gaussian Elimination 1.1 Elementary Row Operations 1.2 Some matrices whose associated system of equations are easy to solve 1.3 Gaussian Elimination 1.4 Gauss-Jordan reduction and the Reduced

### Operation Count; Numerical Linear Algebra

10 Operation Count; Numerical Linear Algebra 10.1 Introduction Many computations are limited simply by the sheer number of required additions, multiplications, or function evaluations. If floating-point

### Numerical Analysis Lecture Notes

Numerical Analysis Lecture Notes Peter J. Olver 4. Gaussian Elimination In this part, our focus will be on the most basic method for solving linear algebraic systems, known as Gaussian Elimination in honor

### Practical Numerical Training UKNum

Practical Numerical Training UKNum 7: Systems of linear equations C. Mordasini Max Planck Institute for Astronomy, Heidelberg Program: 1) Introduction 2) Gauss Elimination 3) Gauss with Pivoting 4) Determinants

### Linear Equations in Linear Algebra

1 Linear Equations in Linear Algebra 1.2 Row Reduction and Echelon Forms ECHELON FORM A rectangular matrix is in echelon form (or row echelon form) if it has the following three properties: 1. All nonzero

### Linear Least Squares

Linear Least Squares Suppose we are given a set of data points {(x i,f i )}, i = 1,...,n. These could be measurements from an experiment or obtained simply by evaluating a function at some points. One

### Recall that two vectors in are perpendicular or orthogonal provided that their dot

Orthogonal Complements and Projections Recall that two vectors in are perpendicular or orthogonal provided that their dot product vanishes That is, if and only if Example 1 The vectors in are orthogonal

### Solving Systems of Linear Equations

LECTURE 5 Solving Systems of Linear Equations Recall that we introduced the notion of matrices as a way of standardizing the expression of systems of linear equations In today s lecture I shall show how

### Linear Dependence Tests

Linear Dependence Tests The book omits a few key tests for checking the linear dependence of vectors. These short notes discuss these tests, as well as the reasoning behind them. Our first test checks

### Solving Sets of Equations. 150 B.C.E., 九章算術 Carl Friedrich Gauss,

Solving Sets of Equations 5 B.C.E., 九章算術 Carl Friedrich Gauss, 777-855 Gaussian-Jordan Elimination In Gauss-Jordan elimination, matrix is reduced to diagonal rather than triangular form Row combinations

### Linear Systems COS 323

Linear Systems COS 323 Last time: Constrained Optimization Linear constrained optimization Linear programming (LP) Simplex method for LP General optimization With equality constraints: Lagrange multipliers

### Similarity and Diagonalization. Similar Matrices

MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that

### Institute for Advanced Computer Studies. Department of Computer Science. The Triangular Matrices of Gaussian Elimination. and Related Decompositions

University of Maryland Institute for Advanced Computer Studies Department of Computer Science College Park TR{95{91 TR{3533 The Triangular Matrices of Gaussian Elimination and Related Decompositions G.

### 1 Determinants. Definition 1

Determinants The determinant of a square matrix is a value in R assigned to the matrix, it characterizes matrices which are invertible (det 0) and is related to the volume of a parallelpiped described

### ( % . This matrix consists of \$ 4 5 " 5' the coefficients of the variables as they appear in the original system. The augmented 3 " 2 2 # 2 " 3 4&

Matrices define matrix We will use matrices to help us solve systems of equations. A matrix is a rectangular array of numbers enclosed in parentheses or brackets. In linear algebra, matrices are important

### We seek a factorization of a square matrix A into the product of two matrices which yields an

LU Decompositions We seek a factorization of a square matrix A into the product of two matrices which yields an efficient method for solving the system where A is the coefficient matrix, x is our variable

### 2.1: Determinants by Cofactor Expansion. Math 214 Chapter 2 Notes and Homework. Evaluate a Determinant by Expanding by Cofactors

2.1: Determinants by Cofactor Expansion Math 214 Chapter 2 Notes and Homework Determinants The minor M ij of the entry a ij is the determinant of the submatrix obtained from deleting the i th row and the

### Inner Product Spaces and Orthogonality

Inner Product Spaces and Orthogonality week 3-4 Fall 2006 Dot product of R n The inner product or dot product of R n is a function, defined by u, v a b + a 2 b 2 + + a n b n for u a, a 2,, a n T, v b,

### Factorization Theorems

Chapter 7 Factorization Theorems This chapter highlights a few of the many factorization theorems for matrices While some factorization results are relatively direct, others are iterative While some factorization

### Chapter 8. Matrices II: inverses. 8.1 What is an inverse?

Chapter 8 Matrices II: inverses We have learnt how to add subtract and multiply matrices but we have not defined division. The reason is that in general it cannot always be defined. In this chapter, we

### Topic 1: Matrices and Systems of Linear Equations.

Topic 1: Matrices and Systems of Linear Equations Let us start with a review of some linear algebra concepts we have already learned, such as matrices, determinants, etc Also, we shall review the method

### Basics Inversion and related concepts Random vectors Matrix calculus. Matrix algebra. Patrick Breheny. January 20

Matrix algebra January 20 Introduction Basics The mathematics of multiple regression revolves around ordering and keeping track of large arrays of numbers and solving systems of equations The mathematical

### 6 Gaussian Elimination

G1BINM Introduction to Numerical Methods 6 1 6 Gaussian Elimination 61 Simultaneous linear equations Consider the system of linear equations a 11 x 1 + a 12 x 2 + + a 1n x n = b 1 a 21 x 1 + a 22 x 2 +

### Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max

### BASIC THEORY AND APPLICATIONS OF THE JORDAN CANONICAL FORM

BASIC THEORY AND APPLICATIONS OF THE JORDAN CANONICAL FORM JORDAN BELL Abstract. This paper gives a basic introduction to the Jordan canonical form and its applications. It looks at the Jordan canonical

### Solving Linear Systems, Continued and The Inverse of a Matrix

, Continued and The of a Matrix Calculus III Summer 2013, Session II Monday, July 15, 2013 Agenda 1. The rank of a matrix 2. The inverse of a square matrix Gaussian Gaussian solves a linear system by reducing

### Elementary Matrices and The LU Factorization

lementary Matrices and The LU Factorization Definition: ny matrix obtained by performing a single elementary row operation (RO) on the identity (unit) matrix is called an elementary matrix. There are three

### Diagonalisation. Chapter 3. Introduction. Eigenvalues and eigenvectors. Reading. Definitions

Chapter 3 Diagonalisation Eigenvalues and eigenvectors, diagonalisation of a matrix, orthogonal diagonalisation fo symmetric matrices Reading As in the previous chapter, there is no specific essential

### Floating Point Operations in Matrix-Vector Calculus

Floating Point Operations in Matrix-Vector Calculus (Version 1.3) Raphael Hunger Technical Report 007 Technische Universität München Associate Institute for Signal Processing Univ.-Prof. Dr.-Ing. Wolfgang

### December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B. KITCHENS

December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B KITCHENS The equation 1 Lines in two-dimensional space (1) 2x y = 3 describes a line in two-dimensional space The coefficients of x and y in the equation

### Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby

### Solving a System of Equations

11 Solving a System of Equations 11-1 Introduction The previous chapter has shown how to solve an algebraic equation with one variable. However, sometimes there is more than one unknown that must be determined

### Direct Methods for Solving Linear Systems. Linear Systems of Equations

Direct Methods for Solving Linear Systems Linear Systems of Equations Numerical Analysis (9th Edition) R L Burden & J D Faires Beamer Presentation Slides prepared by John Carroll Dublin City University

### Matrices provide a compact notation for expressing systems of equations or variables. For instance, a linear function might be written as: b k

MATRIX ALGEBRA FOR STATISTICS: PART 1 Matrices provide a compact notation for expressing systems of equations or variables. For instance, a linear function might be written as: y = x 1 b 1 + x 2 + x 3

### Matrices, Determinants and Linear Systems

September 21, 2014 Matrices A matrix A m n is an array of numbers in rows and columns a 11 a 12 a 1n r 1 a 21 a 22 a 2n r 2....... a m1 a m2 a mn r m c 1 c 2 c n We say that the dimension of A is m n (we

### Iterative Methods for Computing Eigenvalues and Eigenvectors

The Waterloo Mathematics Review 9 Iterative Methods for Computing Eigenvalues and Eigenvectors Maysum Panju University of Waterloo mhpanju@math.uwaterloo.ca Abstract: We examine some numerical iterative

### Matrix Inverse and Determinants

DM554 Linear and Integer Programming Lecture 5 and Marco Chiarandini Department of Mathematics & Computer Science University of Southern Denmark Outline 1 2 3 4 and Cramer s rule 2 Outline 1 2 3 4 and

### Explicit inverses of some tridiagonal matrices

Linear Algebra and its Applications 325 (2001) 7 21 wwwelseviercom/locate/laa Explicit inverses of some tridiagonal matrices CM da Fonseca, J Petronilho Depto de Matematica, Faculdade de Ciencias e Technologia,

### 9. Numerical linear algebra background

Convex Optimization Boyd & Vandenberghe 9. Numerical linear algebra background matrix structure and algorithm complexity solving linear equations with factored matrices LU, Cholesky, LDL T factorization

### Linearly Independent Sets and Linearly Dependent Sets

These notes closely follow the presentation of the material given in David C. Lay s textbook Linear Algebra and its Applications (3rd edition). These notes are intended primarily for in-class presentation

### Estimating an ARMA Process

Statistics 910, #12 1 Overview Estimating an ARMA Process 1. Main ideas 2. Fitting autoregressions 3. Fitting with moving average components 4. Standard errors 5. Examples 6. Appendix: Simple estimators

### Chap 4 The Simplex Method

The Essence of the Simplex Method Recall the Wyndor problem Max Z = 3x 1 + 5x 2 S.T. x 1 4 2x 2 12 3x 1 + 2x 2 18 x 1, x 2 0 Chap 4 The Simplex Method 8 corner point solutions. 5 out of them are CPF solutions.

### 4 Solving Systems of Equations by Reducing Matrices

Math 15 Sec S0601/S060 4 Solving Systems of Equations by Reducing Matrices 4.1 Introduction One of the main applications of matrix methods is the solution of systems of linear equations. Consider for example

### Inverses. Stephen Boyd. EE103 Stanford University. October 27, 2015

Inverses Stephen Boyd EE103 Stanford University October 27, 2015 Outline Left and right inverses Inverse Solving linear equations Examples Pseudo-inverse Left and right inverses 2 Left inverses a number

### LECTURE 1 I. Inverse matrices We return now to the problem of solving linear equations. Recall that we are trying to find x such that IA = A

LECTURE I. Inverse matrices We return now to the problem of solving linear equations. Recall that we are trying to find such that A = y. Recall: there is a matri I such that for all R n. It follows that

### Lecture 6. Inverse of Matrix

Lecture 6 Inverse of Matrix Recall that any linear system can be written as a matrix equation In one dimension case, ie, A is 1 1, then can be easily solved as A x b Ax b x b A 1 A b A 1 b provided that

### Direct Methods for Solving Linear Systems. Matrix Factorization

Direct Methods for Solving Linear Systems Matrix Factorization Numerical Analysis (9th Edition) R L Burden & J D Faires Beamer Presentation Slides prepared by John Carroll Dublin City University c 2011

### A Introduction to Matrix Algebra and Principal Components Analysis

A Introduction to Matrix Algebra and Principal Components Analysis Multivariate Methods in Education ERSH 8350 Lecture #2 August 24, 2011 ERSH 8350: Lecture 2 Today s Class An introduction to matrix algebra

### ALGEBRAIC EIGENVALUE PROBLEM

ALGEBRAIC EIGENVALUE PROBLEM BY J. H. WILKINSON, M.A. (Cantab.), Sc.D. Technische Universes! Dsrmstedt FACHBEREICH (NFORMATiK BIBL1OTHEK Sachgebieto:. Standort: CLARENDON PRESS OXFORD 1965 Contents 1.

### Step 1: Set the equation equal to zero if the function lacks. Step 2: Subtract the constant term from both sides:

In most situations the quadratic equations such as: x 2 + 8x + 5, can be solved (factored) through the quadratic formula if factoring it out seems too hard. However, some of these problems may be solved

### Diagonal, Symmetric and Triangular Matrices

Contents 1 Diagonal, Symmetric Triangular Matrices 2 Diagonal Matrices 2.1 Products, Powers Inverses of Diagonal Matrices 2.1.1 Theorem (Powers of Matrices) 2.2 Multiplying Matrices on the Left Right by

### F Matrix Calculus F 1

F Matrix Calculus F 1 Appendix F: MATRIX CALCULUS TABLE OF CONTENTS Page F1 Introduction F 3 F2 The Derivatives of Vector Functions F 3 F21 Derivative of Vector with Respect to Vector F 3 F22 Derivative

### Definition A square matrix M is invertible (or nonsingular) if there exists a matrix M 1 such that

0. Inverse Matrix Definition A square matrix M is invertible (or nonsingular) if there exists a matrix M such that M M = I = M M. Inverse of a 2 2 Matrix Let M and N be the matrices: a b d b M =, N = c

### 1.5 Elementary Matrices and a Method for Finding the Inverse

.5 Elementary Matrices and a Method for Finding the Inverse Definition A n n matrix is called an elementary matrix if it can be obtained from I n by performing a single elementary row operation Reminder:

### Numerical Analysis. Gordon K. Smyth in. Encyclopedia of Biostatistics (ISBN 0471 975761) Edited by. Peter Armitage and Theodore Colton

Numerical Analysis Gordon K. Smyth in Encyclopedia of Biostatistics (ISBN 0471 975761) Edited by Peter Armitage and Theodore Colton John Wiley & Sons, Ltd, Chichester, 1998 Numerical Analysis Numerical

### Notes for STA 437/1005 Methods for Multivariate Data

Notes for STA 437/1005 Methods for Multivariate Data Radford M. Neal, 26 November 2010 Random Vectors Notation: Let X be a random vector with p elements, so that X = [X 1,..., X p ], where denotes transpose.

### (a) The transpose of a lower triangular matrix is upper triangular, and the transpose of an upper triangular matrix is lower triangular.

Theorem.7.: (Properties of Triangular Matrices) (a) The transpose of a lower triangular matrix is upper triangular, and the transpose of an upper triangular matrix is lower triangular. (b) The product

### We know a formula for and some properties of the determinant. Now we see how the determinant can be used.

Cramer s rule, inverse matrix, and volume We know a formula for and some properties of the determinant. Now we see how the determinant can be used. Formula for A We know: a b d b =. c d ad bc c a Can we

### MATH 2030: SYSTEMS OF LINEAR EQUATIONS. ax + by + cz = d. )z = e. while these equations are not linear: xy z = 2, x x = 0,

MATH 23: SYSTEMS OF LINEAR EQUATIONS Systems of Linear Equations In the plane R 2 the general form of the equation of a line is ax + by = c and that the general equation of a plane in R 3 will be we call

### Using the Singular Value Decomposition

Using the Singular Value Decomposition Emmett J. Ientilucci Chester F. Carlson Center for Imaging Science Rochester Institute of Technology emmett@cis.rit.edu May 9, 003 Abstract This report introduces

### Hypothesis Testing in the Classical Regression Model

LECTURE 5 Hypothesis Testing in the Classical Regression Model The Normal Distribution and the Sampling Distributions It is often appropriate to assume that the elements of the disturbance vector ε within

### Mathematics Notes for Class 12 chapter 3. Matrices

1 P a g e Mathematics Notes for Class 12 chapter 3. Matrices A matrix is a rectangular arrangement of numbers (real or complex) which may be represented as matrix is enclosed by [ ] or ( ) or Compact form

### Row Echelon Form and Reduced Row Echelon Form

These notes closely follow the presentation of the material given in David C Lay s textbook Linear Algebra and its Applications (3rd edition) These notes are intended primarily for in-class presentation

### The Moore-Penrose Inverse and Least Squares

University of Puget Sound MATH 420: Advanced Topics in Linear Algebra The Moore-Penrose Inverse and Least Squares April 16, 2014 Creative Commons License c 2014 Permission is granted to others to copy,

### Statistical Machine Learning

Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

### 9 Matrices, determinants, inverse matrix, Cramer s Rule

AAC - Business Mathematics I Lecture #9, December 15, 2007 Katarína Kálovcová 9 Matrices, determinants, inverse matrix, Cramer s Rule Basic properties of matrices: Example: Addition properties: Associative:

### 2.5 Elementary Row Operations and the Determinant

2.5 Elementary Row Operations and the Determinant Recall: Let A be a 2 2 matrtix : A = a b. The determinant of A, denoted by det(a) c d or A, is the number ad bc. So for example if A = 2 4, det(a) = 2(5)

### 8 Square matrices continued: Determinants

8 Square matrices continued: Determinants 8. Introduction Determinants give us important information about square matrices, and, as we ll soon see, are essential for the computation of eigenvalues. You

### General Framework for an Iterative Solution of Ax b. Jacobi s Method

2.6 Iterative Solutions of Linear Systems 143 2.6 Iterative Solutions of Linear Systems Consistent linear systems in real life are solved in one of two ways: by direct calculation (using a matrix factorization,

### Math 2331 Linear Algebra

2.2 The Inverse of a Matrix Math 2331 Linear Algebra 2.2 The Inverse of a Matrix Jiwen He Department of Mathematics, University of Houston jiwenhe@math.uh.edu math.uh.edu/ jiwenhe/math2331 Jiwen He, University

### 2.5 Gaussian Elimination

page 150 150 CHAPTER 2 Matrices and Systems of Linear Equations 37 10 the linear algebra package of Maple, the three elementary 20 23 1 row operations are 12 1 swaprow(a,i,j): permute rows i and j 3 3

### 11. Time series and dynamic linear models

11. Time series and dynamic linear models Objective To introduce the Bayesian approach to the modeling and forecasting of time series. Recommended reading West, M. and Harrison, J. (1997). models, (2 nd

### 4. MATRICES Matrices

4. MATRICES 170 4. Matrices 4.1. Definitions. Definition 4.1.1. A matrix is a rectangular array of numbers. A matrix with m rows and n columns is said to have dimension m n and may be represented as follows:

### The Inverse of a Matrix

The Inverse of a Matrix 7.4 Introduction In number arithmetic every number a ( 0) has a reciprocal b written as a or such that a ba = ab =. Some, but not all, square matrices have inverses. If a square

### Introduction to Flocking {Stochastic Matrices}

Supelec EECI Graduate School in Control Introduction to Flocking {Stochastic Matrices} A. S. Morse Yale University Gif sur - Yvette May 21, 2012 CRAIG REYNOLDS - 1987 BOIDS The Lion King CRAIG REYNOLDS

### Syntax Description Remarks and examples Also see

Title stata.com permutation An aside on permutation matrices and vectors Syntax Description Remarks and examples Also see Syntax Permutation matrix Permutation vector Action notation notation permute rows

### Description Syntax Remarks and examples Also see

Title stata.com permutation An aside on permutation matrices and vectors Description Syntax Remarks and examples Also see Description Permutation matrices are a special kind of orthogonal matrix that,

### Notes on Determinant

ENGG2012B Advanced Engineering Mathematics Notes on Determinant Lecturer: Kenneth Shum Lecture 9-18/02/2013 The determinant of a system of linear equations determines whether the solution is unique, without

### MATH 304 Linear Algebra Lecture 8: Inverse matrix (continued). Elementary matrices. Transpose of a matrix.

MATH 304 Linear Algebra Lecture 8: Inverse matrix (continued). Elementary matrices. Transpose of a matrix. Inverse matrix Definition. Let A be an n n matrix. The inverse of A is an n n matrix, denoted

### 1 Introduction to Matrices

1 Introduction to Matrices In this section, important definitions and results from matrix algebra that are useful in regression analysis are introduced. While all statements below regarding the columns

### MATH 240 Fall, Chapter 1: Linear Equations and Matrices

MATH 240 Fall, 2007 Chapter Summaries for Kolman / Hill, Elementary Linear Algebra, 9th Ed. written by Prof. J. Beachy Sections 1.1 1.5, 2.1 2.3, 4.2 4.9, 3.1 3.5, 5.3 5.5, 6.1 6.3, 6.5, 7.1 7.3 DEFINITIONS

### SOLVING LINEAR SYSTEMS

SOLVING LINEAR SYSTEMS Linear systems Ax = b occur widely in applied mathematics They occur as direct formulations of real world problems; but more often, they occur as a part of the numerical analysis