Some Properties Of the Gaussian Distribution

Size: px
Start display at page:

Download "Some Properties Of the Gaussian Distribution"

Transcription

1 Some Properties Of the Gaussian Distribution Jianin Wu GVU Center and College of Computing Georgia Institute of Technology April, 004 Contents Introduction De nition. Univariate Gaussian Multivariate Gaussian Notation and Parameterization 4 4 Linear Operation and Summation 5 4. Univariate case Multivariate Case Geometry and Mahalanobis Distance 6 6 Conditioning 7 7 Product of Gaussians 9 8 Application I: Parameter Estimation 0 8. Maimum Likelihood Estimation Bayesian Parameter Estimation Application II: Kalman Filter 9. The Model The Estimation A Gaussian Integral 4 B Characteristic Functions 5 C Schur Complement and the Matri Inversion Lemma 6

2 D Vector and Matri Derivatives 7 Introduction The Gaussian distribution is the most widely used probability distribution in statistical pattern recognition and machine learning. The nice properties of the Gaussian distribution might be the main reason for its popularity. In this short paper, I try to organize the basic facts about the Gaussian distribution. There is no advanced theory in this paper. However, in order to understand these facts, some linear algebra and multivariate analysis are needed, which are not always covered su ciently in undergraduate tets. The attempt of this paper is to pool these facts together, and hope that it will be useful for new researchers entering this area. De nition. Univariate Gaussian The probability density function of a univariate Gaussian distribution has the following form:! p () p ( ), () in which is the ected value of, and is the variance. We assume that > 0. We have to rst verify that eq. () is a valid density. It is obvious that p () 0 always holds for R. From eq. (96) in Appendi A we know that R t d p t. Applying this equation, we have p () d p which means that p () is a valid density. The density p p! ( ) d () d (3) p p, (4) is called the standard normal density. In Appendi A, it is showed that the mean value and standard deviation of the standard normal distribution are 0 and respectively. By doing a change of variables, it is easy to show that R p () d and R ( ) p () d. I planned to write a short note listing some properties of Gaussian. However, somehow I decided to keep this paper self-containing. The result is that it becomes very fat.

3 . Multivariate Gaussian The probability density function of a multivariate Gaussian distribution has the following form: p () () d jj ( )T ( ), (5) in which is a d-dimensional vector, is the d-dimensional mean vector, and is the d-by-d covariance matri. We assume that is a symmetric, positive de nite matri. We have to rst verify that eq. (5) is a valid probability density function. It is obvious that p () 0 always holds for R d. Net we diagonalize as U T U in which U is an orthogonal matri containing the eigenvectors of, [ ; : : : ; d ] is a diagonal matri containing the eigenvalues of in its diagonal entries and jj jj. Let s de ne a new random vector as y U ( ). (6) If we treat eq. (6) as a change of variables, the determinant of the Jacobian matri will be jj. Now we are ready to calculate the integral p () d () d jj ( )T ( ) d (7) () d jj jj yt y dy (8) dy y p i dy i (9) i dy (0) i in which y i is the ith component of y, i.e. y (y ; : : : ; y d ). This equation gives the validity of the multivariate Gaussian density. Since y is a random vector, it has a density, denoted as p y (y). Using the inverse transform method, we get p y (y) p + U T y U T () U T () d jj () d yt y U T y T U T y () (3) The density de ned by p y (y) d () yt y. (4) 3

4 is called a spherical Gaussian distribution. Let z be a random vector formed by a subset of the components of y. By marginalization it isclear that p (z). Using this fact, y i () jzj zt z, and speci cally p (y i ) p it is straightforward to show that the mean vector and covariance matri of a spherical Gaussian are 0 and I respectively. Using the inverse transform of eq. (6), we can easily calculate the mean vector and covariance matri of the density p (). E E + U T y + E U T y (5) T E ( ) ( ) T E U T y U T y (6) U T E yy T U (7) U T U (8) (9) 3 Notation and Parameterization When we have a density of the form in eq. (5), it is often written as or, N (; ) (0) N (; ; ) () In most cases we will use the mean vector and covariance matri to ress a Gaussian density. This is called the moment parameterization. There is another parameterization of a Gaussian density, called the canonical parameterization. In the canonical parameterization, a Gaussian density is ressed as p () + T T, () in which d log log jj + T is a normalization constant. The parameters in these two representations are related by (3) (4) (5). (6) Notice that there is a confusion in our notation: has di erent meanings in eq. () and eq. (6). In eq. (), is a parameter in the canonical parameterization of a Gaussian density, which is not necessarily diagonal. In eq. (6), is a diagonal matri formed by the eigenvalues of. It is straightforward to show that the moment parameterization and canonical parameterization of the Gaussian distribution are equivalent. In some cases the canonical parameterization is more convenient to use than the moment parameterization. 4

5 4 Linear Operation and Summation 4. Univariate case Suppose N ; and N ; are two independent univariate Gaussian densities. It is obvious that a + b N a + b; a, in which a and b are two scalars. Now consider a random variable z +. The density of z is calculated as: p z (z) p ( ) p ( ) d d (7) z + p ( + ) p ( + ) d 0 d 0 (8) 0 z 0 +0 p ( 0 + ) p (z 0 ) d 0 (9)! (z ) +! (z ) d (30) 0 s (z ) + +! p p (z ) + + (z ) + + C A d (3) (3), (33) in which the step from eq. (3) to eq. (3) used the result of eq. (96). The sum of two univariate Gaussian random variables is again a Gaussian random variable, with the mean value and variance summed up respectively, i.e. z N + ; +. The summation rule is easily generalized to n Gaussians. 4. Multivariate Case Suppose N ( ; ) is a d-dimensional Gaussian density, A is a q-by-d matri and b is a q-dimensional vector, then z A + b is a q-dimensional Gaussian density: z N A + b; AA T. This fact is proved using the characteristic function tool (see Appendi B). 5

6 The characteristic function of z is z (t) E z it T z (34) E it T (A + b) (35) it T b E h i A T t i T (36) it T b i A T t T A T t T A T t (37) it T (A + b) t T AA T t, (38) in which the step from eq. (37) to eq. (38) used the eq. (07) in Appendi B. Appendi B states that if a characteristic function (t) is of the form it T t T t, then the underlying density p () is a Gaussian with mean vector and covariance matri. Applying this fact to eq. (38), we immediately get z N A + b; AA T : (39) Suppose N ( ; ) and N ( ; ) are two independent d- dimensional Gaussian densities, and de ne a new random vector z + y. We can calculate the probability density function p (z) using the same method as we used in the univariate case. However, the calculation is comple and we have to apply the matri inversion lemma in Appendi C. Characteristic function simpli es the calculation. Using eq. (0) in Appendi B, we get z (t) (t) y (t) (40) it T t T t it T t T t (4) it T ( + ) t T ( + ) t, (4) which immediately gives us z N ( + ; + ). The summation of two multivariate Gaussian random variables is as easy to compute as in the univariate case: sum up the mean vectors and covariance matrices. The rule is same for summing up several multivariate Gaussians. Now we have the tool of linear transformation and let s revisit eq. (6). For convenience we retype the equation here: N (; ), and y U ( ). (43) Using the properties of linear transformations on a Gaussian density, y is indeed a Gaussian (in section. we painfully calculate p (y) using the inverse transform method), and has mean vector 0 and covariance matri I. The transformation of applying eq. (6) is called the whitening transformation since the transformed density has an identity covariance matri. 5 Geometry and Mahalanobis Distance Figure shows a bivariate Gaussian density function. Gaussian density has only one mode, which is the mean vector, and the shape of the density is determined 6

7 Figure : Bivariate Gaussian density by the covariance matri. Figure showed the equal probability contour of a bivariate Gaussian density. All points on a given equal probability contour must have the following term evaluated to a constant value: r (; ) ( ) T ( ) c (44) r (; ) is called the Mahalanobis distance from to, given the covariance matri. Eq. (44) de nes a hyperellipsoid in d-dimensional, which means that the equal probability contour of a Gaussian density is a hyperellipsoid in d-dimensional. The principle aes of this hyperellipsoid are given by the eigenvectors of, and the lengths of these aes are proportional to square root of the eigenvalues associated with these eigenvectors. 6 Conditioning Suppose and are two multivariate Gaussian densities, which have a joint density function p () (d+d) jj T!, in which d and d are the dimensionality of and respectively, and. The matrices and are covariance matrices between and, and satisfying T. 7

8 Figure : Equal probability contour of a bivariate Gaussian density The marginal distributions N ( ; ) and N ( ; ) are easy to get from the joint distribution. We are interested in computing the conditional probability density p ( j ). We will need to compute the inverse of, and this task is completed by using the Schur complement (see Appendi C). For notational simplicity, we denote the Schur complement of as S, de ned as S. Similarly, the Schur complement of is S. Applying eq. (0) and noticing that T, we get (writing as 0, and as 0 ) S S T S + T S (45) and T 0T S 0 + 0T T S Thus we can split the joint distribution as p () d js () d j j j T 0 T S 0 + 0T 0. (46) T S 0 + 0! (47) in which we used the fact that jj j j js j (from eq. (9) in Appendi C). 8

9 Since the second term in the right hand side of eq. (47) is the marginal p ( ) and p ( ; ) p ( j ) p ( ), we now get the conditional probability p ( j ) as 0 + T p ( j ) 0 S 0 +! 0, () d js j (48) or j N + 0 ; S (49) N + ( ) ; (50) 7 Product of Gaussians Suppose p () N (; ; ) and p () N (; ; ) are two independent d-dimensional Gaussian densities. Sometimes we want to compute the density which is proportional to the product of the two Gaussian densities, i.e. p () p () p (), in which is a proper normalization constant to make p (z) a valid density function. In this task the canonical notation (see section 3) will be etremely helpful. Writing the two Gaussians in canonical form p () + T (5) p () the density p (z) is easy to compute, as + T p (z) p () p () 0 + ( + ) T T T, (5) T ( + ). (53) This equation states that in the canonical parameterization, in order to compute product of Gaussians, we just sum the parameters. This result is readily etended to the product of n Gaussians. Suppose we have n Gaussian distributions p i (), whose parameters in the canonical parameterization are i and i, i ; ; : : : ; n. Then p () n i p i () is also a Gaussian density, given by p () 0 + ( n i i ) T T ( n i i ). (54) Now let s go back to the moment parameterization. Suppose we have n Gaussian distributions p i (), in which p i () N (; i ; i ), i ; ; : : : ; n. Then p () n i p i () is Gaussian, p () N (; ; ) (55) 9

10 where + + : : : + n (56) + + : : : + n n (57) Now we have listed some properties of the Gaussian distribution. Net let s see how these properties are applied. 8 Application I: Parameter Estimation 8. Maimum Likelihood Estimation Let us suppose that we have a d-dimensional multivariate Gaussian density N (; ), and n i.i.d (independently, identically distributed) samples D f ; ; : : : ; n g sampled from this distribution. The task is to estimate the parameters and. The log-likelihood function of observing the data set D given parameters and is: l (; jd) (58) ny log p ( i ) (59) i nd log () + n log nx ( i ) T ( i ). (60) Taking derivative of the likelihood function with respect to and gives (see Appendi D): X ( i ) n nx ( i ) ( i ) T, (6) i in which eq. (6) used eq. (5) and the chain rule, and eq. (6) used eq. (33), eq. (34) and the fact that T. The notation in eq. (6) is a little bit confusing. There are two in the right hand side, the rst represents a summation and the second represents the covariance matri. In order to nd the maimum likelihood solution, we want to nd the maimum of the likelihood function. Setting both eq. (6) and eq. (6) to 0 gives us the solution: ML nx i (63) n ML n i nx ( i ML ) ( i ML ) T (64) i 0

11 Eq. (63) and eq. (64) clearly states that the maimum likelihood estimation of the mean vector and covariance matri are just the sample mean and sample covariance matri respectively. 8. Bayesian Parameter Estimation In Bayesian estimation, we assume that the covariance matri is known. Let us suppose that we have a d-dimensional multivariate Gaussian density N (; ), and n i.i.d samples D f ; ; : : : ; n g sampled from this distribution. We also need a prior on the parameter. Let s assume that the prior is that N ( 0 ; 0 ). The task is to estimate the parameters. Note that we assume 0, 0, and are all known. The only parameter to be estimated is the mean vector. In Bayesian estimation, instead of nd a point ^ in the parameter space that gives maimum likelihood, we calculate the posterior density for the parameter p (jd), and use the entire distribution of as our estimation for this parameter. Applying the Bayes law, p (jd) p (Dj) p () (65) ny p ( i ) p 0 () (66) i in which is a normalization constant which does not depend on. Apply the result in section 7, we know that p (jd) is also a Gaussian, and where p (jd) N (; n ; n ) (67) n n + 0 (68) n n n (69) Both n and n can be calculated from know parameters. So we have determined the posterior distribution p (jd) for given the data set D. We choose a Gaussian to be the prior family. Usually, the prior distribution is chosen such that the posterior belongs to the same functional form as the prior. A prior and posterior chosen in this way are said to be conjugate. We have seen that Gaussian have the nice property that both the prior and posterior are Gaussian, i.e. Gaussian is auto-conjugate. After p (jd) is determined, a new sample is classi ed by calculating the probability p (jd) p (j) p (jd) d. (70) Eq. (70) and eq. (7) has the same form. Thus we can guess that p (jd) is a Gaussian again, and p (jd) N (; n ; + n ). (7)

12 The guess is correct, and is easy to verify it by repeating the steps in eq.(7) through eq. (33). 9 Application II: Kalman Filter 9. The Model The Kalman lter address the problem of estimating a state vector in a discrete time process, given a linear dynamic model and a linear measurement model k A k + w k, (7) z k H k + v k. (73) The process noise w k and measurement noise v k are assumed to be Gaussian: w N (0; Q) (74) v N (0; R) (75) At time k, assume we know the distribution of k, the task is to estimate the posterior probability of k at time k, given the current observation z k and the previous state estimation p ( k ). In a broader point of view, the task can be formulated as estimating the posterior probability of k at time k, given all the previous state estimates and all the observations up to time step k. Under certain Markovian assumption, it is not hard to prove that the two problem formulations are equivalent. In the Kalman lter setup, we assume the prior is Gaussian, i.e. at time t 0, p ( 0 ) N (; 0 ; P 0 ). Instead of ; here we use P to represent a covariance matri, in order to match the notations in the Kalman lter literature. 9. The Estimation I will show that, with the help of the properties of Gaussians we have obtained, it is quite easy to derive the Kalman lter equations. The Kalman lter can be separated in two (related) steps. In the rst step, based on the estimation p ( k ) and the dynamic model (7), we get an estimate p k. Note that the minus sign means that this estimation is done before we take into account the measurement. In the second step, based on p k and the measurement model (73), we get the nal estimation p (k ). First, let s estimate p k. Assume that at time k, the estimation we already got is a Gaussian p ( k ) N ; k ; P k : (76)

13 This assumption coincides well with the prior p ( 0 ). We will show that, under this assumption, after the Kalman lter update, p ( k ) will also become a Gaussian, and this makes the assumption reasonable. Applying the linear operation equation (39) on the dynamic model (7), we immediately get the estimation for k : k N k ; P k (77) k A k (78) P k AP k A T + Q (79) The estimate p k conditioned on the observation zk gives p ( k ), the estimation we want. Thus the conditioning property (50) can be used. In order to use eq. (50), we have to build the joint covariance matri rst. Since Cov (z k ) HP k H T + R (applying eq. (39) to eq. (73)) and Cov z k ; k Cov H k + v k ; k (80) Cov H k ; k (8) HP k, (8) the joint covariance matri of k ; z k is: Pk P k H T HP k HP k H T + R. (83) Applying the conditioning equation (50), we get p ( k ) p k jz k (84) N ( k ; P k ) (85) P k P k P k H T HP k H T + R HPk (86) k k + P k H T HP k H T + R zk H k (87) The equations (7779) and (8487) are the Kalman lter updating rules. The term P k H T HP k H T + R appears in both eq. (86) and eq. (87). De ning K k P k H T HP k H T + R, (88) the equations are simpli ed as P k (I K k H) P k (89) k k + K k z k H k. (90) The term K k is called the Kalman gain matri, and the term z k the innovation. H k is called 3

14 A Gaussian Integral We will compute the integral of the univariate Gaussian density in this section. The trick in doing this integration is to consider two independent univariate gaussians at one time. e d s e d e y dy (9) s e ( +y ) ddy (9) s 0 0 re r drd (93) r [ e r ] 0 (94) p, (95) in eq. (93) the polar coordinates are used, and the etra r appeared inside the integral is the determinant of the Jacobian matri. The above integral can be easily etend as f (t) d p t (96) t in which we assume t > 0. Then we have df d dt dt t t t and d (97) d (98) df dt d r p t dt t. (99) The above two equations give us r d t t t. (00) Applying eq. (00), we have p d p r 4 (0) 4

15 It is obvious that since p is an odd function. d 0 (0) Eq. (0) and eq. (0) have proved that the mean and standard deviation of a standard normal distribution are 0 and respectively. B Characteristic Functions The characteristic function of a probability density p () is de ned as its Fourier transform (t) E it T (03) in which i p. Let s computer the characteristic function of a Gaussian density. (t) E it T (04) () d jj ( )T ( ) + it T d (05) it T t T t it T it d (06) () d jj it T t T t (07) Since the characteristic function is de ned as a Fourier transform, the inverse Fourier transform of (t) will be eactly p (), i.e. a density is completely determined by its characteristic function. When we see a characteristic function (t) is of the form it T t T t, we know that the underlying density is a Gaussian with mean vector and covariance matri. Suppose and y are two independent random vectors with the same dimension, and de ne a new random vector z + y. Then p z (z) p () p y (y) ddy (08) z+y p () p y (z ) d. (09) Since convolution in the function space is a product in the Fourier space, we have z (z) () y (y), (0) which means that the characteristic function of the sum of two independent random variables is just product of the characteristic functions of the summands. 5

16 C Schur Complement and the Matri Inversion Lemma The Schur complement is very useful in computing the inverse of a block matri. Suppose M is a block matri ressed as A B M, () C D in which A and D are non-singular square matrices. We want to compute M. Some algebraic manipulations give I 0 I A CA M B () I 0 I I 0 A B I A B CA (3) I C D 0 I A B I A B 0 D CA (4) B 0 I A 0 A 0 0 D CA, (5) B 0 S A in which the term D CA B is called the Schur complement of A, denoted as S A. Equation XMY implies that M Y X. Hence we have M I A B A 0 0 I 0 S A A A BS A I 0 0 S A CA I A + A BS A CA A BS A S A CA S A Taking the determinant of both sides of eq. (6), it gives I 0 CA I (6) (7) (8) jmj jaj js A j. (9) We can also compute M by using the Schur complement of D, in the following way: M S D S D BD D CS D D + D CS D BD (0) jmj jdj js D j. () Eq. (8) and eq. (0) are two di erent representations of the same matri M, which means the corresponding blocks in these two equations must be 6

17 equal, e.g. S D A + A BS A CA. This result is known as the matri inversion lemma: S D A BD C A + A B D CA B CA. () The following result, which comes from equating the upper right block is also useful: A B D CA B A BD C BD. (3) This formula and the matri inversion lemma are useful in the derivation of the Kalman lter equations. D Vector and Matri Derivatives Suppose y is a scalar, A is a matri, and and y are vectors. The partial derivative of y with respect to A is de @a ij where a ij the i; j-th component of the matri A. From the de nition (4), it is easy to get the following yt y (5) For a square matri A that is n-by-n, the matri M ij de ned by removing from A the i-th row and j-th column is called a minor of A. The scalar c ij ( ) i+j M ij is called a cofactor of A. The matri A cof with c ij in its i; j-th entry is called the cofactor matri of A. Finally, the adjoint matri of A is de ned as transpose of the cofactor matri A adj A T cof. (6) There are some well-known facts about the minors, determinant, and adjoint of a matri: jaj X j a ij c ij (7) A jaj A adj. (8) Since M ij has removed the i-th row, it does not depend on a ij, neither does c ij. Thus, we ij c ij, jaj A cof (30) 7

18 which in turn shows that A, A T adj (3) jaj A T. (3) Using the chain rule, we immediately get that for a positive de log jaj A T. (33) Applying the de nition (4), it is easy to show that for a square T A T, (34) since T A i j a ij i j where ( ; ; : : : ; n ). References [] C. M. Bishop. Neural Networks for Pattern Recognition, Oford University Press, Oford, UK, 996. [] R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classi cation, second edition, Wiley-Interscience, New York, NY, USA, 00. [3] [4] M. I. Jordan. An Introduction to Probabilistic Graphical Models, chapter 3, draft. [5] G. Welch and G. Bishop. An Introduction to the Kalman Filter, TR 95-04, Department of Computer Science, University of North Carolina at Chapel Hill. 8

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS Systems of Equations and Matrices Representation of a linear system The general system of m equations in n unknowns can be written a x + a 2 x 2 + + a n x n b a

More information

Introduction to Matrix Algebra

Introduction to Matrix Algebra Psychology 7291: Multivariate Statistics (Carey) 8/27/98 Matrix Algebra - 1 Introduction to Matrix Algebra Definitions: A matrix is a collection of numbers ordered by rows and columns. It is customary

More information

Solution to Homework 2

Solution to Homework 2 Solution to Homework 2 Olena Bormashenko September 23, 2011 Section 1.4: 1(a)(b)(i)(k), 4, 5, 14; Section 1.5: 1(a)(b)(c)(d)(e)(n), 2(a)(c), 13, 16, 17, 18, 27 Section 1.4 1. Compute the following, if

More information

The Characteristic Polynomial

The Characteristic Polynomial Physics 116A Winter 2011 The Characteristic Polynomial 1 Coefficients of the characteristic polynomial Consider the eigenvalue problem for an n n matrix A, A v = λ v, v 0 (1) The solution to this problem

More information

15.062 Data Mining: Algorithms and Applications Matrix Math Review

15.062 Data Mining: Algorithms and Applications Matrix Math Review .6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

More information

Component Ordering in Independent Component Analysis Based on Data Power

Component Ordering in Independent Component Analysis Based on Data Power Component Ordering in Independent Component Analysis Based on Data Power Anne Hendrikse Raymond Veldhuis University of Twente University of Twente Fac. EEMCS, Signals and Systems Group Fac. EEMCS, Signals

More information

Au = = = 3u. Aw = = = 2w. so the action of A on u and w is very easy to picture: it simply amounts to a stretching by 3 and 2, respectively.

Au = = = 3u. Aw = = = 2w. so the action of A on u and w is very easy to picture: it simply amounts to a stretching by 3 and 2, respectively. Chapter 7 Eigenvalues and Eigenvectors In this last chapter of our exploration of Linear Algebra we will revisit eigenvalues and eigenvectors of matrices, concepts that were already introduced in Geometry

More information

Linear Algebra Notes for Marsden and Tromba Vector Calculus

Linear Algebra Notes for Marsden and Tromba Vector Calculus Linear Algebra Notes for Marsden and Tromba Vector Calculus n-dimensional Euclidean Space and Matrices Definition of n space As was learned in Math b, a point in Euclidean three space can be thought of

More information

1. Introduction to multivariate data

1. Introduction to multivariate data . Introduction to multivariate data. Books Chat eld, C. and A.J.Collins, Introduction to multivariate analysis. Chapman & Hall Krzanowski, W.J. Principles of multivariate analysis. Oxford.000 Johnson,

More information

CITY UNIVERSITY LONDON. BEng Degree in Computer Systems Engineering Part II BSc Degree in Computer Systems Engineering Part III PART 2 EXAMINATION

CITY UNIVERSITY LONDON. BEng Degree in Computer Systems Engineering Part II BSc Degree in Computer Systems Engineering Part III PART 2 EXAMINATION No: CITY UNIVERSITY LONDON BEng Degree in Computer Systems Engineering Part II BSc Degree in Computer Systems Engineering Part III PART 2 EXAMINATION ENGINEERING MATHEMATICS 2 (resit) EX2005 Date: August

More information

Linear Threshold Units

Linear Threshold Units Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear

More information

1 Introduction to Matrices

1 Introduction to Matrices 1 Introduction to Matrices In this section, important definitions and results from matrix algebra that are useful in regression analysis are introduced. While all statements below regarding the columns

More information

F Matrix Calculus F 1

F Matrix Calculus F 1 F Matrix Calculus F 1 Appendix F: MATRIX CALCULUS TABLE OF CONTENTS Page F1 Introduction F 3 F2 The Derivatives of Vector Functions F 3 F21 Derivative of Vector with Respect to Vector F 3 F22 Derivative

More information

MATH 551 - APPLIED MATRIX THEORY

MATH 551 - APPLIED MATRIX THEORY MATH 55 - APPLIED MATRIX THEORY FINAL TEST: SAMPLE with SOLUTIONS (25 points NAME: PROBLEM (3 points A web of 5 pages is described by a directed graph whose matrix is given by A Do the following ( points

More information

Similarity and Diagonalization. Similar Matrices

Similarity and Diagonalization. Similar Matrices MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that

More information

Multivariate Analysis of Variance (MANOVA): I. Theory

Multivariate Analysis of Variance (MANOVA): I. Theory Gregory Carey, 1998 MANOVA: I - 1 Multivariate Analysis of Variance (MANOVA): I. Theory Introduction The purpose of a t test is to assess the likelihood that the means for two groups are sampled from the

More information

Inner Product Spaces

Inner Product Spaces Math 571 Inner Product Spaces 1. Preliminaries An inner product space is a vector space V along with a function, called an inner product which associates each pair of vectors u, v with a scalar u, v, and

More information

Section 6.1 - Inner Products and Norms

Section 6.1 - Inner Products and Norms Section 6.1 - Inner Products and Norms Definition. Let V be a vector space over F {R, C}. An inner product on V is a function that assigns, to every ordered pair of vectors x and y in V, a scalar in F,

More information

1 2 3 1 1 2 x = + x 2 + x 4 1 0 1

1 2 3 1 1 2 x = + x 2 + x 4 1 0 1 (d) If the vector b is the sum of the four columns of A, write down the complete solution to Ax = b. 1 2 3 1 1 2 x = + x 2 + x 4 1 0 0 1 0 1 2. (11 points) This problem finds the curve y = C + D 2 t which

More information

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution SF2940: Probability theory Lecture 8: Multivariate Normal Distribution Timo Koski 24.09.2015 Timo Koski Matematisk statistik 24.09.2015 1 / 1 Learning outcomes Random vectors, mean vector, covariance matrix,

More information

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution SF2940: Probability theory Lecture 8: Multivariate Normal Distribution Timo Koski 24.09.2014 Timo Koski () Mathematisk statistik 24.09.2014 1 / 75 Learning outcomes Random vectors, mean vector, covariance

More information

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

More information

Exact Nonparametric Tests for Comparing Means - A Personal Summary

Exact Nonparametric Tests for Comparing Means - A Personal Summary Exact Nonparametric Tests for Comparing Means - A Personal Summary Karl H. Schlag European University Institute 1 December 14, 2006 1 Economics Department, European University Institute. Via della Piazzuola

More information

Notes on Determinant

Notes on Determinant ENGG2012B Advanced Engineering Mathematics Notes on Determinant Lecturer: Kenneth Shum Lecture 9-18/02/2013 The determinant of a system of linear equations determines whether the solution is unique, without

More information

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1. MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column

More information

MATH 423 Linear Algebra II Lecture 38: Generalized eigenvectors. Jordan canonical form (continued).

MATH 423 Linear Algebra II Lecture 38: Generalized eigenvectors. Jordan canonical form (continued). MATH 423 Linear Algebra II Lecture 38: Generalized eigenvectors Jordan canonical form (continued) Jordan canonical form A Jordan block is a square matrix of the form λ 1 0 0 0 0 λ 1 0 0 0 0 λ 0 0 J = 0

More information

LINEAR ALGEBRA. September 23, 2010

LINEAR ALGEBRA. September 23, 2010 LINEAR ALGEBRA September 3, 00 Contents 0. LU-decomposition.................................... 0. Inverses and Transposes................................. 0.3 Column Spaces and NullSpaces.............................

More information

Gaussian Conjugate Prior Cheat Sheet

Gaussian Conjugate Prior Cheat Sheet Gaussian Conjugate Prior Cheat Sheet Tom SF Haines 1 Purpose This document contains notes on how to handle the multivariate Gaussian 1 in a Bayesian setting. It focuses on the conjugate prior, its Bayesian

More information

Chapter 12 Modal Decomposition of State-Space Models 12.1 Introduction The solutions obtained in previous chapters, whether in time domain or transfor

Chapter 12 Modal Decomposition of State-Space Models 12.1 Introduction The solutions obtained in previous chapters, whether in time domain or transfor Lectures on Dynamic Systems and Control Mohammed Dahleh Munther A. Dahleh George Verghese Department of Electrical Engineering and Computer Science Massachuasetts Institute of Technology 1 1 c Chapter

More information

y t by left multiplication with 1 (L) as y t = 1 (L) t =ª(L) t 2.5 Variance decomposition and innovation accounting Consider the VAR(p) model where

y t by left multiplication with 1 (L) as y t = 1 (L) t =ª(L) t 2.5 Variance decomposition and innovation accounting Consider the VAR(p) model where . Variance decomposition and innovation accounting Consider the VAR(p) model where (L)y t = t, (L) =I m L L p L p is the lag polynomial of order p with m m coe±cient matrices i, i =,...p. Provided that

More information

160 CHAPTER 4. VECTOR SPACES

160 CHAPTER 4. VECTOR SPACES 160 CHAPTER 4. VECTOR SPACES 4. Rank and Nullity In this section, we look at relationships between the row space, column space, null space of a matrix and its transpose. We will derive fundamental results

More information

Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components

Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components The eigenvalues and eigenvectors of a square matrix play a key role in some important operations in statistics. In particular, they

More information

CAPM, Arbitrage, and Linear Factor Models

CAPM, Arbitrage, and Linear Factor Models CAPM, Arbitrage, and Linear Factor Models CAPM, Arbitrage, Linear Factor Models 1/ 41 Introduction We now assume all investors actually choose mean-variance e cient portfolios. By equating these investors

More information

Basics of Statistical Machine Learning

Basics of Statistical Machine Learning CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar

More information

Linear Algebra Review. Vectors

Linear Algebra Review. Vectors Linear Algebra Review By Tim K. Marks UCSD Borrows heavily from: Jana Kosecka kosecka@cs.gmu.edu http://cs.gmu.edu/~kosecka/cs682.html Virginia de Sa Cogsci 8F Linear Algebra review UCSD Vectors The length

More information

c 2008 Je rey A. Miron We have described the constraints that a consumer faces, i.e., discussed the budget constraint.

c 2008 Je rey A. Miron We have described the constraints that a consumer faces, i.e., discussed the budget constraint. Lecture 2b: Utility c 2008 Je rey A. Miron Outline: 1. Introduction 2. Utility: A De nition 3. Monotonic Transformations 4. Cardinal Utility 5. Constructing a Utility Function 6. Examples of Utility Functions

More information

Inner Product Spaces and Orthogonality

Inner Product Spaces and Orthogonality Inner Product Spaces and Orthogonality week 3-4 Fall 2006 Dot product of R n The inner product or dot product of R n is a function, defined by u, v a b + a 2 b 2 + + a n b n for u a, a 2,, a n T, v b,

More information

Multivariate Normal Distribution

Multivariate Normal Distribution Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #4-7/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues

More information

The Determinant: a Means to Calculate Volume

The Determinant: a Means to Calculate Volume The Determinant: a Means to Calculate Volume Bo Peng August 20, 2007 Abstract This paper gives a definition of the determinant and lists many of its well-known properties Volumes of parallelepipeds are

More information

Review Jeopardy. Blue vs. Orange. Review Jeopardy

Review Jeopardy. Blue vs. Orange. Review Jeopardy Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round $200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

More information

1 Another method of estimation: least squares

1 Another method of estimation: least squares 1 Another method of estimation: least squares erm: -estim.tex, Dec8, 009: 6 p.m. (draft - typos/writos likely exist) Corrections, comments, suggestions welcome. 1.1 Least squares in general Assume Y i

More information

13 MATH FACTS 101. 2 a = 1. 7. The elements of a vector have a graphical interpretation, which is particularly easy to see in two or three dimensions.

13 MATH FACTS 101. 2 a = 1. 7. The elements of a vector have a graphical interpretation, which is particularly easy to see in two or three dimensions. 3 MATH FACTS 0 3 MATH FACTS 3. Vectors 3.. Definition We use the overhead arrow to denote a column vector, i.e., a linear segment with a direction. For example, in three-space, we write a vector in terms

More information

Some probability and statistics

Some probability and statistics Appendix A Some probability and statistics A Probabilities, random variables and their distribution We summarize a few of the basic concepts of random variables, usually denoted by capital letters, X,Y,

More information

DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II. Matrix Algorithms DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

More information

NOTES ON LINEAR TRANSFORMATIONS

NOTES ON LINEAR TRANSFORMATIONS NOTES ON LINEAR TRANSFORMATIONS Definition 1. Let V and W be vector spaces. A function T : V W is a linear transformation from V to W if the following two properties hold. i T v + v = T v + T v for all

More information

Vector and Matrix Norms

Vector and Matrix Norms Chapter 1 Vector and Matrix Norms 11 Vector Spaces Let F be a field (such as the real numbers, R, or complex numbers, C) with elements called scalars A Vector Space, V, over the field F is a non-empty

More information

Similar matrices and Jordan form

Similar matrices and Jordan form Similar matrices and Jordan form We ve nearly covered the entire heart of linear algebra once we ve finished singular value decompositions we ll have seen all the most central topics. A T A is positive

More information

Understanding and Applying Kalman Filtering

Understanding and Applying Kalman Filtering Understanding and Applying Kalman Filtering Lindsay Kleeman Department of Electrical and Computer Systems Engineering Monash University, Clayton 1 Introduction Objectives: 1. Provide a basic understanding

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby

More information

Chapter 6. Orthogonality

Chapter 6. Orthogonality 6.3 Orthogonal Matrices 1 Chapter 6. Orthogonality 6.3 Orthogonal Matrices Definition 6.4. An n n matrix A is orthogonal if A T A = I. Note. We will see that the columns of an orthogonal matrix must be

More information

Bias in the Estimation of Mean Reversion in Continuous-Time Lévy Processes

Bias in the Estimation of Mean Reversion in Continuous-Time Lévy Processes Bias in the Estimation of Mean Reversion in Continuous-Time Lévy Processes Yong Bao a, Aman Ullah b, Yun Wang c, and Jun Yu d a Purdue University, IN, USA b University of California, Riverside, CA, USA

More information

MATHEMATICS FOR ENGINEERS BASIC MATRIX THEORY TUTORIAL 2

MATHEMATICS FOR ENGINEERS BASIC MATRIX THEORY TUTORIAL 2 MATHEMATICS FO ENGINEES BASIC MATIX THEOY TUTOIAL This is the second of two tutorials on matrix theory. On completion you should be able to do the following. Explain the general method for solving simultaneous

More information

Matrix Differentiation

Matrix Differentiation 1 Introduction Matrix Differentiation ( and some other stuff ) Randal J. Barnes Department of Civil Engineering, University of Minnesota Minneapolis, Minnesota, USA Throughout this presentation I have

More information

Chapter 3: The Multiple Linear Regression Model

Chapter 3: The Multiple Linear Regression Model Chapter 3: The Multiple Linear Regression Model Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans November 23, 2013 Christophe Hurlin (University of Orléans) Advanced Econometrics

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

More information

Recall that two vectors in are perpendicular or orthogonal provided that their dot

Recall that two vectors in are perpendicular or orthogonal provided that their dot Orthogonal Complements and Projections Recall that two vectors in are perpendicular or orthogonal provided that their dot product vanishes That is, if and only if Example 1 The vectors in are orthogonal

More information

11 Linear and Quadratic Discriminant Analysis, Logistic Regression, and Partial Least Squares Regression

11 Linear and Quadratic Discriminant Analysis, Logistic Regression, and Partial Least Squares Regression Frank C Porter and Ilya Narsky: Statistical Analysis Techniques in Particle Physics Chap. c11 2013/9/9 page 221 le-tex 221 11 Linear and Quadratic Discriminant Analysis, Logistic Regression, and Partial

More information

Multivariate normal distribution and testing for means (see MKB Ch 3)

Multivariate normal distribution and testing for means (see MKB Ch 3) Multivariate normal distribution and testing for means (see MKB Ch 3) Where are we going? 2 One-sample t-test (univariate).................................................. 3 Two-sample t-test (univariate).................................................

More information

CONTROLLABILITY. Chapter 2. 2.1 Reachable Set and Controllability. Suppose we have a linear system described by the state equation

CONTROLLABILITY. Chapter 2. 2.1 Reachable Set and Controllability. Suppose we have a linear system described by the state equation Chapter 2 CONTROLLABILITY 2 Reachable Set and Controllability Suppose we have a linear system described by the state equation ẋ Ax + Bu (2) x() x Consider the following problem For a given vector x in

More information

December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B. KITCHENS

December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B. KITCHENS December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B KITCHENS The equation 1 Lines in two-dimensional space (1) 2x y = 3 describes a line in two-dimensional space The coefficients of x and y in the equation

More information

by the matrix A results in a vector which is a reflection of the given

by the matrix A results in a vector which is a reflection of the given Eigenvalues & Eigenvectors Example Suppose Then So, geometrically, multiplying a vector in by the matrix A results in a vector which is a reflection of the given vector about the y-axis We observe that

More information

Inner products on R n, and more

Inner products on R n, and more Inner products on R n, and more Peyam Ryan Tabrizian Friday, April 12th, 2013 1 Introduction You might be wondering: Are there inner products on R n that are not the usual dot product x y = x 1 y 1 + +

More information

UNCOUPLING THE PERRON EIGENVECTOR PROBLEM

UNCOUPLING THE PERRON EIGENVECTOR PROBLEM UNCOUPLING THE PERRON EIGENVECTOR PROBLEM Carl D Meyer INTRODUCTION Foranonnegative irreducible matrix m m with spectral radius ρ,afundamental problem concerns the determination of the unique normalized

More information

1.2 Solving a System of Linear Equations

1.2 Solving a System of Linear Equations 1.. SOLVING A SYSTEM OF LINEAR EQUATIONS 1. Solving a System of Linear Equations 1..1 Simple Systems - Basic De nitions As noticed above, the general form of a linear system of m equations in n variables

More information

Chapter 1. Vector autoregressions. 1.1 VARs and the identi cation problem

Chapter 1. Vector autoregressions. 1.1 VARs and the identi cation problem Chapter Vector autoregressions We begin by taking a look at the data of macroeconomics. A way to summarize the dynamics of macroeconomic data is to make use of vector autoregressions. VAR models have become

More information

a 11 x 1 + a 12 x 2 + + a 1n x n = b 1 a 21 x 1 + a 22 x 2 + + a 2n x n = b 2.

a 11 x 1 + a 12 x 2 + + a 1n x n = b 1 a 21 x 1 + a 22 x 2 + + a 2n x n = b 2. Chapter 1 LINEAR EQUATIONS 1.1 Introduction to linear equations A linear equation in n unknowns x 1, x,, x n is an equation of the form a 1 x 1 + a x + + a n x n = b, where a 1, a,..., a n, b are given

More information

October 3rd, 2012. Linear Algebra & Properties of the Covariance Matrix

October 3rd, 2012. Linear Algebra & Properties of the Covariance Matrix Linear Algebra & Properties of the Covariance Matrix October 3rd, 2012 Estimation of r and C Let rn 1, rn, t..., rn T be the historical return rates on the n th asset. rn 1 rṇ 2 r n =. r T n n = 1, 2,...,

More information

Bayes and Naïve Bayes. cs534-machine Learning

Bayes and Naïve Bayes. cs534-machine Learning Bayes and aïve Bayes cs534-machine Learning Bayes Classifier Generative model learns Prediction is made by and where This is often referred to as the Bayes Classifier, because of the use of the Bayes rule

More information

Lecture 1: Systems of Linear Equations

Lecture 1: Systems of Linear Equations MTH Elementary Matrix Algebra Professor Chao Huang Department of Mathematics and Statistics Wright State University Lecture 1 Systems of Linear Equations ² Systems of two linear equations with two variables

More information

Introduction to Matrices for Engineers

Introduction to Matrices for Engineers Introduction to Matrices for Engineers C.T.J. Dodson, School of Mathematics, Manchester Universit 1 What is a Matrix? A matrix is a rectangular arra of elements, usuall numbers, e.g. 1 0-8 4 0-1 1 0 11

More information

Linear Algebra I. Ronald van Luijk, 2012

Linear Algebra I. Ronald van Luijk, 2012 Linear Algebra I Ronald van Luijk, 2012 With many parts from Linear Algebra I by Michael Stoll, 2007 Contents 1. Vector spaces 3 1.1. Examples 3 1.2. Fields 4 1.3. The field of complex numbers. 6 1.4.

More information

Matrices 2. Solving Square Systems of Linear Equations; Inverse Matrices

Matrices 2. Solving Square Systems of Linear Equations; Inverse Matrices Matrices 2. Solving Square Systems of Linear Equations; Inverse Matrices Solving square systems of linear equations; inverse matrices. Linear algebra is essentially about solving systems of linear equations,

More information

The Bivariate Normal Distribution

The Bivariate Normal Distribution The Bivariate Normal Distribution This is Section 4.7 of the st edition (2002) of the book Introduction to Probability, by D. P. Bertsekas and J. N. Tsitsiklis. The material in this section was not included

More information

Section 2.4: Equations of Lines and Planes

Section 2.4: Equations of Lines and Planes Section.4: Equations of Lines and Planes An equation of three variable F (x, y, z) 0 is called an equation of a surface S if For instance, (x 1, y 1, z 1 ) S if and only if F (x 1, y 1, z 1 ) 0. x + y

More information

Orthogonal Projections

Orthogonal Projections Orthogonal Projections and Reflections (with exercises) by D. Klain Version.. Corrections and comments are welcome! Orthogonal Projections Let X,..., X k be a family of linearly independent (column) vectors

More information

Linear Algebra Done Wrong. Sergei Treil. Department of Mathematics, Brown University

Linear Algebra Done Wrong. Sergei Treil. Department of Mathematics, Brown University Linear Algebra Done Wrong Sergei Treil Department of Mathematics, Brown University Copyright c Sergei Treil, 2004, 2009, 2011, 2014 Preface The title of the book sounds a bit mysterious. Why should anyone

More information

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning. Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott

More information

LS.6 Solution Matrices

LS.6 Solution Matrices LS.6 Solution Matrices In the literature, solutions to linear systems often are expressed using square matrices rather than vectors. You need to get used to the terminology. As before, we state the definitions

More information

Partial Derivatives. @x f (x; y) = @ x f (x; y) @x x2 y + @ @x y2 and then we evaluate the derivative as if y is a constant.

Partial Derivatives. @x f (x; y) = @ x f (x; y) @x x2 y + @ @x y2 and then we evaluate the derivative as if y is a constant. Partial Derivatives Partial Derivatives Just as derivatives can be used to eplore the properties of functions of 1 variable, so also derivatives can be used to eplore functions of 2 variables. In this

More information

Orthogonal Diagonalization of Symmetric Matrices

Orthogonal Diagonalization of Symmetric Matrices MATH10212 Linear Algebra Brief lecture notes 57 Gram Schmidt Process enables us to find an orthogonal basis of a subspace. Let u 1,..., u k be a basis of a subspace V of R n. We begin the process of finding

More information

Math 312 Homework 1 Solutions

Math 312 Homework 1 Solutions Math 31 Homework 1 Solutions Last modified: July 15, 01 This homework is due on Thursday, July 1th, 01 at 1:10pm Please turn it in during class, or in my mailbox in the main math office (next to 4W1) Please

More information

Normalization and Mixed Degrees of Integration in Cointegrated Time Series Systems

Normalization and Mixed Degrees of Integration in Cointegrated Time Series Systems Normalization and Mixed Degrees of Integration in Cointegrated Time Series Systems Robert J. Rossana Department of Economics, 04 F/AB, Wayne State University, Detroit MI 480 E-Mail: r.j.rossana@wayne.edu

More information

2x + y = 3. Since the second equation is precisely the same as the first equation, it is enough to find x and y satisfying the system

2x + y = 3. Since the second equation is precisely the same as the first equation, it is enough to find x and y satisfying the system 1. Systems of linear equations We are interested in the solutions to systems of linear equations. A linear equation is of the form 3x 5y + 2z + w = 3. The key thing is that we don t multiply the variables

More information

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September

More information

Chapter 17. Orthogonal Matrices and Symmetries of Space

Chapter 17. Orthogonal Matrices and Symmetries of Space Chapter 17. Orthogonal Matrices and Symmetries of Space Take a random matrix, say 1 3 A = 4 5 6, 7 8 9 and compare the lengths of e 1 and Ae 1. The vector e 1 has length 1, while Ae 1 = (1, 4, 7) has length

More information

Linear Algebra Done Wrong. Sergei Treil. Department of Mathematics, Brown University

Linear Algebra Done Wrong. Sergei Treil. Department of Mathematics, Brown University Linear Algebra Done Wrong Sergei Treil Department of Mathematics, Brown University Copyright c Sergei Treil, 2004, 2009, 2011, 2014 Preface The title of the book sounds a bit mysterious. Why should anyone

More information

Mathematics Course 111: Algebra I Part IV: Vector Spaces

Mathematics Course 111: Algebra I Part IV: Vector Spaces Mathematics Course 111: Algebra I Part IV: Vector Spaces D. R. Wilkins Academic Year 1996-7 9 Vector Spaces A vector space over some field K is an algebraic structure consisting of a set V on which are

More information

Linear Algebra and TI 89

Linear Algebra and TI 89 Linear Algebra and TI 89 Abdul Hassen and Jay Schiffman This short manual is a quick guide to the use of TI89 for Linear Algebra. We do this in two sections. In the first section, we will go over the editing

More information

An Introduction to the Kalman Filter

An Introduction to the Kalman Filter An Introduction to the Kalman Filter Greg Welch 1 and Gary Bishop 2 TR 95041 Department of Computer Science University of North Carolina at Chapel Hill Chapel Hill, NC 275993175 Updated: Monday, July 24,

More information

T ( a i x i ) = a i T (x i ).

T ( a i x i ) = a i T (x i ). Chapter 2 Defn 1. (p. 65) Let V and W be vector spaces (over F ). We call a function T : V W a linear transformation form V to W if, for all x, y V and c F, we have (a) T (x + y) = T (x) + T (y) and (b)

More information

[1] Diagonal factorization

[1] Diagonal factorization 8.03 LA.6: Diagonalization and Orthogonal Matrices [ Diagonal factorization [2 Solving systems of first order differential equations [3 Symmetric and Orthonormal Matrices [ Diagonal factorization Recall:

More information

Applied Linear Algebra I Review page 1

Applied Linear Algebra I Review page 1 Applied Linear Algebra Review 1 I. Determinants A. Definition of a determinant 1. Using sum a. Permutations i. Sign of a permutation ii. Cycle 2. Uniqueness of the determinant function in terms of properties

More information

Spatial Statistics Chapter 3 Basics of areal data and areal data modeling

Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Recall areal data also known as lattice data are data Y (s), s D where D is a discrete index set. This usually corresponds to data

More information

Lecture L3 - Vectors, Matrices and Coordinate Transformations

Lecture L3 - Vectors, Matrices and Coordinate Transformations S. Widnall 16.07 Dynamics Fall 2009 Lecture notes based on J. Peraire Version 2.0 Lecture L3 - Vectors, Matrices and Coordinate Transformations By using vectors and defining appropriate operations between

More information

The Matrix Elements of a 3 3 Orthogonal Matrix Revisited

The Matrix Elements of a 3 3 Orthogonal Matrix Revisited Physics 116A Winter 2011 The Matrix Elements of a 3 3 Orthogonal Matrix Revisited 1. Introduction In a class handout entitled, Three-Dimensional Proper and Improper Rotation Matrices, I provided a derivation

More information

Statistics Graduate Courses

Statistics Graduate Courses Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.

More information

Representation of functions as power series

Representation of functions as power series Representation of functions as power series Dr. Philippe B. Laval Kennesaw State University November 9, 008 Abstract This document is a summary of the theory and techniques used to represent functions

More information

Epipolar Geometry. Readings: See Sections 10.1 and 15.6 of Forsyth and Ponce. Right Image. Left Image. e(p ) Epipolar Lines. e(q ) q R.

Epipolar Geometry. Readings: See Sections 10.1 and 15.6 of Forsyth and Ponce. Right Image. Left Image. e(p ) Epipolar Lines. e(q ) q R. Epipolar Geometry We consider two perspective images of a scene as taken from a stereo pair of cameras (or equivalently, assume the scene is rigid and imaged with a single camera from two different locations).

More information