ECE 275B Homework # 3 Solutions Winter 2015
1. Kay.

   I) Use of the prior pdf p(θ) = δ(θ − θ₀) means that we are 100% certain that θ = θ₀. In this case one computes θ̂_MSE = θ̂_MAP = θ₀ regardless of the quality and quantity of the subsequently collected data. This provides a model of stubbornness.

   II) Contrariwise, an uninformative prior (typified by the uniform prior) says that we have no a priori knowledge of the parameter. In this case only the data is informative and we obtain θ̂_MAP = θ̂_MLE. (For symmetric, unimodal posterior densities we also have θ̂_MSE = θ̂_MAP = θ̂_MLE.)

   III) Parameters of the prior pdf (θ₀ in the case considered above) are so-called hyperparameters, i.e., meta-level parameters.

   Footnote 1: Note that θ₀ parameterizes the prior pdf and is assumed known and fixed in value. On the other hand, θ is variable and represents any admissible value of the unknown rv Θ(ω).

2. Kay.3

   Let X = {x[0], …, x[N−1]}, x̄ = (1/N) Σ_{k=0}^{N−1} x[k], and x_min = min_k x[k]. For the shifted-exponential data model p(x[k]|θ) = e^{−(x[k]−θ)} χ{x[k] > θ},

       p(x|θ) = Π_{k=0}^{N−1} p(x[k]|θ) = e^{−N(x̄−θ)} χ{θ < x_min}.

   Therefore, with the unit-exponential prior p(θ) = e^{−θ} χ{θ > 0},

       p(x, θ) = p(x|θ) p(θ) = e^{−N x̄ + (N−1)θ} χ{0 < θ < x_min}.

   From

       p(θ|X) = p(θ, X) / p(x) = p(θ, X) / ∫ p(θ, X) dθ,

   we obtain

       θ̂_MSE = E{θ|X} = ∫ θ p(θ|X) dθ
              = ∫₀^{x_min} θ e^{(N−1)θ} dθ / ∫₀^{x_min} e^{(N−1)θ} dθ
              = x_min / (1 − e^{−(N−1) x_min}) − 1/(N−1).

   This formula breaks down for N = 1, and that case must be separately derived.
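The closed-form posterior mean above is easy to sanity-check numerically. The following sketch (ours, not part of the original solution; the function names are illustrative) compares the closed form against direct trapezoidal integration of the unnormalized posterior e^{(N−1)θ} on (0, x_min):

```python
import numpy as np

def theta_mse_exp(x_min, N):
    """Closed form: x_min/(1 - exp(-(N-1)x_min)) - 1/(N-1), valid for N >= 2."""
    c = N - 1
    return x_min / (1.0 - np.exp(-c * x_min)) - 1.0 / c

def theta_mse_exp_numeric(x_min, N, num=100001):
    """Posterior mean by trapezoidal integration of exp((N-1)*theta) on (0, x_min)."""
    theta = np.linspace(0.0, x_min, num)
    w = np.exp((N - 1) * theta)
    w[0] *= 0.5
    w[-1] *= 0.5                      # trapezoid end-point weights
    # The common grid spacing cancels in the ratio of the two integrals.
    return np.sum(theta * w) / np.sum(w)

x_min, N = 0.7, 5
assert abs(theta_mse_exp(x_min, N) - theta_mse_exp_numeric(x_min, N)) < 1e-6
```

Because the posterior weight e^{(N−1)θ} increases with θ, the estimate is pulled toward x_min for large N, as expected.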
3. Kay.4

   Let X = {x[0], …, x[N−1]}, x_max = max_k x[k], and x_min = min_k x[k]. For the uniform data model p(x[k]|θ) = (1/θ) χ{0 ≤ x[k] ≤ θ},

       p(x|θ) = Π_{k=0}^{N−1} p(x[k]|θ) = (1/θ^N) χ{0 ≤ x_min ≤ x_max ≤ θ}.

   And therefore, with the uniform prior p(θ) = (1/β) χ{0 ≤ θ ≤ β},

       p(x, θ) = p(x|θ) p(θ) = (1/(β θ^N)) χ{0 ≤ x_min ≤ x_max ≤ θ ≤ β}.

   It is readily determined that

       p(θ|X) = (1−N) θ^{−N} / (β^{1−N} − x_max^{1−N}) · χ{x_max ≤ θ ≤ β},

   from which is obtained

       θ̂_MSE = E{θ|X} = (1−N)/(2−N) · (β^{2−N} − x_max^{2−N}) / (β^{1−N} − x_max^{1−N}).

   In the limit as β → ∞ (for which the prior becomes uninformative), we get, for N > 2,

       θ̂_MSE = (N−1)/(N−2) x_max.

4. Kay.

   This problem is equivalent to the corresponding Kay Example, and the answer is thus given by the equation derived there et seq. Because the posterior density is symmetric and unimodal (because it is Gaussian), the MMSE and MAP estimators are the same.

5. Kay.2

   Notice that the two Gaussian components of the bimodal mixture distribution have the same variance and are symmetrically located about the origin at the points x and −x. From symmetry considerations, then, it is obvious that for ε = 0.5 we have θ̂_MSE = 0, while θ̂_MAP is unable to distinguish between the two equally probable maxima located at the points x and −x. For the case ε = 0.75 the symmetry is broken, and it is obvious that the Gaussian component located at position x dominates, so that θ̂_MAP = x. It is easily found that the general solution for the MMSE estimator is given by the convex combination

       θ̂_MSE = ε x + (1−ε)(−x) = (2ε−1) x.

   Thus for ε = 0.75 we obtain θ̂_MSE = x/2.
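The uniform-prior posterior mean above, and its uninformative limit, can be checked the same way. The sketch below (ours, not part of the original solution; function names are illustrative) integrates the unnormalized posterior θ^{−N} over (x_max, β) and compares against the closed form:

```python
import numpy as np

def theta_mse_uniform(x_max, N, beta):
    """Closed form: (1-N)/(2-N) * (beta^(2-N) - x_max^(2-N)) / (beta^(1-N) - x_max^(1-N))."""
    num = (beta ** (2 - N) - x_max ** (2 - N)) / (2 - N)
    den = (beta ** (1 - N) - x_max ** (1 - N)) / (1 - N)
    return num / den

def theta_mse_uniform_numeric(x_max, N, beta, num_pts=100001):
    """Posterior mean by trapezoidal integration of theta^(-N) on (x_max, beta)."""
    theta = np.linspace(x_max, beta, num_pts)
    w = theta ** (-float(N))
    w[0] *= 0.5
    w[-1] *= 0.5                      # trapezoid end-point weights
    return np.sum(theta * w) / np.sum(w)

x_max, N, beta = 1.0, 5, 50.0
assert abs(theta_mse_uniform(x_max, N, beta)
           - theta_mse_uniform_numeric(x_max, N, beta)) < 1e-4
# As beta grows, the estimate approaches the uninformative limit (N-1)/(N-2) * x_max:
assert abs(theta_mse_uniform(x_max, N, 1e6) - (N - 1) / (N - 2) * x_max) < 1e-6
```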
6. Kay.

   Recall that consistency is defined to be convergence in probability, and that mean-square convergence is a sufficient condition for convergence in probability. The MAP estimator θ̂(N) is derived in the Kay Example cited in the problem. Note that in the limit as N gets large we have

       θ̂(N) → x̄(N) = (1/N) Σ_{n=0}^{N−1} x[n].

   It is readily shown that x̄(N) is an unbiased estimator of θ with a mean-square error which goes like 1/N as N → ∞. Thus x̄(N) is a consistent estimator of θ. Therefore, by the carry-over property for convergence in probability, the estimator θ̂(N) is a consistent estimator of θ.

   In general, as N → ∞, Bayesian estimators (such as the MMSE and MAP estimators) generally become equivalent to the MLE (as the data eventually overwhelms our prior belief about θ), which, in turn, is generally consistent.

7. Kay.8

   From Kay we know that the MMSE estimate is given by

       ŝ = C (C + σ² I)^{−1} x,

   where C is the autocovariance matrix of the vector s (Footnote 2). Because we have a matrix equation, we must place the above equation into the requested form using noncommutative matrix operations (Footnote 3). We have that

       (C + σ² I) ŝ = (C + σ² I) C (C + σ² I)^{−1} x
                    = (C + σ² I) [ (C + σ² I) − σ² I ] (C + σ² I)^{−1} x
                    = (C + σ² I) [ I − σ² (C + σ² I)^{−1} ] x
                    = [ C + σ² I − σ² I ] x
                    = C x.

   Footnote 2: Recall that the vector s contains samples s[n] of the zero-mean process s(t), and therefore the elements of C are samples of the autocorrelation function of s(t). Similarly, the vector x is made up of samples from the noisy measurement process x(t) = s(t) + w(t).

   Footnote 3: On an exam you would lose points if you assumed commutativity where it is not permissible.

   This equation is known as the Wiener–Hopf (W–H) equation, the solution of which yields the optimal, noncausal Wiener filter. It is straightforward to solve for the (noncausal) solution to the W–H equation using the procedure outlined by Kay. As noted by Kay, the W–H equation can be written out at the component level as shown on page 374. By allowing the lower and upper limits of the shown sums to extend to −∞ and +∞ respectively, we are considering the entire sampled time series of the process s(t). (This is equivalent to extending the N × N matrix C in the vector-matrix W–H equation given above to an infinite-dimensional matrix.) Taking the (formal) Fourier transform of the resulting infinitely-summed equation, we obtain

       Φ_xx(f) H(f) = Φ_sx(f).

   This yields the noncausal Wiener filter

       H(f) = Ŝ(f)/X(f) = Φ_sx(f)/Φ_xx(f) = Φ_ss(f) / (Φ_ss(f) + σ²).

   The last step follows because

       x[n] = s[n] + w[n],

   where s and w are independent and w is a white noise process with variance σ², so that Φ_sx(f) = Φ_ss(f) and Φ_xx(f) = Φ_ss(f) + σ².

   The inverse Fourier transform of H(f) will yield the discrete-time-domain Wiener filter. In general (without additional assumptions or constraints) this filter will be IIR, noncausal, and infinite dimensional. To obtain a causal filter, the (complicated) technique of spectral factorization is used. To obtain a finite-dimensional Wiener filter, the process s is further assumed to have a so-called rational spectrum. (The Kalman filter is a procedure to obtain a finite-dimensional, causal, optimal filter directly in the time domain, thereby side-stepping the difficult issues involved in rational spectral approximation and factorization.) To obtain an FIR Wiener filter, rather than an IIR filter, an FIR filter structure is assumed at the outset, prior to deriving the W–H equation. An excellent introduction to, and discussion of, these issues is given in Introduction to Optimal Estimation, E. W. Kamen and J. K. Su, Springer, 1999. More advanced issues pertaining to the relationship between spectral factorization, covariance factorization, Wiener filtering, and Kalman filtering are discussed in Optimal Filtering, B. D. O. Anderson and J. B. Moore, Prentice-Hall, 1979.

   The Wiener filter can also be written as

       H(f) = 1 / (1 + σ²/Φ_ss(f)).

   Thus the optimal solution is given in the frequency domain by

       Ŝ(f) = [ Φ_ss(f) / (Φ_ss(f) + σ²) ] X(f)        (1)
            = [ 1 / (1 + σ²/Φ_ss(f)) ] X(f).           (2)

   Note from (1) that the optimal Wiener filter attempts to reconstruct the signal S(f) by weighting the measured noisy signal X(f) = S(f) + W(f) proportionately to the fraction of power in X(f) that is due to S(f) at each frequency f (the remaining fraction of the power at that frequency being due to the corrupting noise W(f)). Note from (2) that at frequencies for which there is a high signal-to-noise ratio (SNR), Φ_ss(f)/σ² ≫ 1, the measured signal X(f) (being therefore mostly comprised of the desired signal S(f)) is passed almost unattenuated, while at frequencies with a low SNR, Φ_ss(f)/σ² ≪ 1, the measured signal X(f) is greatly attenuated (being comprised in this instance mainly of the noise W(f)).

8. Note that, via an application of a matrix identity proved last quarter, the first Kay equation cited in the problem can be readily re-expressed as the second. Now note that taking µ_θ = 0 and C_θ^{−1} → 0 (which corresponds to assuming a noninformative prior for θ) yields the classical Gauss–Markov solution of Kay Chapter 6. The cited Kay theorem is the Bayesian form of the Gauss–Markov theorem.

9. Part a). Because of the symmetry of the posterior density function we have that

       θ̂_mmse = θ̂_map = arg max_θ p(θ|y)
               = arg max_θ p(y|θ) p(θ)
               = arg max_θ [ ln p(y|θ) + ln p(θ) ]
               = arg min_θ ‖y − Aθ‖²_W + ‖θ − θ₀‖²_{Σ^{−1}}.

   Thus the problem is equivalent to solving a weighted least-squares problem where the least-squares loss function is given by

       l(θ) = (y − Aθ)ᵀ W (y − Aθ) + (θ − θ₀)ᵀ Σ^{−1} (θ − θ₀)
            = (η − Āθ)ᵀ Λ (η − Āθ)
            = ‖η − Āθ‖²_Λ,

   where

       η = ( y  )     Ā = ( A )     Λ = ( W   0     )
           ( θ₀ ),        ( I ),        ( 0   Σ^{−1} ).

   Part b). We can now solve this problem using the deterministic weighted least-squares theory developed last quarter. The adjoint operator of Ā is given by

       Ā* = Āᵀ Λ = [ Aᵀ W ,  Σ^{−1} ],

   and its pseudoinverse is

       Ā⁺ = ( Aᵀ W A + Σ^{−1} )^{−1} [ Aᵀ W ,  Σ^{−1} ].

   The optimal estimate is therefore

       θ̂_mmse = θ̂_map = Ā⁺ η = ( Aᵀ W A + Σ^{−1} )^{−1} ( Aᵀ W y + Σ^{−1} θ₀ ).
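The equivalence just derived between the closed-form MAP estimate and the augmented weighted least-squares problem can be verified numerically. The sketch below (ours, not part of the original solution; all variable names are illustrative) whitens the augmented system with a square root of Λ and solves it with an ordinary least-squares routine:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 8, 3
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)
theta0 = rng.standard_normal(n)                # prior mean
W = np.diag(rng.uniform(0.5, 2.0, m))          # measurement weight matrix
Sigma_inv = np.diag(1.0 / rng.uniform(0.5, 2.0, n))   # inverse prior covariance

# Closed form: (A^T W A + Sigma^{-1})^{-1} (A^T W y + Sigma^{-1} theta0)
theta_map = np.linalg.solve(A.T @ W @ A + Sigma_inv, A.T @ W @ y + Sigma_inv @ theta0)

# Equivalent augmented weighted least squares: minimize ||eta - A_aug theta||^2_Lambda
A_aug = np.vstack([A, np.eye(n)])
eta = np.concatenate([y, theta0])
Lam = np.block([[W, np.zeros((m, n))], [np.zeros((n, m)), Sigma_inv]])

# Whiten: with Lam = L L^T, minimizing ||L^T (eta - A_aug theta)||^2 is the same problem.
L = np.linalg.cholesky(Lam)
theta_wls, *_ = np.linalg.lstsq(L.T @ A_aug, L.T @ eta, rcond=None)

assert np.allclose(theta_map, theta_wls)
```

The whitening step is exactly the "change of inner product" used in the adjoint computation: the Λ-weighted problem becomes an ordinary Euclidean least-squares problem in the transformed coordinates.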
   Note that for Σ = σ² I we have Σ^{−1} = I/σ² → 0 as σ² → ∞. Thus in the case of an uninformative prior we obtain the classical Gauss–Markov MLE solution,

       θ̂_mle = ( Aᵀ W A )^{−1} Aᵀ W y = A⁺ y.

10. Kay 2.

   Optimal Quadratic Estimator. For simplicity, denote the single sample x[0] by x. Note that we wish to approximate the highly nonlinear function of x,

       θ(x) = cos(2πx),

   by the simpler quadratic function of x,

       θ̂(x) = a x² + b x + c.

   A straightforward way to tackle this problem is to take the partial derivatives of the mean-square error

       E{ ( θ(x) − (a x² + b x + c) )² }

   with respect to the unknown parameters a, b, and c respectively, set each of these partial derivatives equal to zero, and solve the resulting three equations for the three unknown parameters.

   Alternatively, let us discuss this problem within the geometric framework that has been a constant key theme of the course. Note that θ lies in the vector space of all real scalar-valued nonlinear measurable functions of x (call this space RV), and that we wish to approximate θ within the subspace of all quadratic functions of x. This three-dimensional subspace is made up of the one-dimensional subspace of the constant random variables (which are all multiples of the number 1), the one-dimensional subspace of rv's which are multiples of x, and the one-dimensional subspace of rv's which are multiples of x². Since all the random variables of interest are square integrable, RV (and each subspace) can be interpreted as a Hilbert space with inner product

       ⟨θ₁, θ₂⟩ = E{θ₁ θ₂}.

   Therefore, we can apply our standard optimization-in-Hilbert-space arguments to obtain the optimal approximation.

   There are actually at least two (equivalent) ways that the problem can be framed within a Hilbert-space framework. In both, we will obtain an approximation of the form

       θ̂ = A y,

   for an appropriately defined data vector y.

   I. The first way that the problem can be posed was described last quarter in ECE 275A. Here we take y = (a b c)ᵀ ∈ ℝ³ and A = (x² x 1) : ℝ³ → RV. From this perspective, we are trying to find the Moore–Penrose pseudoinverse solution, ŷ⁺, to the problem θ = A y. Note that in this setup the linear operator A is known and y is unknown. Precisely this least-squares problem was discussed last quarter. Once ŷ⁺ is known, we can compute the desired approximation to θ as θ̂⁺ = A ŷ⁺. Note that the optimal estimate θ̂⁺ lies in the range of the operator A, R(A) = {quadratic functions of x}, which is a Hilbert subspace of RV.

   II. An alternative way to pose the problem, and the one looked for in the Bayesian framework, is to look for the best linear minimum mean-square estimator of θ from within the Hilbert subspace, L, of RV made up of random variables which are obtained as linear functions of the known random variable y = (x² x 1)ᵀ,

       L = { θ̂ | θ̂ = A y,  A = (a b c) } ⊂ RV.

   In this formulation of the problem, the linear operator A is unknown and y is known. Note that the Hilbert subspaces obtained here and in the previous paragraph are one and the same subspace, viz. the quadratic functions of x. Only our interpretation of the nature of this subspace has changed (and thus, necessarily, our definitions of the quantities A and y). (Footnote 4.)

   We will solve the problem using the second setup. The projection theorem demands that the optimal solution θ̂ₒ = Aₒ y satisfy (θ − θ̂ₒ) ⊥ L. As we discussed in class, because the error must be orthogonal to any random variable of the form A y regardless of the specific value of the operator A, we have the equivalent statement that

       E{ (θ − Aₒ y) yᵀ } = 0.

   Footnote 4: Note that we could alternatively have defined the rv z = (x² x)ᵀ, A = (a b), and then looked for the optimal affine estimator θ̂ = A z + c, the solution of which we know is given by θ̂ = m_θ + Σ_θz Σ_zz^{−1} (z − m_z). Using our second definition of y, we have that y = (zᵀ 1)ᵀ. This shows that by embedding z in a one-dimension-larger space we can restrict our search to the class of linear (rather than affine) estimators in the larger space. This is a standard trick.
   This yields, as expected,

       Aₒ = Σ_θy Σ_yy^{−1}.

   Let the j-th moment of x be µ_j = E{x^j}. Note that x is a zero-mean rv with an even (i.e., symmetric about the origin) pdf. Because x^j is an odd function for j odd, we have that µ₁ = µ₃ = 0. Also note that θ = θ(x) = cos(2πx) is an even function of x, so that x^j θ is an odd function of x for j odd, yielding (for j = 1) E{xθ} = 0. Finally, because x is uniformly and symmetrically distributed about the origin over the span −1/2 ≤ x ≤ 1/2, which is equal to 2π radians of the argument of cos(·), we have that E{θ} = 0. Via some straightforward integrations, it can also be ascertained that µ₂ = 1/12, µ₄ = 1/80, and E{x²θ} = −1/(2π²). These facts yield

       Σ_θy = [ −1/(2π²)   0   0 ],

   and

       Σ_yy = ( 1/80   0      1/12 )
              ( 0      1/12   0    )
              ( 1/12   0      1    ).

   Finally, we obtain

       Aₒ = Σ_θy Σ_yy^{−1} = ( −90/π²   0   15/(2π²) ) = ( âₒ   b̂ₒ   ĉₒ ).

   This yields the optimal quadratic estimator,

       θ̂_quad = −(90/π²) x² + 15/(2π²).

   Optimal Linear Estimator. Now we have y = (x 1)ᵀ and A = (b c). In this case we have that

       Σ_θy = [ E{xθ}   E{θ} ] = [ 0   0 ].

   Thus, regardless of the value of Σ_yy, we have Aₒ = Σ_θy Σ_yy^{−1} = [ 0  0 ], and the optimal linear estimator is given by

       θ̂_lin = 0.

   This makes sense, if you think about it, because the constant function of x, when integrated against θ(x), gives zero (i.e., is orthogonal to θ(x)), while a linear function of x times the even function θ(x) is an odd function of x, and therefore a linear function of x is also orthogonal to θ(x). Thus the projection of θ(x) onto any affine (linear plus constant) function of x must be zero.

   Optimal MMSE. We have that

       θ̂_MSE = E{θ|x} = E{cos(2πx)|x} = cos(2πx) = θ.

   Obviously, because θ is a deterministic function of x, knowledge of the value of x results in complete knowledge of the value of θ.

   Comparison of MSE's:

       MSE(linear) = 0.5 > MSE(quadratic) ≈ 0.038 > MSE(optimal) = 0.

11. Kay 2.8

   I) Note that α = Aθ + b implies that the means are related by m_α = A m_θ + b and the cross-covariances by Σ_αx = A Σ_θx. With these facts we have that

       α̂ = m_α + Σ_αx Σ_xx^{−1} (x − m_x)
          = (A m_θ + b) + A Σ_θx Σ_xx^{−1} (x − m_x)
          = A ( m_θ + Σ_θx Σ_xx^{−1} (x − m_x) ) + b
          = A θ̂ + b.

   II) With α = θ₁ + θ₂, it is straightforward to show that Σ_αx = Σ_θ₁x + Σ_θ₂x and m_α = m_θ₁ + m_θ₂. This yields

       α̂ = m_α + Σ_αx Σ_xx^{−1} (x − m_x)
          = (m_θ₁ + m_θ₂) + (Σ_θ₁x + Σ_θ₂x) Σ_xx^{−1} (x − m_x)
          = ( m_θ₁ + Σ_θ₁x Σ_xx^{−1} (x − m_x) ) + ( m_θ₂ + Σ_θ₂x Σ_xx^{−1} (x − m_x) )
          = θ̂₁ + θ̂₂.

12. Note that all quantities are zero mean. (Footnote 5.)

   (a) Note that the problem is completely symmetric in x and n. Therefore the solution for x yields the solution for n, mutatis mutandis. The optimal linear estimator is generally of the form a y + b, i.e., it is actually affine. However, it is easily shown that we can take b = 0, so that the optimal estimator is drawn from the family of unbiased estimators. Using the Hilbert-space framework, to find the optimal estimator x̂ = a y from within the space of linear functions of y, we invoke the projection theorem to obtain the orthogonality condition

       ⟨(x − a y), y⟩ = E{(x − a y) y} = 0.

   Footnote 5: But what if a particularly difficult examiner disallows this simplifying assumption?
   From this we obtain the scalar Wiener–Hopf equation

       E{y²} a = E{xy},

   from which the solution is determined to be

       x̂ = ( σ_x² / (σ_x² + σ_n²) ) y.

   From symmetry, we obviously have

       n̂ = ( σ_n² / (σ_x² + σ_n²) ) y.

   (b) Note that x, y, and n must all be jointly Gaussian. Again from symmetry, what holds for x must hold for n (and vice versa), mutatis mutandis. For the Gaussian case, the well-known optimal MMSE estimator (Footnote 6),

       x̂ = E{x|y} = m_x + Σ_xy Σ_yy^{−1} (y − m_y),

   is just the optimal linear estimator derived above.

   Footnote 6: Can you prove this at the board, fast, in real time?

   (c) Not all of the possible combinations make sense. Chapter 14 of Kay is a summary review of all of the estimation schemes. For the Gauss–Markov theorem, see Kay Chapter 6.

13. In class lecture and in the texts it is shown that

       x̂(y) = m_x + Σ_xy Σ_yy^{−1} (y − m_y).

   With z = A y + b, A invertible, we have

       m_z = A m_y + b ;
       z − m_z = A (y − m_y) ;
       Σ_xz = Σ_xy Aᵀ ;
       Σ_zz = A Σ_yy Aᵀ.

   Therefore (Footnote 7),

       x̂(z) = m_x + Σ_xz Σ_zz^{−1} (z − m_z)
             = m_x + Σ_xy Aᵀ A^{−T} Σ_yy^{−1} A^{−1} A (y − m_y)
             = m_x + Σ_xy Σ_yy^{−1} (y − m_y)
             = x̂(y).

   Footnote 7: Note the abuse of notation in the statement x̂(z) = x̂(y).

14. No solution provided.
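The scalar gains derived in problem 12(a) are easy to check by Monte Carlo simulation. The sketch below (ours, not part of the original solution) assumes Gaussian x and n for concreteness and estimates the Wiener–Hopf coefficient a = E{xy}/E{y²} empirically:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma_x, sigma_n = 1.5, 0.5
M = 200_000
x = rng.normal(0.0, sigma_x, M)
n = rng.normal(0.0, sigma_n, M)
y = x + n                                   # measurement model y = x + n

# Empirical Wiener-Hopf solution a = E{xy}/E{y^2} versus the closed form.
a_hat = np.mean(x * y) / np.mean(y * y)
a_theory = sigma_x**2 / (sigma_x**2 + sigma_n**2)
assert abs(a_hat - a_theory) < 0.01

# The symmetric estimates of x and n exactly decompose the measurement:
b_theory = sigma_n**2 / (sigma_x**2 + sigma_n**2)
assert np.allclose(a_theory * y + b_theory * y, y)
```

The second check reflects the symmetry remark in the solution: since a_theory + b_theory = 1, the two linear estimates x̂ and n̂ sum to the measurement y itself.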
More informationDiscrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 2
CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 2 Proofs Intuitively, the concept of proof should already be familiar We all like to assert things, and few of us
More informationEnhancing the SNR of the Fiber Optic Rotation Sensor using the LMS Algorithm
1 Enhancing the SNR of the Fiber Optic Rotation Sensor using the LMS Algorithm Hani Mehrpouyan, Student Member, IEEE, Department of Electrical and Computer Engineering Queen s University, Kingston, Ontario,
More informationMATH 4330/5330, Fourier Analysis Section 11, The Discrete Fourier Transform
MATH 433/533, Fourier Analysis Section 11, The Discrete Fourier Transform Now, instead of considering functions defined on a continuous domain, like the interval [, 1) or the whole real line R, we wish
More informationLecture L3 - Vectors, Matrices and Coordinate Transformations
S. Widnall 16.07 Dynamics Fall 2009 Lecture notes based on J. Peraire Version 2.0 Lecture L3 - Vectors, Matrices and Coordinate Transformations By using vectors and defining appropriate operations between
More information1 Review of Least Squares Solutions to Overdetermined Systems
cs4: introduction to numerical analysis /9/0 Lecture 7: Rectangular Systems and Numerical Integration Instructor: Professor Amos Ron Scribes: Mark Cowlishaw, Nathanael Fillmore Review of Least Squares
More informationThe Basics of Graphical Models
The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures
More informationMATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set.
MATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set. Vector space A vector space is a set V equipped with two operations, addition V V (x,y) x + y V and scalar
More informationEE 570: Location and Navigation
EE 570: Location and Navigation On-Line Bayesian Tracking Aly El-Osery 1 Stephen Bruder 2 1 Electrical Engineering Department, New Mexico Tech Socorro, New Mexico, USA 2 Electrical and Computer Engineering
More informationTTT4120 Digital Signal Processing Suggested Solution to Exam Fall 2008
Norwegian University of Science and Technology Department of Electronics and Telecommunications TTT40 Digital Signal Processing Suggested Solution to Exam Fall 008 Problem (a) The input and the input-output
More informationLS.6 Solution Matrices
LS.6 Solution Matrices In the literature, solutions to linear systems often are expressed using square matrices rather than vectors. You need to get used to the terminology. As before, we state the definitions
More informationCCNY. BME I5100: Biomedical Signal Processing. Linear Discrimination. Lucas C. Parra Biomedical Engineering Department City College of New York
BME I5100: Biomedical Signal Processing Linear Discrimination Lucas C. Parra Biomedical Engineering Department CCNY 1 Schedule Week 1: Introduction Linear, stationary, normal - the stuff biology is not
More informationLimits and Continuity
Math 20C Multivariable Calculus Lecture Limits and Continuity Slide Review of Limit. Side limits and squeeze theorem. Continuous functions of 2,3 variables. Review: Limits Slide 2 Definition Given a function
More informationLinear and quadratic Taylor polynomials for functions of several variables.
ams/econ 11b supplementary notes ucsc Linear quadratic Taylor polynomials for functions of several variables. c 010, Yonatan Katznelson Finding the extreme (minimum or maximum) values of a function, is
More information1 Norms and Vector Spaces
008.10.07.01 1 Norms and Vector Spaces Suppose we have a complex vector space V. A norm is a function f : V R which satisfies (i) f(x) 0 for all x V (ii) f(x + y) f(x) + f(y) for all x,y V (iii) f(λx)
More informationGaussian Processes in Machine Learning
Gaussian Processes in Machine Learning Carl Edward Rasmussen Max Planck Institute for Biological Cybernetics, 72076 Tübingen, Germany carl@tuebingen.mpg.de WWW home page: http://www.tuebingen.mpg.de/ carl
More information3.6. Partial Fractions. Introduction. Prerequisites. Learning Outcomes
Partial Fractions 3.6 Introduction It is often helpful to break down a complicated algebraic fraction into a sum of simpler fractions. For 4x + 7 example it can be shown that x 2 + 3x + 2 has the same
More informationMatrix Representations of Linear Transformations and Changes of Coordinates
Matrix Representations of Linear Transformations and Changes of Coordinates 01 Subspaces and Bases 011 Definitions A subspace V of R n is a subset of R n that contains the zero element and is closed under
More informationChristfried Webers. Canberra February June 2015
c Statistical Group and College of Engineering and Computer Science Canberra February June (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 829 c Part VIII Linear Classification 2 Logistic
More informationAdvanced Signal Processing and Digital Noise Reduction
Advanced Signal Processing and Digital Noise Reduction Saeed V. Vaseghi Queen's University of Belfast UK WILEY HTEUBNER A Partnership between John Wiley & Sons and B. G. Teubner Publishers Chichester New
More informationMaster s thesis tutorial: part III
for the Autonomous Compliant Research group Tinne De Laet, Wilm Decré, Diederik Verscheure Katholieke Universiteit Leuven, Department of Mechanical Engineering, PMA Division 30 oktober 2006 Outline General
More informationCommunication on the Grassmann Manifold: A Geometric Approach to the Noncoherent Multiple-Antenna Channel
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 48, NO. 2, FEBRUARY 2002 359 Communication on the Grassmann Manifold: A Geometric Approach to the Noncoherent Multiple-Antenna Channel Lizhong Zheng, Student
More informationParametric Statistical Modeling
Parametric Statistical Modeling ECE 275A Statistical Parameter Estimation Ken Kreutz-Delgado ECE Department, UC San Diego Ken Kreutz-Delgado (UC San Diego) ECE 275A SPE Version 1.1 Fall 2012 1 / 12 Why
More informationThe Method of Partial Fractions Math 121 Calculus II Spring 2015
Rational functions. as The Method of Partial Fractions Math 11 Calculus II Spring 015 Recall that a rational function is a quotient of two polynomials such f(x) g(x) = 3x5 + x 3 + 16x x 60. The method
More informationLecture 7: Finding Lyapunov Functions 1
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.243j (Fall 2003): DYNAMICS OF NONLINEAR SYSTEMS by A. Megretski Lecture 7: Finding Lyapunov Functions 1
More information5 Numerical Differentiation
D. Levy 5 Numerical Differentiation 5. Basic Concepts This chapter deals with numerical approximations of derivatives. The first questions that comes up to mind is: why do we need to approximate derivatives
More informationMathematics Course 111: Algebra I Part IV: Vector Spaces
Mathematics Course 111: Algebra I Part IV: Vector Spaces D. R. Wilkins Academic Year 1996-7 9 Vector Spaces A vector space over some field K is an algebraic structure consisting of a set V on which are
More informationMath 312 Homework 1 Solutions
Math 31 Homework 1 Solutions Last modified: July 15, 01 This homework is due on Thursday, July 1th, 01 at 1:10pm Please turn it in during class, or in my mailbox in the main math office (next to 4W1) Please
More informationBy choosing to view this document, you agree to all provisions of the copyright laws protecting it.
This material is posted here with permission of the IEEE Such permission of the IEEE does not in any way imply IEEE endorsement of any of Helsinki University of Technology's products or services Internal
More informationCURVE FITTING LEAST SQUARES APPROXIMATION
CURVE FITTING LEAST SQUARES APPROXIMATION Data analysis and curve fitting: Imagine that we are studying a physical system involving two quantities: x and y Also suppose that we expect a linear relationship
More informationPYKC Jan-7-10. Lecture 1 Slide 1
Aims and Objectives E 2.5 Signals & Linear Systems Peter Cheung Department of Electrical & Electronic Engineering Imperial College London! By the end of the course, you would have understood: Basic signal
More informationEconometrics Simple Linear Regression
Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight
More informationThe Bivariate Normal Distribution
The Bivariate Normal Distribution This is Section 4.7 of the st edition (2002) of the book Introduction to Probability, by D. P. Bertsekas and J. N. Tsitsiklis. The material in this section was not included
More informationOverview. Essential Questions. Precalculus, Quarter 4, Unit 4.5 Build Arithmetic and Geometric Sequences and Series
Sequences and Series Overview Number of instruction days: 4 6 (1 day = 53 minutes) Content to Be Learned Write arithmetic and geometric sequences both recursively and with an explicit formula, use them
More informationAnalysis of Mean-Square Error and Transient Speed of the LMS Adaptive Algorithm
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 48, NO. 7, JULY 2002 1873 Analysis of Mean-Square Error Transient Speed of the LMS Adaptive Algorithm Onkar Dabeer, Student Member, IEEE, Elias Masry, Fellow,
More informationAutocovariance and Autocorrelation
Chapter 3 Autocovariance and Autocorrelation If the {X n } process is weakly stationary, the covariance of X n and X n+k depends only on the lag k. This leads to the following definition of the autocovariance
More information