2. Linear Minimum Mean Squared Error Estimation

2.1. Linear minimum mean squared error estimators

Situation considered:
- A random sequence X(1), ..., X(M) whose realizations can be observed.
- A random variable Y which has to be estimated.
- We seek an estimate of Y with a linear estimator of the form:
    Ŷ = h_0 + Σ_{m=1..M} h_m X(m)
- A measure of the goodness of Ŷ is the mean squared error (MSE): E[(Ŷ − Y)²].

Covariance and variance of random variables: Let U and V denote two random variables with expectations µ_U ≜ E[U] and µ_V ≜ E[V].
- The covariance of U and V is defined to be:
    Σ_UV ≜ E[(U − µ_U)(V − µ_V)] = E[UV] − µ_U µ_V.
- The variance of U is defined to be:
    σ²_U ≜ E[(U − µ_U)²] = E[U²] − (µ_U)² = Σ_UU.

Let U ≜ [U(1), ..., U(M)]^T and V ≜ [V(1), ..., V(M')]^T denote two random vectors. The covariance matrix of U and V is defined as

    Σ_UV ≜ [ Σ_{U(1)V(1)}  ...  Σ_{U(1)V(M')} ]
           [     ...       ...       ...      ]
           [ Σ_{U(M)V(1)}  ...  Σ_{U(M)V(M')} ]

A direct way to obtain Σ_UV:
    Σ_UV = E[(U − µ_U)(V − µ_V)^T] = E[UV^T] − µ_U (µ_V)^T
where µ_U ≜ E[U] = [E[U(1)], ..., E[U(M)]]^T and µ_V ≜ E[V].

Examples: U = X ≜ [X(1), ..., X(M)]^T and V = Y. In the sequel we shall frequently make use of the following covariance matrix and vector:

(i)  Σ_XX = E[(X − µ_X)(X − µ_X)^T] = [ σ²_{X(1)}    ...  Σ_{X(1)X(M)} ]
                                      [     ...      ...      ...      ]
                                      [ Σ_{X(M)X(1)} ...  σ²_{X(M)}    ]

(ii) Σ_XY = E[(X − µ_X)(Y − µ_Y)] = [Σ_{X(1)Y}, ..., Σ_{X(M)Y}]^T

Linear minimum mean squared error estimator (LMSEE): An LMSEE of Y is a linear estimator, i.e. an estimator of the form Ŷ = h_0 + Σ_{m=1..M} h_m X(m), which minimizes the MSE E[(Ŷ − Y)²].

A linear estimator is entirely determined by the (M + 1)-dimensional vector h ≜ [h_0, ..., h_M]^T.

Orthogonality principle: A necessary condition for h = [h_0, ..., h_M]^T to be the coefficient vector of the LMSEE is that its entries fulfil the (M + 1) identities:

    E[Y − Ŷ] = E[Y − (h_0 + Σ_{m=1..M} h_m X(m))] = 0                                  (2.1a)
    E[(Y − Ŷ) X(j)] = E[(Y − (h_0 + Σ_{m=1..M} h_m X(m))) X(j)] = 0,  j = 1, ..., M    (2.1b)

Because the coefficient vector of the LMSEE minimizes E[(Ŷ − Y)²], its components must satisfy the set of equations:
    ∂/∂h_j E[(Ŷ − Y)²] = 0,  j = 0, ..., M.
Notice that the two expressions in (2.1) can be rewritten as:
    E[Y − Ŷ] = 0                             (2.2a)
    E[(Y − Ŷ) X(j)] = 0,  j = 1, ..., M      (2.2b)

Important consequences of the orthogonality principle:
    E[(Y − Ŷ) Ŷ] = 0                                    (2.3a)
    E[(Y − Ŷ)²] = E[Y²] − E[Ŷ²] = E[(Y − Ŷ) Y]          (2.3b)

Geometrical interpretation: Let U and V denote two random variables with finite second moment, i.e. E[U²] < ∞ and E[V²] < ∞. Then the expectation E[UV] can be viewed as the scalar or inner product of U and V. Within this interpretation:
- U and V are uncorrelated, i.e. E[UV] = 0, if and only if they are orthogonal;
- √E[U²] is the norm (length) of U.

Interpretation of both equations in (2.3):
- (2.3a): the estimation error Y − Ŷ and the estimate Ŷ are orthogonal.
- (2.3b): results from Pythagoras' theorem.

Computation of the coefficient vector of the LMSEE: The coefficients of the LMSEE satisfy the relationships:
    µ_Y = h_0 + Σ_{m=1..M} h_m µ_X(m) = h_0 + (h₋)^T µ_X
    Σ_XY = Σ_XX h₋
where h₋ ≜ [h_1, ..., h_M]^T and X ≜ [X(1), ..., X(M)]^T. Both identities follow by appropriately reformulating relations (2.1a) and (2.1b) and using a matrix notation for the latter one.
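The normal equations above can be checked numerically. The sketch below (all data and coefficient values are hypothetical choices) estimates the sample covariances, solves Σ_XY = Σ_XX h₋ and µ_Y = h_0 + (h₋)^T µ_X, and then verifies that the orthogonality conditions (2.2a) and (2.2b) hold for the resulting estimator:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 3, 200_000

# Correlated observations X(1), ..., X(M) and a target Y (hypothetical model)
X = rng.normal(size=(N, M)) @ np.array([[1.0, 0.5, 0.2],
                                        [0.0, 1.0, 0.3],
                                        [0.0, 0.0, 1.0]])
Y = 2.0 + X @ np.array([0.7, -0.4, 0.1]) + rng.normal(scale=0.5, size=N)

mu_X, mu_Y = X.mean(axis=0), Y.mean()
Sigma_XX = np.cov(X, rowvar=False)                  # sample covariance matrix of X
Sigma_XY = ((X - mu_X).T @ (Y - mu_Y)) / (N - 1)    # sample covariance vector of X and Y

h_minus = np.linalg.solve(Sigma_XX, Sigma_XY)       # Sigma_XY = Sigma_XX h-
h0 = mu_Y - h_minus @ mu_X                          # from mu_Y = h0 + h-^T mu_X

err = Y - (h0 + X @ h_minus)
print(abs(err.mean()))                # ~0, condition (2.2a)
print(np.max(np.abs(err @ X)) / N)    # ~0, conditions (2.2b)
```

Because both covariances are computed with the same divisor, the sample versions of (2.2a) and (2.2b) hold exactly up to floating-point error.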

Thus, provided (Σ_XX)⁻¹ exists, the coefficients of the LMSEE are given by:
    h₋ = (Σ_XX)⁻¹ Σ_XY                                            (2.4a)
    h_0 = µ_Y − (h₋)^T µ_X = µ_Y − (Σ_XY)^T (Σ_XX)⁻¹ µ_X          (2.4b)

Residual error using a LMSEE: The MSE resulting when using a LMSEE is
    E[(Ŷ − Y)²] = σ²_Y − (h₋)^T Σ_XY                              (2.5)
This follows from (2.3b): E[(Y − Ŷ)²] = E[(Y − Ŷ) Y] = E[Y²] − E[Ŷ Y].

Example: Linear prediction of a WSS process. Let Y(n) denote a WSS process with
- zero mean, i.e. E[Y(n)] = 0,
- autocorrelation function E[Y(n) Y(n + k)] = R_YY(k).
We seek the LMSEE for the present value of Y(n) based on the M past observations Y(n − 1), ..., Y(n − M) of the process. Hence,
- Y = Y(n)
- X(m) = Y(n − m), m = 1, ..., M, i.e. X = [Y(n − 1), ..., Y(n − M)]^T.
Because µ_Y = 0 and µ_X = 0, it follows from (2.4b) that h_0 = 0.

Computation of Σ_XY and Σ_XX:
    Σ_XY = [E[Y(n − 1) Y(n)], ..., E[Y(n − M) Y(n)]]^T = [R_YY(1), ..., R_YY(M)]^T

    Σ_XX = [ E[Y(n−1)²]       E[Y(n−1)Y(n−2)]  ...  E[Y(n−1)Y(n−M)] ]
           [ E[Y(n−2)Y(n−1)]  E[Y(n−2)²]       ...       ...        ]
           [      ...              ...         ...       ...        ]
           [ E[Y(n−M)Y(n−1)]       ...         ...  E[Y(n−M)²]      ]

         = [ R_YY(0)    R_YY(1)    R_YY(2)   ...  R_YY(M−1) ]
           [ R_YY(1)    R_YY(0)    R_YY(1)   ...  R_YY(M−2) ]
           [ R_YY(2)    R_YY(1)    R_YY(0)   ...  R_YY(M−3) ]
           [    ...       ...        ...     ...     ...    ]
           [ R_YY(M−1)  R_YY(M−2)  R_YY(M−3) ...  R_YY(0)   ]

2.2. Minimum mean squared error estimators

Conditional expectation: Let U and V denote two random variables. The conditional expectation of V given that U = u is observed is defined to be E[V | u] ≜ ∫ v p(v | u) dv. Notice that E[V | U] is a random variable. In the sequel we shall make use of the following important property of conditional expectations:
    E[E[V | U]] = E[V].
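The linear prediction example can be made concrete for an AR(1) process Y(n) = a Y(n−1) + e(n) with unit innovation variance, whose autocorrelation is R_YY(k) = a^|k| / (1 − a²) (a hypothetical choice of process). Solving (2.4a) must recover h = [a, 0, ..., 0], since only the most recent sample matters for an AR(1) process, and the residual MSE (2.5) must equal the innovation variance:

```python
import numpy as np

a, M = 0.8, 4
R = lambda k: a ** abs(k) / (1 - a ** 2)    # R_YY(k) of an AR(1) process, sigma_e^2 = 1

# Toeplitz matrix Sigma_XX with entries R_YY(i - j), and vector Sigma_XY
Sigma_XX = np.array([[R(i - j) for j in range(M)] for i in range(M)])
Sigma_XY = np.array([R(k) for k in range(1, M + 1)])   # [R_YY(1), ..., R_YY(M)]

h = np.linalg.solve(Sigma_XX, Sigma_XY)     # (2.4a); expect [a, 0, ..., 0]
mse = R(0) - h @ Sigma_XY                   # (2.5): sigma_Y^2 - h-^T Sigma_XY
print(h)       # close to [0.8, 0, 0, 0]
print(mse)     # close to 1.0, the innovation variance
```

The check works because a·R_YY(k) = R_YY(k + 1) for k ≥ 0, so h = [a, 0, ..., 0] solves the system exactly, and R_YY(0) − a·R_YY(1) = 1.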

Minimum mean squared error estimator (MMSEE): The MMSEE of Y based on the observation of X(1), ..., X(M) is of the form:
    Ỹ(X(1), ..., X(M)) = E[Y | X(1), ..., X(M)]
Hence if X(1) = x(1), ..., X(M) = x(M) is observed, then
    Ỹ(x(1), ..., x(M)) = E[Y | x(1), ..., x(M)] = ∫ y p(y | x(1), ..., x(M)) dy

Proof: Let Ŷ denote an arbitrary estimator. Then
    E[(Ŷ − Y)²] = E[((Ŷ − Ỹ) + (Ỹ − Y))²]
                = E[(Ŷ − Ỹ)²] + 2 E[(Ŷ − Ỹ)(Ỹ − Y)] + E[(Ỹ − Y)²]
                = E[(Ŷ − Ỹ)²] + E[(Ỹ − Y)²]
                ≥ E[(Ỹ − Y)²]
with equality if, and only if, Ŷ = Ỹ. We still have to prove that E[(Ŷ − Ỹ)(Ỹ − Y)] = 0; this follows by conditioning on X(1), ..., X(M) and using E[E[V | U]] = E[V].

Example: Multivariate Gaussian variables: [Y, X(1), ..., X(M)]^T ~ N(µ, Σ) with
- µ ≜ [µ_Y, µ_X(1), ..., µ_X(M)]^T
- Σ = [ σ²_Y    (Σ_XY)^T ]
      [ Σ_XY    Σ_XX     ]
From Equation (6.) in [Shanmugan] it follows that
    Ỹ = E[Y | X] = µ_Y + (Σ_XY)^T (Σ_XX)⁻¹ (X − µ_X)

Bivariate case: M = 1, X(1) = X:
- Σ_XX = σ²_X,
- Σ_XY = ρ σ_X σ_Y, where ρ is the correlation coefficient of Y and X.
In this case,
    Ỹ = E[Y | X] = µ_Y + (ρ σ_Y / σ_X)(X − µ_X)
       = (µ_Y − ρ σ_Y µ_X / σ_X) + (ρ σ_Y / σ_X) X
       =          h_0            +      h_1      X
We can observe that Ỹ is linear, i.e. the MMSEE coincides with the LMSEE, Ŷ = Ỹ, in the bivariate Gaussian case. The same holds in the general multivariate Gaussian case: whenever [Y, X(1), ..., X(M)]^T is a Gaussian random vector, the MMSEE is linear and therefore Ŷ = Ỹ.
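The bivariate Gaussian case can be verified by simulation: with the conditional-mean coefficients h_0 and h_1 derived above, the achieved MSE should approach σ²_Y (1 − ρ²). The parameter values below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
mu_X, mu_Y, s_X, s_Y, rho = 1.0, -2.0, 2.0, 3.0, 0.6

cov = np.array([[s_X**2,        rho * s_X * s_Y],
                [rho * s_X * s_Y, s_Y**2       ]])
X, Y = rng.multivariate_normal([mu_X, mu_Y], cov, size=500_000).T

h1 = rho * s_Y / s_X          # slope of E[Y | X]
h0 = mu_Y - h1 * mu_X         # intercept

mse = np.mean((Y - (h0 + h1 * X))**2)
print(mse)                    # close to sigma_Y^2 (1 - rho^2) = 9 * 0.64 = 5.76
```

Any other estimator, linear or not, would give a Monte Carlo MSE no smaller than this value (up to sampling error), since E[Y | X] is the MMSEE.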

2.3. Time-discrete Wiener filters

Problem: Estimation of a WSS random sequence Y(n) based on the observation of another sequence X(n). Without loss of generality we assume that E[Y(n)] = E[X(n)] = 0. The goodness of the estimator Ŷ(n) is described by the MSE E[(Ŷ(n) − Y(n))²]. We distinguish between two cases:
- Prediction: Ŷ(n) depends on one or several past observations of X(n) only, i.e. Ŷ(n) = Ŷ(X(n_1), X(n_2), ...) with n_1, n_2, ... < n.
- Filtering: Ŷ(n) depends on the present observation and/or one or many future observation(s) of X(n), i.e. Ŷ(n) = Ŷ(X(n_1), X(n_2), ...), where at least one n_i ≥ n. If all n_i ≤ n, the filter is causal; otherwise it is noncausal.

Typical application: WSS signal embedded in additive white noise, X(n) = Y(n) + W(n), where
- W(n) is a white noise sequence,
- Y(n) is a WSS process,
- Y(n) and W(n) are uncorrelated.
However, the theoretical treatment is more general, as shown below.

2.3.1. Noncausal Wiener filters

Linear Minimum Mean Squared Error Filter: We seek a linear filter
    Ŷ(n) = Σ_{m=−∞..∞} h(m) X(n − m) = h(n) * X(n)
which minimizes the MSE E[(Ŷ(n) − Y(n))²]. Such a filter exists. It is called a Wiener filter in honour of its inventor.

Orthogonality principle (time domain): The coefficients of a Wiener filter satisfy the conditions:
    E[(Y(n) − Ŷ(n)) X(n − k)] = E[(Y(n) − Σ_m h(m) X(n − m)) X(n − k)] = 0,  k = ..., −1, 0, 1, ...
It follows from these identities (see also (2.3)) that
    E[(Y(n) − Ŷ(n)) Ŷ(n)] = 0
    E[(Y(n) − Ŷ(n))²] = E[Y(n)²] − E[Ŷ(n)²] = E[(Y(n) − Ŷ(n)) Y(n)]

With the definitions R_XX(k) ≜ E[X(n) X(n + k)] and R_XY(k) ≜ E[X(n) Y(n + k)] we can recast the orthogonality conditions as follows:
    R_XY(k) = Σ_{m=−∞..∞} h(m) R_XX(k − m),  k = ..., −1, 0, 1, ...
or equivalently
    R_XY(k) = h(k) * R_XX(k)        (Wiener-Hopf equation)

Orthogonality principle (frequency domain):
    S_XY(f) = H(f) S_XX(f)
where S_XY(f) ≜ F{R_XY(k)}.

Transfer function of the Wiener filter:
    H(f) = S_XY(f) / S_XX(f)

MSE of the Wiener filter (time-domain formulation):
    E[(Y(n) − Ŷ(n))²] = σ²_Y − Σ_{m=−∞..∞} h(m) R_XY(m)
This follows from E[(Y(n) − Ŷ(n))²] = E[Y(n)²] − E[Ŷ(n) Y(n)].

MSE of the Wiener filter (frequency-domain formulation): E[(Y(n) − Ŷ(n))²] is the value p(0) of the sequence
    p(k) = R_YY(k) − Σ_{m=−∞..∞} h(m) R_YX(k − m) = R_YY(k) − h(k) * R_YX(k)
Hence, with P(f) ≜ F{p(k)},
    P(f) = S_YY(f) − H(f) S_YX(f) = S_YY(f) − |S_XY(f)|² / S_XX(f)
and we can rewrite the above identity as:
    E[(Y(n) − Ŷ(n))²] = p(0) = ∫_{−1/2}^{1/2} P(f) df = ∫_{−1/2}^{1/2} [S_YY(f) − |S_XY(f)|² / S_XX(f)] df

2.3.2. Causal Wiener filters

A. X(n) is a white noise. We first assume that X(n) is a white noise with unit variance, i.e. E[X(n) X(n + k)] = δ(k).

Derivation of the causal Wiener filter from the noncausal Wiener filter: Let us consider the noncausal Wiener filter
    Ŷ(n) = Σ_{m=−∞..∞} h(m) X(n − m)
whose transfer function is given by
    H(f) = S_XY(f) / S_XX(f) = S_XY(f).
Then, the causal Wiener filter Ŷ_c(n) results by cancelling the noncausal part of the noncausal Wiener filter:
    Ŷ_c(n) = Σ_{m=0..∞} h(m) X(n − m)
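For the typical application X(n) = Y(n) + W(n) with Y and W uncorrelated, we have S_XY(f) = S_YY(f) and S_XX(f) = S_YY(f) + σ²_W, so the transfer function and the MSE integral can be evaluated on a frequency grid. The sketch below uses a hypothetical AR(1) signal spectrum and white observation noise:

```python
import numpy as np

a, sW2 = 0.9, 0.5
f = np.linspace(-0.5, 0.5, 20_001)
df = f[1] - f[0]

S_YY = 1.0 / np.abs(1.0 - a * np.exp(-2j * np.pi * f))**2   # AR(1) power spectrum
S_XX = S_YY + sW2                                           # spectrum of X = Y + W

H = S_YY / S_XX                       # H(f) = S_XY(f) / S_XX(f), here in [0, 1]
P = S_YY - np.abs(S_YY)**2 / S_XX     # P(f) = S_YY(f) - |S_XY(f)|^2 / S_XX(f)

mse = np.sum(P) * df                  # E[(Y(n) - Yhat(n))^2], integral of P(f)
sigma_Y2 = np.sum(S_YY) * df          # R_YY(0)
print(mse > 0, mse < sigma_Y2)        # -> True True: filtering reduces the MSE
```

Note that P(f) simplifies to S_YY(f) σ²_W / S_XX(f): at frequencies where the signal dominates, H(f) ≈ 1 and little error remains; where the noise dominates, H(f) ≈ 0.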

Sketch of the proof: Ŷ(n) can be written as
    Ŷ(n) = Σ_{m=−∞..−1} h(m) X(n − m) + Σ_{m=0..∞} h(m) X(n − m)
Because X(n) is a white noise, the causal part V = Ŷ_c(n) and the noncausal part U = Ŷ(n) − Ŷ_c(n) of Ŷ(n) are orthogonal. It follows from this property that Y(n) − Ŷ_c(n) and Ŷ_c(n) are orthogonal, i.e. that Ŷ_c(n) minimizes the MSE within the class of linear causal estimators.

B. X(n) is an arbitrary WSS process whose spectrum satisfies the Paley-Wiener condition. Usually, the above truncation procedure to obtain Ŷ_c(n) does not apply because U and V are correlated and therefore not orthogonal in the general case.

Causal whitening filter: However, we can show (see the Spectral Decomposition Theorem below) that, provided S_XX(f) satisfies the Paley-Wiener condition (see below), X(n) can be converted into an equivalent white noise sequence Z(n) with unit variance by filtering it with an appropriate causal filter g(n):

    X(n) → [causal whitening filter g(n)] → Z(n),   E[Z(n) Z(n + k)] = δ(k)

This operation is called whitening and the filter g(n) is called a whitening filter. The operation is equivalent in the sense that there exists another causal filter g̃(n) so that X(n) = g̃(n) * Z(n):

    Z(n) → [g̃(n)] → X(n)

Notice that, with G(f) ≜ F{g(n)} and G̃(f) ≜ F{g̃(n)},
    |G(f)|² = 1 / S_XX(f)
    |G̃(f)|² = S_XX(f)
We shall see that a whitening filter exists such that
    G̃(f) = G(f)⁻¹          (2.6)

Causal Wiener filter: Making use of the result in Part A, the block diagram of the causal Wiener filter is

    X(n) → [causal whitening filter g(n)] → Z(n) → [causal part of F⁻¹{S_ZY(f)}] → Ŷ_c(n)

where S_ZY(f) is obtained from S_XY(f) according to S_ZY(f) = G(f) S_XY(f).
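Whitening is easy to illustrate for an AR(1) process (a hypothetical choice): its spectrum S_XX(f) = 1 / |1 − a e^{−j2πf}|² factors with G⁺(f) = 1 / (1 − a e^{−j2πf}), so the causal whitening filter 1/G⁺(f) is the two-tap FIR filter z(n) = x(n) − a x(n−1). The filtered sequence should then have a δ-like autocorrelation:

```python
import numpy as np

a = 0.7
rng = np.random.default_rng(2)
e = rng.normal(size=400_000)          # unit-variance innovations

# Generate the AR(1) process x(n) = a x(n-1) + e(n)
x = np.empty_like(e)
x[0] = e[0]
for n in range(1, len(e)):
    x[n] = a * x[n - 1] + e[n]

# Apply the causal whitening filter g(n) = {1, -a}
z = x[1:] - a * x[:-1]

def acorr(u, k):                      # sample autocorrelation E[u(n) u(n+k)]
    return np.mean(u[:-k] * u[k:]) if k else np.mean(u * u)

print(acorr(z, 0))                    # ~1 (unit variance)
print(acorr(z, 1), acorr(z, 5))       # ~0 for k != 0
```

Here the whitening filter simply recovers the innovations e(n), which is exactly the role of g(n) = g₁⁺(n) in the block diagram above.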

Hence, the block diagram of the causal Wiener filter is:

    X(n) → [whitening filter g(n)] → Z(n) → [causal part of F⁻¹{G(f) S_XY(f)}] → Ŷ_c(n)

Spectral Decomposition Theorem: Let S_XX(f) satisfy the so-called Paley-Wiener condition:
    ∫_{−1/2}^{1/2} log(S_XX(f)) df > −∞
Then S_XX(f) can be written as
    S_XX(f) = G⁺(f) G⁻(f)
with G⁺(f) and G⁻(f) satisfying
    |G⁺(f)| = |G⁻(f)| = √S_XX(f)
Moreover, the sequences
    g⁺(n) ≜ F⁻¹{G⁺(f)},   g₁⁺(n) ≜ F⁻¹{1 / G⁺(f)}
    g⁻(n) ≜ F⁻¹{G⁻(f)},   g₁⁻(n) ≜ F⁻¹{1 / G⁻(f)}
satisfy
    g⁺(n) = g₁⁺(n) = 0 for n < 0    (causal sequences)
    g⁻(n) = g₁⁻(n) = 0 for n > 0    (anticausal sequences)

Whitening filter (cont'd): The sought whitening filter used to obtain Z(n) is g(n) = g₁⁺(n), and g̃(n) = g⁺(n). It can be easily verified that both sequences satisfy the identities in (2.6).

2.3.3. Finite Wiener filters

Finite linear filter:
    Ŷ(n) = Σ_{m=1..M} h(m) X(n − m)

Wiener-Hopf equation: By applying the orthogonality principle we obtain the Wiener-Hopf system of equations:
    Σ_XY = Σ_XX h
where
    h ≜ [h(1), ..., h(M)]^T
    Σ_XY ≜ [R_XY(1), ..., R_XY(M)]^T
and
    Σ_XX = [ R_XX(0)      R_XX(1)      R_XX(2)  ...  R_XX(M − 1) ]
           [ R_XX(1)      R_XX(0)      R_XX(1)  ...  R_XX(M − 2) ]
           [ R_XX(2)      R_XX(1)      R_XX(0)  ...  R_XX(M − 3) ]
           [    ...          ...          ...   ...      ...     ]
           [ R_XX(M − 1)  R_XX(M − 2)     ...   ...  R_XX(0)     ]

Coefficient vector of the finite Wiener filter:
    h = (Σ_XX)⁻¹ Σ_XY
provided Σ_XX is invertible.
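A finite Wiener filter is a small linear solve. The sketch below applies it to the noisy observation X(n) = Y(n) + W(n), with a hypothetical AR(1) signal and white noise; then R_XX(k) = R_YY(k) + σ²_W δ(k) and R_XY(k) = R_YY(k), and the residual MSE σ²_Y − h^T Σ_XY should not increase as the filter length M grows:

```python
import numpy as np

a, sW2 = 0.8, 1.0
R_YY = lambda k: a ** abs(k) / (1 - a ** 2)    # autocorrelation of the AR(1) signal

def finite_wiener_mse(M):
    # Toeplitz Sigma_XX with R_XX(i - j) = R_YY(i - j) + sW2 delta(i - j)
    Sigma_XX = np.array([[R_YY(i - j) + (sW2 if i == j else 0.0)
                          for j in range(M)] for i in range(M)])
    Sigma_XY = np.array([R_YY(k) for k in range(1, M + 1)])   # R_XY(k) = R_YY(k)
    h = np.linalg.solve(Sigma_XX, Sigma_XY)                   # Wiener-Hopf solution
    return R_YY(0) - h @ Sigma_XY                             # residual MSE, cf. (2.5)

mses = [finite_wiener_mse(M) for M in (1, 2, 4, 8)]
print(mses)    # a positive, non-increasing sequence below sigma_Y^2
```

Since Σ_XX is Toeplitz, the system can also be solved in O(M²) with the Levinson-Durbin recursion instead of a general solver; for the small M used here a direct solve is simpler.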


More information

1 Short Introduction to Time Series

1 Short Introduction to Time Series ECONOMICS 7344, Spring 202 Bent E. Sørensen January 24, 202 Short Introduction to Time Series A time series is a collection of stochastic variables x,.., x t,.., x T indexed by an integer value t. The

More information

Signal Detection C H A P T E R 14 14.1 SIGNAL DETECTION AS HYPOTHESIS TESTING

Signal Detection C H A P T E R 14 14.1 SIGNAL DETECTION AS HYPOTHESIS TESTING C H A P T E R 4 Signal Detection 4. SIGNAL DETECTION AS HYPOTHESIS TESTING In Chapter 3 we considered hypothesis testing in the context of random variables. The detector resulting in the minimum probability

More information

Power Spectral Density

Power Spectral Density C H A P E R 0 Power Spectral Density INRODUCION Understanding how the strength of a signal is distributed in the frequency domain, relative to the strengths of other ambient signals, is central to the

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS Systems of Equations and Matrices Representation of a linear system The general system of m equations in n unknowns can be written a x + a 2 x 2 + + a n x n b a

More information

3 Orthogonal Vectors and Matrices

3 Orthogonal Vectors and Matrices 3 Orthogonal Vectors and Matrices The linear algebra portion of this course focuses on three matrix factorizations: QR factorization, singular valued decomposition (SVD), and LU factorization The first

More information

Principles of Digital Communication

Principles of Digital Communication Principles of Digital Communication Robert G. Gallager January 5, 2008 ii Preface: introduction and objectives The digital communication industry is an enormous and rapidly growing industry, roughly comparable

More information

Chapter 6. Linear Transformation. 6.1 Intro. to Linear Transformation

Chapter 6. Linear Transformation. 6.1 Intro. to Linear Transformation Chapter 6 Linear Transformation 6 Intro to Linear Transformation Homework: Textbook, 6 Ex, 5, 9,, 5,, 7, 9,5, 55, 57, 6(a,b), 6; page 7- In this section, we discuss linear transformations 89 9 CHAPTER

More information

MATHEMATICAL METHODS OF STATISTICS

MATHEMATICAL METHODS OF STATISTICS MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS

More information

The Algorithms of Speech Recognition, Programming and Simulating in MATLAB

The Algorithms of Speech Recognition, Programming and Simulating in MATLAB FACULTY OF ENGINEERING AND SUSTAINABLE DEVELOPMENT. The Algorithms of Speech Recognition, Programming and Simulating in MATLAB Tingxiao Yang January 2012 Bachelor s Thesis in Electronics Bachelor s Program

More information

Master s Theory Exam Spring 2006

Master s Theory Exam Spring 2006 Spring 2006 This exam contains 7 questions. You should attempt them all. Each question is divided into parts to help lead you through the material. You should attempt to complete as much of each problem

More information

Systems of Linear Equations

Systems of Linear Equations Systems of Linear Equations Beifang Chen Systems of linear equations Linear systems A linear equation in variables x, x,, x n is an equation of the form a x + a x + + a n x n = b, where a, a,, a n and

More information

Math 241, Exam 1 Information.

Math 241, Exam 1 Information. Math 241, Exam 1 Information. 9/24/12, LC 310, 11:15-12:05. Exam 1 will be based on: Sections 12.1-12.5, 14.1-14.3. The corresponding assigned homework problems (see http://www.math.sc.edu/ boylan/sccourses/241fa12/241.html)

More information

Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression

Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Saikat Maitra and Jun Yan Abstract: Dimension reduction is one of the major tasks for multivariate

More information

8. Linear least-squares

8. Linear least-squares 8. Linear least-squares EE13 (Fall 211-12) definition examples and applications solution of a least-squares problem, normal equations 8-1 Definition overdetermined linear equations if b range(a), cannot

More information

Numerical Analysis Lecture Notes

Numerical Analysis Lecture Notes Numerical Analysis Lecture Notes Peter J. Olver 5. Inner Products and Norms The norm of a vector is a measure of its size. Besides the familiar Euclidean norm based on the dot product, there are a number

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +

More information

MATH 4330/5330, Fourier Analysis Section 11, The Discrete Fourier Transform

MATH 4330/5330, Fourier Analysis Section 11, The Discrete Fourier Transform MATH 433/533, Fourier Analysis Section 11, The Discrete Fourier Transform Now, instead of considering functions defined on a continuous domain, like the interval [, 1) or the whole real line R, we wish

More information

1 Inner Products and Norms on Real Vector Spaces

1 Inner Products and Norms on Real Vector Spaces Math 373: Principles Techniques of Applied Mathematics Spring 29 The 2 Inner Product 1 Inner Products Norms on Real Vector Spaces Recall that an inner product on a real vector space V is a function from

More information

NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS

NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS TEST DESIGN AND FRAMEWORK September 2014 Authorized for Distribution by the New York State Education Department This test design and framework document

More information

3. Let A and B be two n n orthogonal matrices. Then prove that AB and BA are both orthogonal matrices. Prove a similar result for unitary matrices.

3. Let A and B be two n n orthogonal matrices. Then prove that AB and BA are both orthogonal matrices. Prove a similar result for unitary matrices. Exercise 1 1. Let A be an n n orthogonal matrix. Then prove that (a) the rows of A form an orthonormal basis of R n. (b) the columns of A form an orthonormal basis of R n. (c) for any two vectors x,y R

More information

Since it is necessary to consider the ability of the lter to predict many data over a period of time a more meaningful metric is the expected value of

Since it is necessary to consider the ability of the lter to predict many data over a period of time a more meaningful metric is the expected value of Chapter 11 Tutorial: The Kalman Filter Tony Lacey. 11.1 Introduction The Kalman lter ë1ë has long been regarded as the optimal solution to many tracing and data prediction tass, ë2ë. Its use in the analysis

More information

CAPM, Arbitrage, and Linear Factor Models

CAPM, Arbitrage, and Linear Factor Models CAPM, Arbitrage, and Linear Factor Models CAPM, Arbitrage, Linear Factor Models 1/ 41 Introduction We now assume all investors actually choose mean-variance e cient portfolios. By equating these investors

More information

Computational Foundations of Cognitive Science

Computational Foundations of Cognitive Science Computational Foundations of Cognitive Science Lecture 15: Convolutions and Kernels Frank Keller School of Informatics University of Edinburgh keller@inf.ed.ac.uk February 23, 2010 Frank Keller Computational

More information

Chapter 3: The Multiple Linear Regression Model

Chapter 3: The Multiple Linear Regression Model Chapter 3: The Multiple Linear Regression Model Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans November 23, 2013 Christophe Hurlin (University of Orléans) Advanced Econometrics

More information

How To Prove The Dirichlet Unit Theorem

How To Prove The Dirichlet Unit Theorem Chapter 6 The Dirichlet Unit Theorem As usual, we will be working in the ring B of algebraic integers of a number field L. Two factorizations of an element of B are regarded as essentially the same if

More information

Trend and Seasonal Components

Trend and Seasonal Components Chapter 2 Trend and Seasonal Components If the plot of a TS reveals an increase of the seasonal and noise fluctuations with the level of the process then some transformation may be necessary before doing

More information

Analysis/resynthesis with the short time Fourier transform

Analysis/resynthesis with the short time Fourier transform Analysis/resynthesis with the short time Fourier transform summer 2006 lecture on analysis, modeling and transformation of audio signals Axel Röbel Institute of communication science TU-Berlin IRCAM Analysis/Synthesis

More information

THREE DIMENSIONAL GEOMETRY

THREE DIMENSIONAL GEOMETRY Chapter 8 THREE DIMENSIONAL GEOMETRY 8.1 Introduction In this chapter we present a vector algebra approach to three dimensional geometry. The aim is to present standard properties of lines and planes,

More information

Vector Spaces 4.4 Spanning and Independence

Vector Spaces 4.4 Spanning and Independence Vector Spaces 4.4 and Independence October 18 Goals Discuss two important basic concepts: Define linear combination of vectors. Define Span(S) of a set S of vectors. Define linear Independence of a set

More information

Internet Appendix to CAPM for estimating cost of equity capital: Interpreting the empirical evidence

Internet Appendix to CAPM for estimating cost of equity capital: Interpreting the empirical evidence Internet Appendix to CAPM for estimating cost of equity capital: Interpreting the empirical evidence This document contains supplementary material to the paper titled CAPM for estimating cost of equity

More information

4.5 Linear Dependence and Linear Independence

4.5 Linear Dependence and Linear Independence 4.5 Linear Dependence and Linear Independence 267 32. {v 1, v 2 }, where v 1, v 2 are collinear vectors in R 3. 33. Prove that if S and S are subsets of a vector space V such that S is a subset of S, then

More information

The Ideal Class Group

The Ideal Class Group Chapter 5 The Ideal Class Group We will use Minkowski theory, which belongs to the general area of geometry of numbers, to gain insight into the ideal class group of a number field. We have already mentioned

More information