2. Linear Minimum Mean Squared Error Estimation
2.1. Linear minimum mean squared error estimators

Situation considered:
- A random sequence X(1), …, X(M) whose realizations can be observed.
- A random variable Y which has to be estimated.
- We seek an estimate of Y with a linear estimator of the form:

  Ŷ = h_0 + Σ_{m=1}^{M} h_m X(m)

- A measure of the goodness of Ŷ is the mean squared error (MSE): E[(Ŷ − Y)²].

Covariance and variance of random variables: Let U and V denote two random variables with expectations µ_U = E[U] and µ_V = E[V].
- The covariance of U and V is defined to be:

  Σ_UV = E[(U − µ_U)(V − µ_V)] = E[UV] − µ_U µ_V

- The variance of U is defined to be:

  σ_U² = E[(U − µ_U)²] = E[U²] − (µ_U)² = Σ_UU

Let U = [U(1), …, U(M)]^T and V = [V(1), …, V(M')]^T denote two random vectors. The covariance matrix of U and V is defined as

  Σ_UV = [ Σ_{U(1)V(1)}  …  Σ_{U(1)V(M')} ]
         [      ⋮                ⋮        ]
         [ Σ_{U(M)V(1)}  …  Σ_{U(M)V(M')} ]

A direct way to obtain Σ_UV:

  Σ_UV = E[(U − µ_U)(V − µ_V)^T] = E[U V^T] − µ_U (µ_V)^T

where µ_U = E[U] = [E[U(1)], …, E[U(M)]]^T and µ_V = E[V] is defined analogously.

Examples: U = X = [X(1), …, X(M)]^T and V = Y. In the sequel we shall frequently make use of the following covariance matrix and vector:

(i)  Σ_XX = E[(X − µ_X)(X − µ_X)^T] = [ σ²_{X(1)}     …  Σ_{X(1)X(M)} ]
                                      [      ⋮               ⋮       ]
                                      [ Σ_{X(M)X(1)}  …  σ²_{X(M)}   ]

(ii) Σ_XY = E[(X − µ_X)(Y − µ_Y)] = [ Σ_{X(1)Y}, …, Σ_{X(M)Y} ]^T

Linear minimum mean squared error estimator (LMMSEE): A LMMSEE of Y is a linear estimator, i.e. an estimator of the form Ŷ = h_0 + Σ_{m=1}^{M} h_m X(m), which minimizes the MSE E[(Ŷ − Y)²].
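As a quick numerical check of the two equivalent covariance formulas above, the following Python sketch (a toy construction of my own, not from the notes: the matrices A, B and the latent factors S are arbitrary) verifies on sample data that E[(U − µ_U)(V − µ_V)^T] and E[U V^T] − µ_U (µ_V)^T agree:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy jointly distributed vectors: U is 3-dimensional, V is 2-dimensional,
# both driven by common latent factors S so that they are correlated.
N = 200_000
A = rng.normal(size=(3, 5))
B = rng.normal(size=(2, 5))
S = rng.normal(size=(5, N))
U = A @ S + 1.0          # shift the means away from zero on purpose
V = B @ S - 0.5

mu_U = U.mean(axis=1, keepdims=True)
mu_V = V.mean(axis=1, keepdims=True)

# Sample versions of the two equivalent formulas for Sigma_UV:
Sigma_direct = (U - mu_U) @ (V - mu_V).T / N   # E[(U - mu_U)(V - mu_V)^T]
Sigma_short = U @ V.T / N - mu_U @ mu_V.T      # E[U V^T] - mu_U mu_V^T

print(np.max(np.abs(Sigma_direct - Sigma_short)))   # ~ 0 (rounding error only)
```

The two sample expressions are algebraically identical, so they match to floating-point precision regardless of the data.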
A linear estimator is entirely determined by the (M + 1)-dimensional vector h = [h_0, …, h_M]^T.

Orthogonality principle: A necessary condition for h = [h_0, …, h_M]^T to be the coefficient vector of the LMMSEE is that its entries fulfil the (M + 1) identities:

  E[Y − Ŷ] = E[Y − (h_0 + Σ_{m=1}^{M} h_m X(m))] = 0                                  (2.1a)
  E[(Y − Ŷ) X(j)] = E[(Y − (h_0 + Σ_{m=1}^{M} h_m X(m))) X(j)] = 0,  j = 1, …, M      (2.1b)

Because the coefficient vector of the LMMSEE minimizes E[(Ŷ − Y)²], its components must satisfy the set of equations:

  ∂/∂h_j E[(Ŷ − Y)²] = 0,  j = 0, …, M.

Notice that the two expressions in (2.1) can be rewritten as:

  E[Y] = E[Ŷ]                                    (2.2a)
  E[(Y − Ŷ) X(j)] = 0,  j = 1, …, M              (2.2b)

Important consequences of the orthogonality principle:

  E[(Y − Ŷ) Ŷ] = 0                               (2.3a)
  E[(Y − Ŷ)²] = E[Y²] − E[Ŷ²] = E[(Y − Ŷ) Y]     (2.3b)

Geometrical interpretation: Let U and V denote two random variables with finite second moment, i.e. E[U²] < ∞ and E[V²] < ∞. Then the expectation E[UV] can be viewed as the scalar or inner product of U and V. Within this interpretation:
- U and V are uncorrelated, i.e. E[UV] = 0, if and only if they are orthogonal,
- √E[U²] is the norm (length) of U.

[Figure: right triangle formed by Y, the estimate Ŷ and the error Y − Ŷ.]

Interpretation of both equations in (2.3):
- (2.3a): the estimation error Y − Ŷ and the estimate Ŷ are orthogonal.
- (2.3b): results from Pythagoras' Theorem.

Computation of the coefficient vector of the LMMSEE: The coefficients of the LMMSEE satisfy the relationships:

  µ_Y = h_0 + Σ_{m=1}^{M} h_m µ_{X(m)} = h_0 + (h_−)^T µ_X
  Σ_XY = Σ_XX h_−

where h_− = [h_1, …, h_M]^T and X = [X(1), …, X(M)]^T. Both identities follow by appropriately reformulating relations (2.1a) and (2.1b) and using a matrix notation for the latter one.
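The orthogonality relations (2.2) and the Pythagoras identity (2.3b) can be checked numerically. In the sketch below (my own toy data, not from the notes: the generating coefficients are arbitrary), the LMMSEE is obtained as a least-squares fit of Y on [1, X(1), …, X(M)], which is the sample analogue of solving the conditions (2.1):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 100_000, 3

# Toy data: Y depends linearly on X(1..M) plus independent noise
# (the coefficients 0.5, -1.0, 2.0 and the offset 0.3 are arbitrary).
X = rng.normal(size=(N, M))
Y = X @ np.array([0.5, -1.0, 2.0]) + 0.3 + rng.normal(size=N)

# Sample LMMSEE: least-squares fit of Y on [1, X(1), ..., X(M)].
A = np.column_stack([np.ones(N), X])
h = np.linalg.lstsq(A, Y, rcond=None)[0]
Y_hat = A @ h
err = Y - Y_hat

print(np.mean(err))                  # ~ 0, cf. (2.2a)
print(np.max(np.abs(err @ X)) / N)   # ~ 0, cf. (2.2b): error orthogonal to X(j)
# Pythagoras (2.3b): E[(Y - Yhat)^2] = E[Y^2] - E[Yhat^2]
print(np.mean(err ** 2), np.mean(Y ** 2) - np.mean(Y_hat ** 2))
```

The least-squares residual is orthogonal to every column of A by construction, so (2.2) and (2.3b) hold here up to rounding, not merely up to sampling error.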
Thus, provided (Σ_XX)^{−1} exists, the coefficients of the LMMSEE are given by:

  h_− = (Σ_XX)^{−1} Σ_XY                                         (2.4a)
  h_0 = µ_Y − (h_−)^T µ_X = µ_Y − (Σ_XY)^T (Σ_XX)^{−1} µ_X       (2.4b)

Residual error using a LMMSEE: The MSE resulting when using a LMMSEE is

  E[(Ŷ − Y)²] = σ_Y² − (h_−)^T Σ_XY                              (2.5)

This follows from E[(Y − Ŷ)²] = E[(Y − Ŷ) Y] = E[Y²] − E[Ŷ Y].

Example: Linear prediction of a WSS process. Let Y(n) denote a WSS process with
- zero mean, i.e. E[Y(n)] = 0,
- autocorrelation function E[Y(n) Y(n + k)] = R_YY(k).

We seek the LMMSEE for the present value of Y(n) based on the M past observations Y(n − 1), …, Y(n − M) of the process. Hence,
- Y = Y(n)
- X(m) = Y(n − m), m = 1, …, M, i.e. X = [Y(n − 1), …, Y(n − M)]^T

Because µ_Y = 0 and µ_X = 0, it follows from (2.4b) that h_0 = 0.

Computation of Σ_XY and Σ_XX:

  Σ_XY = [E[Y(n − 1) Y(n)], …, E[Y(n − M) Y(n)]]^T = [R_YY(1), …, R_YY(M)]^T

  Σ_XX = [ E[Y(n−1)²]       E[Y(n−1)Y(n−2)]  …  ]   [ R_YY(0)    R_YY(1)    …  R_YY(M−1) ]
         [ E[Y(n−2)Y(n−1)]  E[Y(n−2)²]       …  ] = [ R_YY(1)    R_YY(0)    …  R_YY(M−2) ]
         [      ⋮               ⋮               ]   [    ⋮           ⋮              ⋮    ]
         [ E[Y(n−M)Y(n−1)]      …               ]   [ R_YY(M−1)  R_YY(M−2)  …  R_YY(0)  ]

2.2. Minimum mean squared error estimators

Conditional expectation: Let U and V denote two random variables. The conditional expectation of V given that U = u is observed is defined to be

  E[V | u] = ∫ v p(v | u) dv.

Notice that E[V | U] is a random variable. In the sequel we shall make use of the following important property of conditional expectations: E[E[V | U]] = E[V].
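For a concrete instance of the prediction example, take Y(n) to be an AR(1) process Y(n) = a·Y(n−1) + W(n) with unit-variance white W(n), for which R_YY(k) = a^|k| / (1 − a²). This specific process is my illustrative choice, not part of the notes. Solving Σ_XX h_− = Σ_XY then recovers the known optimal one-step predictor h_− = [a, 0, …, 0]^T:

```python
import numpy as np

# AR(1): Y(n) = a*Y(n-1) + W(n), unit-variance white W(n)
# => R_YY(k) = a^|k| / (1 - a^2).
a, M = 0.8, 4
R = lambda k: a ** abs(k) / (1 - a ** 2)

# Sigma_XX is Toeplitz with entries R_YY(|i - j|); Sigma_XY = [R_YY(1..M)]^T.
idx = np.abs(np.arange(M)[:, None] - np.arange(M)[None, :])
Sigma_XX = np.array([R(k) for k in range(M)])[idx]
Sigma_XY = np.array([R(k) for k in range(1, M + 1)])

h = np.linalg.solve(Sigma_XX, Sigma_XY)   # (2.4a) with h_0 = 0
mse = R(0) - h @ Sigma_XY                 # residual error (2.5)

print(np.round(h, 6))   # h ~ [0.8, 0, 0, 0]: only Y(n-1) matters for AR(1)
print(round(mse, 6))    # ~ 1.0, the variance of the driving noise W(n)
```

The residual MSE equals the driving-noise variance, which is the best any predictor can do for this process.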
Minimum mean squared error estimator (MMSEE): The MMSEE of Y based on the observation of X(1), …, X(M) is of the form:

  Y*(X(1), …, X(M)) = E[Y | X(1), …, X(M)]

Hence, if X(1) = x(1), …, X(M) = x(M) is observed, then

  Y*(x(1), …, x(M)) = E[Y | x(1), …, x(M)] = ∫ y p(y | x(1), …, x(M)) dy

Let Ŷ denote an arbitrary estimator. Then,

  E[(Ŷ − Y)²] = E[((Ŷ − Y*) + (Y* − Y))²]
              = E[(Ŷ − Y*)²] + 2 E[(Ŷ − Y*)(Y* − Y)] + E[(Y* − Y)²]

We still have to prove that E[(Ŷ − Y*)(Y* − Y)] = 0. This follows from the property E[E[V | U]] = E[V]: conditioned on X(1), …, X(M), the factor Ŷ − Y* is fixed and E[Y* − Y | X(1), …, X(M)] = 0. Thus,

  E[(Ŷ − Y)²] = E[(Ŷ − Y*)²] + E[(Y* − Y)²] ≥ E[(Y* − Y)²]

with equality if, and only if, Ŷ = Y*.

Example: Multivariate Gaussian variables: [Y, X(1), …, X(M)]^T ~ N(µ, Σ) with

  µ = [µ_Y, µ_{X(1)}, …, µ_{X(M)}]^T,   Σ = [ σ_Y²   (Σ_XY)^T ]
                                            [ Σ_XY   Σ_XX     ]

From Equation (6.…) in [Shanmugan] it follows that

  Y* = E[Y | X] = µ_Y + (Σ_XY)^T (Σ_XX)^{−1} (X − µ_X)

Bivariate case: M = 1, X(1) = X,
- Σ_XX = σ_X²,
- Σ_XY = ρ σ_X σ_Y, where ρ is the correlation coefficient of Y and X.

In this case,

  Y* = E[Y | X] = µ_Y + (ρ σ_Y / σ_X)(X − µ_X)
     = (µ_Y − (ρ σ_Y / σ_X) µ_X) + (ρ σ_Y / σ_X) X
     =            h_0            +      h_1     X

We can observe that Y* is linear, i.e. Y* is the LMMSEE (Ŷ = Y*), in the bivariate case. This is also true in the general multivariate Gaussian case. In fact, Ŷ = Y* if, and only if, [Y, X(1), …, X(M)]^T is a Gaussian random vector.
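A numerical illustration of the bivariate Gaussian case (the parameter values below are arbitrary choices of mine): the empirical conditional mean of Y given X ≈ x₀, estimated by averaging Y over samples falling in a narrow band around x₀, should match the line h_0 + h_1 x₀:

```python
import numpy as np

rng = np.random.default_rng(2)

# Bivariate Gaussian with arbitrary illustrative parameters.
mu_X, mu_Y = 2.0, -1.0
s_X, s_Y, rho = 1.5, 0.7, 0.6
cov = [[s_X ** 2, rho * s_X * s_Y], [rho * s_X * s_Y, s_Y ** 2]]
X, Y = rng.multivariate_normal([mu_X, mu_Y], cov, size=500_000).T

# LMMSEE coefficients, cf. (2.4): h1 = rho*s_Y/s_X, h0 = mu_Y - h1*mu_X.
h1 = rho * s_Y / s_X
h0 = mu_Y - h1 * mu_X

# Empirical conditional mean E[Y | X ~ x0]: average Y over samples near x0.
x0 = 3.0
band = np.abs(X - x0) < 0.05
print(Y[band].mean(), "vs", h0 + h1 * x0)   # the two agree closely
```

Repeating this for several x₀ traces out the conditional-mean line, confirming that the MMSEE is linear here.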
2.3. Time-discrete Wiener filters

Problem: Estimation of a WSS random sequence Y(n) based on the observation of another sequence X(n). Without loss of generality we assume that E[Y(n)] = E[X(n)] = 0. The goodness of the estimator Ŷ(n) is described by the MSE E[(Ŷ(n) − Y(n))²].

We distinguish between two cases:
- Prediction: Ŷ(n) depends on one or several past observations of X(n) only, i.e. Ŷ(n) = Ŷ(X(n_1), X(n_2), …) with n_1, n_2, … < n.
- Filtering: Ŷ(n) depends on the present observation and/or one or many future observation(s) of X(n), i.e. Ŷ(n) = Ŷ(X(n_1), X(n_2), …) where at least one n_i ≥ n. If all n_i ≤ n, the filter is causal; otherwise it is noncausal.

[Figure: time axes illustrating which samples x(n) enter the estimate y(n) under prediction and under filtering.]

Typical application: WSS signal embedded in additive white noise, X(n) = Y(n) + W(n), where
- W(n) is a white noise sequence,
- Y(n) is a WSS process,
- Y(n) and W(n) are uncorrelated.

However, the theoretical treatment is more general, as shown below.

2.3.1. Noncausal Wiener filters

Linear minimum mean squared error filter: We seek a linear filter

  Ŷ(n) = Σ_{m=−∞}^{∞} h(m) X(n − m) = h(n) * X(n)

which minimizes the MSE E[(Ŷ(n) − Y(n))²]. Such a filter exists. It is called a Wiener filter in honour of its inventor.

Orthogonality principle (time domain): The coefficients of a Wiener filter satisfy the conditions:

  E[(Y(n) − Ŷ(n)) X(n − k)] = E[(Y(n) − Σ_{m=−∞}^{∞} h(m) X(n − m)) X(n − k)] = 0,  k = …, −1, 0, 1, …

It follows from these identities (see also (2.3)) that

  E[(Y(n) − Ŷ(n)) Ŷ(n)] = 0
  E[(Y(n) − Ŷ(n))²] = E[Y(n)²] − E[Ŷ(n)²] = E[(Y(n) − Ŷ(n)) Y(n)]

With the definitions R_XX(k) = E[X(n) X(n + k)] and R_XY(k) = E[X(n) Y(n + k)] we can recast the orthogonality conditions as follows:

  R_XY(k) = Σ_{m=−∞}^{∞} h(m) R_XX(k − m),  k = …, −1, 0, 1, …
  R_XY(k) = h(k) * R_XX(k)                      (Wiener-Hopf equation)
Orthogonality principle (frequency domain):

  S_XY(f) = H(f) S_XX(f)

where S_XY(f) = F{R_XY(k)}.

Transfer function of the Wiener filter:

  H(f) = S_XY(f) / S_XX(f)

MSE of the Wiener filter (time-domain formulation):

  E[(Y(n) − Ŷ(n))²] = E[Y(n)²] − E[Ŷ(n) Y(n)] = σ_Y² − Σ_{m=−∞}^{∞} h(m) R_XY(m)

MSE of the Wiener filter (frequency-domain formulation): E[(Y(n) − Ŷ(n))²] is the value p(0) of the sequence

  p(k) = R_YY(k) − Σ_{m=−∞}^{∞} h(m) R_YX(k − m) = R_YY(k) − h(k) * R_YX(k)

Hence, with

  P(f) = S_YY(f) − H(f) S_YX(f) = S_YY(f) − |S_XY(f)|² / S_XX(f)

we can rewrite the above identity as:

  E[(Y(n) − Ŷ(n))²] = p(0) = ∫_{−1/2}^{1/2} P(f) df = ∫_{−1/2}^{1/2} [ S_YY(f) − |S_XY(f)|² / S_XX(f) ] df

2.3.2. Causal Wiener filters

A. X(n) is a white noise. We first assume that X(n) is a white noise with unit variance, i.e. E[X(n) X(n + k)] = δ(k).

Derivation of the causal Wiener filter from the noncausal Wiener filter: Let us consider the noncausal Wiener filter Ŷ(n) = Σ_{m=−∞}^{∞} h(m) X(n − m), whose transfer function is given by

  H(f) = S_XY(f) / S_XX(f) = S_XY(f).

Then the causal Wiener filter Ŷ_c(n) results by cancelling the noncausal part of the noncausal Wiener filter:

  Ŷ_c(n) = Σ_{m=0}^{∞} h(m) X(n − m)
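The transfer-function formula can be tried out on the signal-in-white-noise application: for X(n) = Y(n) + W(n) with Y and W uncorrelated, S_XY(f) = S_YY(f) and S_XX(f) = S_YY(f) + σ_W², so H(f) = S_YY(f) / (S_YY(f) + σ_W²). The Python sketch below (an AR(1) signal, my illustrative choice, not from the notes) applies this filter via the FFT as a circular approximation of the noncausal Wiener filter:

```python
import numpy as np

rng = np.random.default_rng(3)
N, a, s_w = 1 << 16, 0.9, 1.0

# WSS signal: AR(1) Y(n) = a*Y(n-1) + e(n); observation X = Y + white noise.
e = rng.normal(size=N)
Y = np.zeros(N)
for n in range(1, N):
    Y[n] = a * Y[n - 1] + e[n]
X = Y + s_w * rng.normal(size=N)

# AR(1) spectrum S_YY(f) = 1/|1 - a e^{-j2pi f}|^2; Wiener transfer function
# H(f) = S_YY(f)/(S_YY(f) + s_w^2) since S_XY = S_YY and S_XX = S_YY + s_w^2.
f = np.fft.fftfreq(N)
S_YY = 1.0 / np.abs(1.0 - a * np.exp(-2j * np.pi * f)) ** 2
H = S_YY / (S_YY + s_w ** 2)

Y_hat = np.fft.ifft(H * np.fft.fft(X)).real   # circular approximation

mse_raw = np.mean((X - Y) ** 2)       # ~ s_w^2 = 1: use X itself as estimate
mse_filt = np.mean((Y_hat - Y) ** 2)  # clearly smaller after Wiener filtering
print(mse_raw, mse_filt)
```

The filter attenuates frequencies where the noise dominates (H(f) ≈ 0) and passes those where the signal dominates (H(f) ≈ 1), which is exactly what H(f) = S_YY/(S_YY + σ_W²) expresses.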
Sketch of the proof: Ŷ(n) can be written as

  Ŷ(n) = Σ_{m=−∞}^{−1} h(m) X(n − m) + Σ_{m=0}^{∞} h(m) X(n − m)

Because X(n) is a white noise, the causal part V = Ŷ_c(n) and the noncausal part U = Ŷ(n) − Ŷ_c(n) of Ŷ(n) are orthogonal. It follows from this property that Y(n) − Ŷ_c(n) and Ŷ_c(n) are orthogonal, i.e. that Ŷ_c(n) minimizes the MSE within the class of linear causal estimators.

[Figure: right triangle formed by Y(n), Ŷ_c(n) and U = Ŷ(n) − Ŷ_c(n).]

B. X(n) is an arbitrary WSS process whose spectrum satisfies the Paley-Wiener condition. Usually, the above truncation procedure to obtain Ŷ_c(n) does not apply, because U and V are correlated and therefore not orthogonal in the general case.

Causal whitening filter: However, we can show (see the Spectral Decomposition Theorem below) that, provided S_XX(f) satisfies the Paley-Wiener condition (see below), X(n) can be converted into an equivalent white noise sequence Z(n) with unit variance by filtering it with an appropriate causal filter g(n):

  X(n) → [Causal Whitening Filter g(n)] → Z(n),   E[Z(n) Z(n + k)] = δ(k)

This operation is called whitening and the filter g(n) is called a whitening filter. "Equivalent" means that there exists another causal filter g̃(n) so that X(n) = g̃(n) * Z(n):

  Z(n) → [g̃(n)] → X(n)

Notice that if G(f) = F{g(n)} and G̃(f) = F{g̃(n)}, then

  |G(f)|² = 1 / S_XX(f),   |G̃(f)|² = S_XX(f)

We shall see that a whitening filter exists such that

  G̃(f) = G(f)^{−1}                                                  (2.6)

Causal Wiener filter: Making use of the result in Part A, the block diagram of the causal Wiener filter is:

  X(n) → [Causal Whitening Filter g(n)] → Z(n) → [Causal part of F^{−1}{S_ZY(f)}] → Ŷ_c(n)

S_ZY(f) is obtained from S_XY(f) according to S_ZY(f) = G(f)* S_XY(f).
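For an AR(1) observation process X(n) = a·X(n−1) + Z(n) driven by unit-variance white noise (my illustrative choice, not from the notes), the spectral factorization is available in closed form: G(f)⁺ = 1/(1 − a e^{−j2πf}), so the causal whitening filter g(n) = F^{−1}{1/G(f)⁺} is simply the FIR filter with taps (1, −a). The sketch below confirms that the filter output is white with unit variance:

```python
import numpy as np

rng = np.random.default_rng(4)
N, a = 200_000, 0.8

# AR(1) observation process X(n) = a*X(n-1) + Z(n), unit-variance white Z.
z = rng.normal(size=N)
X = np.zeros(N)
for n in range(1, N):
    X[n] = a * X[n - 1] + z[n]

# Causal whitening filter g(n) = F^{-1}{1/G(f)+}, with G(f)+ = 1/(1 - a e^{-j2pi f}):
# an FIR filter with taps (1, -a).
Z_rec = X[1:] - a * X[:-1]

# Sample autocorrelation of the whitened sequence: ~1 at lag 0, ~0 elsewhere.
for k in range(4):
    r = np.mean(Z_rec[: Z_rec.size - k] * Z_rec[k:])
    print(k, round(r, 3))
```

For this process the whitening is exact: X(n) − a·X(n−1) reproduces the driving noise Z(n) sample for sample.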
Hence, the block diagram of the causal Wiener filter is:

  X(n) → [Whitening Filter g(n)] → Z(n) → [Causal part of F^{−1}{G(f)* S_XY(f)}] → Ŷ_c(n)

Spectral Decomposition Theorem: Let S_XX(f) satisfy the so-called Paley-Wiener condition:

  ∫_{−1/2}^{1/2} log(S_XX(f)) df > −∞

Then S_XX(f) can be written as

  S_XX(f) = G(f)⁺ G(f)⁻

with G(f)⁺ and G(f)⁻ satisfying |G(f)⁺| = |G(f)⁻| = √S_XX(f). Moreover, the sequences

  g(n)⁺ = F^{−1}{G(f)⁺},        g(n)⁻ = F^{−1}{G(f)⁻},
  g^{−1}(n)⁺ = F^{−1}{1/G(f)⁺}, g^{−1}(n)⁻ = F^{−1}{1/G(f)⁻}

satisfy

  g(n)⁺ = g^{−1}(n)⁺ = 0 for n < 0    (causal sequences)
  g(n)⁻ = g^{−1}(n)⁻ = 0 for n > 0    (anticausal sequences)

Whitening filter (cont'd): The sought whitening filter used to obtain Z(n) is g(n) = g^{−1}(n)⁺, and g̃(n) = g(n)⁺. It can be easily verified that both sequences satisfy the identities in (2.6).

2.3.3. Finite Wiener filters

Finite linear filter:

  Ŷ(n) = Σ_{m=1}^{M} h(m) X(n − m)

Wiener-Hopf equation: By applying the orthogonality principle we obtain the Wiener-Hopf system of equations:

  Σ_XY = Σ_XX h

where h = [h(1), …, h(M)]^T, Σ_XY = [R_XY(1), …, R_XY(M)]^T and

  Σ_XX = [ R_XX(0)      R_XX(1)      …  R_XX(M − 1) ]
         [ R_XX(1)      R_XX(0)      …  R_XX(M − 2) ]
         [     ⋮            ⋮                ⋮      ]
         [ R_XX(M − 1)  R_XX(M − 2)  …  R_XX(0)     ]

Coefficient vector of the finite Wiener filter:

  h = (Σ_XX)^{−1} Σ_XY

provided Σ_XX is invertible.
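The finite Wiener-Hopf system can be solved directly for the noisy-observation setup X(n) = Y(n) + W(n), with Y an AR(1) process and W white noise uncorrelated with Y (both illustrative choices of mine, not from the notes). Then R_XY(k) = R_YY(k) and R_XX(k) = R_YY(k) + σ_W² δ(k); the residual MSE predicted by (2.5) is compared against a Monte Carlo run of the filter:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(5)
a, s_w, M, N = 0.9, 1.0, 8, 200_000

# Y is AR(1) with R_YY(k) = a^|k|/(1 - a^2); X(n) = Y(n) + W(n), W white with
# variance s_w^2, uncorrelated with Y => R_XY(k) = R_YY(k),
# R_XX(k) = R_YY(k) + s_w^2 * delta(k).
R_YY = lambda k: a ** abs(k) / (1 - a ** 2)
R_XX = lambda k: R_YY(k) + (s_w ** 2 if k == 0 else 0.0)

Sigma_XX = np.array([[R_XX(i - j) for j in range(M)] for i in range(M)])
Sigma_XY = np.array([R_YY(k) for k in range(1, M + 1)])

h = np.linalg.solve(Sigma_XX, Sigma_XY)      # finite Wiener filter coefficients
mse_theory = R_YY(0) - h @ Sigma_XY          # residual MSE, cf. (2.5)

# Monte Carlo check: simulate, run Yhat(n) = sum_{m=1}^{M} h(m) X(n-m).
e = rng.normal(size=N)
Y = np.zeros(N)
for n in range(1, N):
    Y[n] = a * Y[n - 1] + e[n]
X = Y + s_w * rng.normal(size=N)

W = sliding_window_view(X, M)[: N - M]       # row for n holds X(n-M .. n-1)
Y_hat = W @ h[::-1]                          # inner product with X(n-1 .. n-M)
mse_emp = np.mean((Y_hat - Y[M:]) ** 2)
print(round(mse_theory, 3), round(mse_emp, 3))   # the two agree closely
```

For large M the filter approaches the causal Wiener predictor from noisy data; in practice the Toeplitz structure of Σ_XX is usually exploited (e.g. Levinson recursion) instead of a general linear solve.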
More informationMATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m
MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +
More informationMATH 4330/5330, Fourier Analysis Section 11, The Discrete Fourier Transform
MATH 433/533, Fourier Analysis Section 11, The Discrete Fourier Transform Now, instead of considering functions defined on a continuous domain, like the interval [, 1) or the whole real line R, we wish
More information1 Inner Products and Norms on Real Vector Spaces
Math 373: Principles Techniques of Applied Mathematics Spring 29 The 2 Inner Product 1 Inner Products Norms on Real Vector Spaces Recall that an inner product on a real vector space V is a function from
More informationNEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS
NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS TEST DESIGN AND FRAMEWORK September 2014 Authorized for Distribution by the New York State Education Department This test design and framework document
More information3. Let A and B be two n n orthogonal matrices. Then prove that AB and BA are both orthogonal matrices. Prove a similar result for unitary matrices.
Exercise 1 1. Let A be an n n orthogonal matrix. Then prove that (a) the rows of A form an orthonormal basis of R n. (b) the columns of A form an orthonormal basis of R n. (c) for any two vectors x,y R
More informationSince it is necessary to consider the ability of the lter to predict many data over a period of time a more meaningful metric is the expected value of
Chapter 11 Tutorial: The Kalman Filter Tony Lacey. 11.1 Introduction The Kalman lter ë1ë has long been regarded as the optimal solution to many tracing and data prediction tass, ë2ë. Its use in the analysis
More informationCAPM, Arbitrage, and Linear Factor Models
CAPM, Arbitrage, and Linear Factor Models CAPM, Arbitrage, Linear Factor Models 1/ 41 Introduction We now assume all investors actually choose mean-variance e cient portfolios. By equating these investors
More informationComputational Foundations of Cognitive Science
Computational Foundations of Cognitive Science Lecture 15: Convolutions and Kernels Frank Keller School of Informatics University of Edinburgh keller@inf.ed.ac.uk February 23, 2010 Frank Keller Computational
More informationChapter 3: The Multiple Linear Regression Model
Chapter 3: The Multiple Linear Regression Model Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans November 23, 2013 Christophe Hurlin (University of Orléans) Advanced Econometrics
More informationHow To Prove The Dirichlet Unit Theorem
Chapter 6 The Dirichlet Unit Theorem As usual, we will be working in the ring B of algebraic integers of a number field L. Two factorizations of an element of B are regarded as essentially the same if
More informationTrend and Seasonal Components
Chapter 2 Trend and Seasonal Components If the plot of a TS reveals an increase of the seasonal and noise fluctuations with the level of the process then some transformation may be necessary before doing
More informationAnalysis/resynthesis with the short time Fourier transform
Analysis/resynthesis with the short time Fourier transform summer 2006 lecture on analysis, modeling and transformation of audio signals Axel Röbel Institute of communication science TU-Berlin IRCAM Analysis/Synthesis
More informationTHREE DIMENSIONAL GEOMETRY
Chapter 8 THREE DIMENSIONAL GEOMETRY 8.1 Introduction In this chapter we present a vector algebra approach to three dimensional geometry. The aim is to present standard properties of lines and planes,
More informationVector Spaces 4.4 Spanning and Independence
Vector Spaces 4.4 and Independence October 18 Goals Discuss two important basic concepts: Define linear combination of vectors. Define Span(S) of a set S of vectors. Define linear Independence of a set
More informationInternet Appendix to CAPM for estimating cost of equity capital: Interpreting the empirical evidence
Internet Appendix to CAPM for estimating cost of equity capital: Interpreting the empirical evidence This document contains supplementary material to the paper titled CAPM for estimating cost of equity
More information4.5 Linear Dependence and Linear Independence
4.5 Linear Dependence and Linear Independence 267 32. {v 1, v 2 }, where v 1, v 2 are collinear vectors in R 3. 33. Prove that if S and S are subsets of a vector space V such that S is a subset of S, then
More informationThe Ideal Class Group
Chapter 5 The Ideal Class Group We will use Minkowski theory, which belongs to the general area of geometry of numbers, to gain insight into the ideal class group of a number field. We have already mentioned
More information