ORTHOGONALITY CRITERIA FOR OPTIMALITY

Size: px
Start display at page:

Download "ORTHOGONALITY CRITERIA FOR OPTIMALITY"

Transcription

1 MINIMUM MEAN-SQUARED ERROR LINEAR ESTIMATION Consider the task of linearly estimating random variable Y from random variables X 1,...,X N with minimum mean squared error (MSE). That is, we wish to choose coefficients a 1,...,a N such that the estimate N ^Y = ai X i i=1 has the smallest possible MSE: E(Y-^Y) 2 = E Y- N 2 ai X i i=1 Optimal linear estimators. We will now show how to choose the a i 's to minimize the MSE. Let us assume that Y and the X i 's have zero mean. If not, we could subtract their means to obtain zero mean random variables, apply optimal linear estimation, and then add the mean of Y to the estimate of Y-EY. Equivalently, we could optimize an estimator of the form Σi=1 N a i X i + b, which is called affine. We consider X = (X 1,...,X N ) t and a = (a 1,...,a N ) t to be column vectors. In this vector notation, ^Y = a t X We assume that the N N covariance matrix K = [EX i X j ] of X is known, as is the vector of covariances r = (EX 1 Y,..., EX N Y) t March 6, 2005 DPCM-20 ORTHOGONALITY CRITERIA FOR OPTIMALITY To find the optimal choice of a we will prove the following. Orthogonality Criteria for Linear Estimators a is the optimal linear predictor coefficients iff E (Y-a t X)X j = 0, j = 1,...,N i.e. iff the estimation error is orthogonal to all observations. The above equation is called the orthogonality criteria or principle. Proof: If statement: Suppose a satisfies the above, and b is an arbitrary set of coefficents. Then, E (Y-b t X) 2 = E ((Y-a t X) + (a t X-b t X)) 2 = E (Y-a t X) E (Y-a t X)(a t X-b t X)) + E (a t X-b t X) 2 = E (Y-a t X) E (Y-a t N X) (aj -b j )X j + E (a t X-b t X) 2 j=1 = E (Y-a t X) 2 N + 2 (aj -b j ) E (Y-a t X)X j + E (a t X-b t X) 2 j=1 = E (Y-a t X) 2 + E (a t X-b t X) 2 because E (Y-a t X)X j = 0 E (Y-a t X) 2 because E (a t X-b t X) 2 0 March 6, 2005 DPCM-21

2 Since this is true for every choice of b, it must be that a is optimal. In other words, if a satisfies the orthogonality criteria, it is optimal. Only if statement: Suppose b is optimal and a satisfies the orthogonality criteria. Then as shown above, E (Y-b t X) 2 = E (Y-a t X) 2 + E (a t X-b t X) 2 Since b is optimal E (Y-b t X) 2 E (Y-a t X) 2. If follows from the above two equations that E (a t X-b t X) 2 = 0 which implies E X 2 (a t -b t ) 2 = 0 which implies b = a. That is, if b is optimal, it satisfies the orthogonality criteria. March 6, 2005 DPCM-22 Corollary: Any linear combination of the X i 's is orthogonal to the error of an optimal estimator. That is, if b 1,...,b k are arbitrary numbers, and a is the vector of coefficients of an optimal estimator for Y based on X, then E (Y-a t K X) bj X j = 0 j=1 Proof: Suppose b 1,...,b k are arbitrary numbers, and a is the vector of coefficients of an optimal estimator for Y based on X, then E (Y-a t K K X) bj X j = bj E (Y-a t X) X j = 0 by orthogonality principle j=1 j=1 March 6, 2005 DPCM-23

3 USING THE ORTHOGONALITY PRINCIPLE TO FIND THE OPTIMAL ESTIMATOR From the orthogonality principle we know that if a is the vector of coefficients of the optimal linear estimator, then E(Y-a t X)X j = 0, j = 1,...,N This implies E YX j = a t E XX j = (E XX j ) t a, j = 1,...,N or equivalently E YX j = (EX 1 X j,...,e X N X j ) t a, j = 1,...,N or or EYX 1... EYX N = r = K a EX 1 X 1 EX N X 1 a EX 1 X N EX N X N.. a N Where K is the N N covariance matrix of X and r = E YX = (EYX 1,...,EYX N ) t. If K is invertible, i.e. nonsingular, then a = K -1 r is the vector of optimal linear prediction coefficients. March 6, 2005 DPCM-24 If K is not invertible, i.e. singular, then it can be shown that at least one of the X i 's is a linear combination of the others. Such X i 's can be removed from X because any estimator that used them could also be computed from the others. With these X i 's eliminated, the new covariance matrix K' is nonsingular and we can find the optimal a by inverting K'. The MSE of the optimal estimator: Derivation: MSE = EY 2 - r t K -1 r MSE = E(Y-a t X) 2 = E(Y-a t X)(Y-a t X) = E(Y-a t X)Y + E(Y-a t X)(-a t X) = E(Y-a t X)Y by corollary to orthogonality principle = EY 2 - a t EXY = EY 2 - a t r = EY 2 - r t K -1 r since for optimal estimator r = Ka, and so a t = (K -1 r) t = r t K -1 since K & K -1 are symmetric March 6, 2005 DPCM-25

4 OPTIMAL LINEAR PREDICTION Let us now apply what we've learned to the task of designing the optimal predictor of X N+1 from X 1,...,X N. Let us assume that X is a zero-mean, wide-sense stationary random process with autocorrelation function R X (k) EX i X i+k. Linear estimation theory shows that the Nth-order linear predictor of X i based on X i-1,...,x i-n that minimizes the mean squared prediction error has coefficients a = (a 1,,a N ) t given by a = K -1 r where K is the N N correlation/covariance matrix of X 1,,X N, i.e. K ij = E X i X j = R X ( i-j ), K -1 is its inverse, r = (R X (1),,R X (N)) t. It can be shown that if K is not invertible, then the random process X is deterministic in the sense that there are coefficients b 1,...,b N-1 such that X N = b 1 X b N-1 X N-1. When K is invertible, the resulting minimum mean square prediction error is M N = σ 2 - a t r = σ 2 - r t K -1 r = "N-step prediction error" and the prediction gain is G N = 10 log σ2 10 M N March 6, 2005 DPCM-26 BEHAVIOR OF THE N-STEP PREDICTION ERRORS Clearly, M N decreases with N to some limit, as illustrated below, and the prediction gains G N increase with N to some limit. mean squared error of best Nth order linear predictor N As mentioned in the transform coding section, it can also be shown that M N = K(N+1) K (N) where K (N) and K (N+1) denote, respectively, the N N and (N+1) (N+1) covariance matrices of X, and K denotes the determinant of the matrix K. This is derived in supplementary notes. March 6, 2005 DPCM-27

5 EXAMPLE: FIRST-ORDER AUTOREGRESSIVE (AR) GAUSSIAN SOURCE. Assume X is stationary and X i = ρ X i-1 + Z i ( * ) where -1 < ρ < 1, Z i 's are IID Gaussian with zero mean and variances σz. 2 Z i is independent of X j, j < i. (ρ =.9 is a typical value.) Then 1) EX i = 0 : 2) σ 2 X = EX 2 i = σ2 Z 1-ρ2 ( = 5.26 σ 2 Z if ρ =.9) 3) R X (k) = autocorrelation function = E X i X i+k = σ 2 X ρ k ρ = correlation coefficent = cov(x ix i+1 ) σ 2 X = R X(1) σ 2 X March 6, 2005 DPCM-28 4) Best predictor for X i based on X i-1, X i-2,..., X i-n is ~ X i = ρx i-1 with M N = σ 2 Z Derivation: It's not easy to use a = K -1 r. However, we may directly use the orthogonality principle to check that ~ X i is optimal. Recall that the orthogonal indicates that ~ X i is optimal if and only if We find E (X i - ~ X i )X i-j, for j = 1,...,N. E (X i - ρx i-1 )X i-j = E (ρx i-1 +Z i - ρx i-1 )X i-j = E Z i X i-j = E Z i E X i-j = 0, since Z i indep. of past X's It follows from orthog. principal that ~ X i = ρx i-1 is optimal. The MSPE is M N = E(X i - ρx i-1 ) 2 = E(ρ X i-1 + Z i - ρx i-1 ) 2 = E Z 2 i = σ 2 Z Therefore, for N 1, the prediction gain (which is the gain of DPCM over SQ) is G N = 10 log σ2 X 1 10 M = 10 log 10 N 1-ρ 2 ( = 7.2 db if ρ =.9) March 6, 2005 DPCM-29

6 prediction gain, db G N vs. ρ correlation coefficient March 6, 2005 DPCM-30 Jayant & Noll, p. 271 TYPICAL PREDICTION GAINS FOR SPEECH For each of several speakers, R X (k) was empirically estimated and the prediction gains were computed for various N's. The range of G N values found is shown below. The speech was lowpass filtered, probably to 3kHz. March 6, 2005 DPCM-31

7 TYPICAL PREDICTION GAINS FOR IMAGES Prediction gains in interframe image coding from Jayant and Noll, Fig 6.8, p B -- prediction from pixel immediately to the left ~ X =.965 X B C -- prediction from corresponding pixel in previous frame ~ X =.977 X C BC -- prediction from both pixels ~ X =.379 X B X C BCD -- prediction from the two aforementioned pixels plus the pixel to the left of the corresponding in the previous frame ~ X =.746 X B X C X D For larger values of N, the predictor is based on N pixels from the present and the previous frame. March 6, 2005 DPCM-32 ASYMPTOTIC LIMIT OF M N For a wide sense stationary random process, it can be shown (see Transform Coding Lectures) that as N, M N decreases to Q = lim N M N = "one-step prediction error" = exp 1 2π -π π ln S X (ω) dω where S X (ω) is the power spectral density of X, i.e. S X (ω) = discrete-time Fourier transform of R X (k) = RX (k) e -jkω k=- For an Nth-order AR process, Q = M N = M N+1 =... March 6, 2005 DPCM-33

8 ASYMPTOTIC OPTA OF DPCM In view of the fact that the asymptotic limit of M N is Q, we have the following. When R is large and the pdf of ~ U is similar to a scaled version of that of X, e.g. when X is Gaussian, the asymptotic OPTA function of DPCM optimized over N is δ dpcm (R) 1 12 Q α X 2-2R Q σ 2 δ sq (R) = 1 G δ sq (R) S dpcm (R) S sq (R) + G db where G = 10 log 10 σ2 Q db = prediction gain That is, the gain of DPCM over SQ approximately equals the prediction gain. March 6, 2005 DPCM-34 THE ORIGINS OF THE "DPCM" NAME The letters "PCM" comes from "Pulse Code Modulation", which is a modulation technique for transmitting analog sources such as speech. In PCM, an analog source, such as speech, is sampled; the samples are quantized; fixed-length binary codewords are produced; and each bit of each binary codeword determines which of two specific pulses are sent (e.g. one pulse might be the negative of the other, or the pulse might be sinusoidal with different frequencies). Thus PCM is a modulation technique for transmitting analog sources that competes with AM, FM and various other forms of pulse modulation. Invented in the 1940's, "PCM" is now viewed as three systems: a sampler, a quantizer and a digital modulator. However, the quantizer by itself is often referred to as PCM. DPCM was originally proposed as an improved DPCM type modulator consisting of a sampler, a DPCM quantizer as we know it, and a digital modulator. Nowadays DPCM usually refers just to the quantization part of the system. Predictive Coding DPCM is also considered to be a kind of predictive coding. March 6, 2005 DPCM-35

9 DELTA MODULATION Delta modulation is the special case of DPCM where the quantizer has just two levels: +, -. In its most basic form, the predictor produces, simply, ~ X i = ^X i-1. Fancier predictors are also used. Delta modulation is used when the source is highly correlated, for example speech coding with a high sampling rate. SPEECH CODING Adaptive versions of DPCM and Delta-Modulation have been standardized for use in the telephone industry for digitizing telephone speech. The speech is bandlimited to 3 khz before sampling at 8 khz. However, this kind of speech coding is no longer considered state-of-the-art. March 6, 2005 DPCM-36 EXAMPLES OF DPCM IMAGE CODING Example: Prediction for the present pixel X ij Y i-1,j-1 Y i-1,j ~ X ij = a 1 Y i,j-1 + a 2 Y i-1,j + a 3 Y i-1,j-1 Y i,j-1 X i,j (from R. Jain, Fund'ls of Image Proc, p. 493) All but second curve from top represents performance predicted with predictor matched to the stated source. For the second curve from the top, the values for a 1, a 2 and a 3 were designed for the actual measured correlations for a test set of images. DPCM has been extensively studied for image coding. But it is not often used today. Transform coding is more common. March 6, 2005 DPCM-37

10 DPCM IN VIDEO CODING Though not usually called DPCM, the most commonly used video coding methods use a form of DPCM, e.g. MPEG, H.26X, HDTV, satellite "Direct TV", DVD. Prediction is done on a frame-by-frame basis, with the prediction of a frame being the decoded reconstruction of the previous frame, or a "motion-compensated" version thereof. In the latter case, the coder has a forward-adaptive component in addition to the backward adaption. Example: (Netravali & Haskell, Digital Pictures, p. 331) left pixel, curr. frame Prediction Coefficients left pixel prev. frame same pixel prev frame MSPE Pred. gain db 1-1/2 1/ db 3/4-1/2 3/ db 7/8-5/8 3/ db 1Based on educated guess that σ 2 = March 6, 2005 DPCM-38 COMPARISON OF DPCM AND TRANSFORM CODING Consider a stationary, Gaussian source and large R. For k-dimensional transform coding: δ tr (k,r) 1 12 K (k) 1/k α G 2-2R For DPCM with kth-order linear prediction δ dpcm (k,r) 1 12 M k α G 2-2R where M k is MSPE of optimal kth-order linear prediction for X i from X i-k,,x i-1. Fact A: K (k) 1/k M k. Proof: K (k) 1/k = (σ 2 k-1 X Mi ) 1/k this was proved in the transform coding notes i=1 M k because all terms being averaged are M k It follows from this fact that DPCM with kth-order prediction is at least as good as k-dimensional transform coding. March 6, 2005 DPCM-39

11 Fact B: lim M k = lim k k K (k) 1/k = exp{ 1 2π -π = Q = 1-step prediction error" π ln S(ω) dω } Proof: This was proved in the transform coding notes. It follows from this fact that for Gaussian sources, large k and large R, DPCM and Transform Coding have the same performance, i.e. δ dpcm (,R) = δ tr (,R) However, consider the example of a first-order AR Gaussian source. Then, since M 1 = lim M k = Q, k δ dpcm (1,R) δ tr (,R) Thus, in this case a simple form of DPCM attains performance as good as high dimensional transform coding. March 6, 2005 DPCM-40 In the Gaussian case, the SNR gain in db of DPCM with first-order linear prediction over k-dimensional transform coding is δ tr (k,r) 10 log 10 δ dpcm (1,R) = 10 log (1-ρ2 ) (k-1)/k 10 1-ρ 2 = - 10 k log 10 (1-ρ2) which is plotted below. db 8 7 gain of DPCM over transform coding for ρ = k=transform code dimen., order of DPCM linear predictor Similarly, for an Nth-order AR Gaussian source, δ dpcm (N,R) δ tr (,R) March 6, 2005 DPCM-41

12 Given that DPCM is more efficient for the AR Gauss source, why is transform coding more commonly used nowadays than DPCM, for example for image coding? My feeling is that it because transform coding can more easily be made to take perceptual criteria into account. For example, with transform image coding, it is easy to exploit the fact that the eye is more sensitive to errors in low spatial frequencies than in high spatial frequencies. For example, JPEG does this by using coarser scalar quantizers for high frequency coefficients than for low frequency coefficients. March 6, 2005 DPCM-42

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1. MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column

More information

Sections 2.11 and 5.8

Sections 2.11 and 5.8 Sections 211 and 58 Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis I 1/25 Gesell data Let X be the age in in months a child speaks his/her first word and

More information

Solutions to Exam in Speech Signal Processing EN2300

Solutions to Exam in Speech Signal Processing EN2300 Solutions to Exam in Speech Signal Processing EN23 Date: Thursday, Dec 2, 8: 3: Place: Allowed: Grades: Language: Solutions: Q34, Q36 Beta Math Handbook (or corresponding), calculator with empty memory.

More information

Review Jeopardy. Blue vs. Orange. Review Jeopardy

Review Jeopardy. Blue vs. Orange. Review Jeopardy Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round $200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?

More information

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution SF2940: Probability theory Lecture 8: Multivariate Normal Distribution Timo Koski 24.09.2014 Timo Koski () Mathematisk statistik 24.09.2014 1 / 75 Learning outcomes Random vectors, mean vector, covariance

More information

Vector and Matrix Norms

Vector and Matrix Norms Chapter 1 Vector and Matrix Norms 11 Vector Spaces Let F be a field (such as the real numbers, R, or complex numbers, C) with elements called scalars A Vector Space, V, over the field F is a non-empty

More information

Introduction to image coding

Introduction to image coding Introduction to image coding Image coding aims at reducing amount of data required for image representation, storage or transmission. This is achieved by removing redundant data from an image, i.e. by

More information

1 Introduction to Matrices

1 Introduction to Matrices 1 Introduction to Matrices In this section, important definitions and results from matrix algebra that are useful in regression analysis are introduced. While all statements below regarding the columns

More information

Lecture 8: Signal Detection and Noise Assumption

Lecture 8: Signal Detection and Noise Assumption ECE 83 Fall Statistical Signal Processing instructor: R. Nowak, scribe: Feng Ju Lecture 8: Signal Detection and Noise Assumption Signal Detection : X = W H : X = S + W where W N(, σ I n n and S = [s, s,...,

More information

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution SF2940: Probability theory Lecture 8: Multivariate Normal Distribution Timo Koski 24.09.2015 Timo Koski Matematisk statistik 24.09.2015 1 / 1 Learning outcomes Random vectors, mean vector, covariance matrix,

More information

Probability and Random Variables. Generation of random variables (r.v.)

Probability and Random Variables. Generation of random variables (r.v.) Probability and Random Variables Method for generating random variables with a specified probability distribution function. Gaussian And Markov Processes Characterization of Stationary Random Process Linearly

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +

More information

Voice---is analog in character and moves in the form of waves. 3-important wave-characteristics:

Voice---is analog in character and moves in the form of waves. 3-important wave-characteristics: Voice Transmission --Basic Concepts-- Voice---is analog in character and moves in the form of waves. 3-important wave-characteristics: Amplitude Frequency Phase Voice Digitization in the POTS Traditional

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS Systems of Equations and Matrices Representation of a linear system The general system of m equations in n unknowns can be written a x + a 2 x 2 + + a n x n b a

More information

Lecture 2 Matrix Operations

Lecture 2 Matrix Operations Lecture 2 Matrix Operations transpose, sum & difference, scalar multiplication matrix multiplication, matrix-vector product matrix inverse 2 1 Matrix transpose transpose of m n matrix A, denoted A T or

More information

Au = = = 3u. Aw = = = 2w. so the action of A on u and w is very easy to picture: it simply amounts to a stretching by 3 and 2, respectively.

Au = = = 3u. Aw = = = 2w. so the action of A on u and w is very easy to picture: it simply amounts to a stretching by 3 and 2, respectively. Chapter 7 Eigenvalues and Eigenvectors In this last chapter of our exploration of Linear Algebra we will revisit eigenvalues and eigenvectors of matrices, concepts that were already introduced in Geometry

More information

Introduction to Matrix Algebra

Introduction to Matrix Algebra Psychology 7291: Multivariate Statistics (Carey) 8/27/98 Matrix Algebra - 1 Introduction to Matrix Algebra Definitions: A matrix is a collection of numbers ordered by rows and columns. It is customary

More information

Matrix Representations of Linear Transformations and Changes of Coordinates

Matrix Representations of Linear Transformations and Changes of Coordinates Matrix Representations of Linear Transformations and Changes of Coordinates 01 Subspaces and Bases 011 Definitions A subspace V of R n is a subset of R n that contains the zero element and is closed under

More information

Linear Algebra Notes for Marsden and Tromba Vector Calculus

Linear Algebra Notes for Marsden and Tromba Vector Calculus Linear Algebra Notes for Marsden and Tromba Vector Calculus n-dimensional Euclidean Space and Matrices Definition of n space As was learned in Math b, a point in Euclidean three space can be thought of

More information

MATH 304 Linear Algebra Lecture 18: Rank and nullity of a matrix.

MATH 304 Linear Algebra Lecture 18: Rank and nullity of a matrix. MATH 304 Linear Algebra Lecture 18: Rank and nullity of a matrix. Nullspace Let A = (a ij ) be an m n matrix. Definition. The nullspace of the matrix A, denoted N(A), is the set of all n-dimensional column

More information

The Bivariate Normal Distribution

The Bivariate Normal Distribution The Bivariate Normal Distribution This is Section 4.7 of the st edition (2002) of the book Introduction to Probability, by D. P. Bertsekas and J. N. Tsitsiklis. The material in this section was not included

More information

Correlation in Random Variables

Correlation in Random Variables Correlation in Random Variables Lecture 11 Spring 2002 Correlation in Random Variables Suppose that an experiment produces two random variables, X and Y. What can we say about the relationship between

More information

Similarity and Diagonalization. Similar Matrices

Similarity and Diagonalization. Similar Matrices MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that

More information

Systems of Linear Equations

Systems of Linear Equations Systems of Linear Equations Beifang Chen Systems of linear equations Linear systems A linear equation in variables x, x,, x n is an equation of the form a x + a x + + a n x n = b, where a, a,, a n and

More information

Quadratic forms Cochran s theorem, degrees of freedom, and all that

Quadratic forms Cochran s theorem, degrees of freedom, and all that Quadratic forms Cochran s theorem, degrees of freedom, and all that Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 1, Slide 1 Why We Care Cochran s theorem tells us

More information

α = u v. In other words, Orthogonal Projection

α = u v. In other words, Orthogonal Projection Orthogonal Projection Given any nonzero vector v, it is possible to decompose an arbitrary vector u into a component that points in the direction of v and one that points in a direction orthogonal to v

More information

Inner Product Spaces

Inner Product Spaces Math 571 Inner Product Spaces 1. Preliminaries An inner product space is a vector space V along with a function, called an inner product which associates each pair of vectors u, v with a scalar u, v, and

More information

Understanding and Applying Kalman Filtering

Understanding and Applying Kalman Filtering Understanding and Applying Kalman Filtering Lindsay Kleeman Department of Electrical and Computer Systems Engineering Monash University, Clayton 1 Introduction Objectives: 1. Provide a basic understanding

More information

is in plane V. However, it may be more convenient to introduce a plane coordinate system in V.

is in plane V. However, it may be more convenient to introduce a plane coordinate system in V. .4 COORDINATES EXAMPLE Let V be the plane in R with equation x +2x 2 +x 0, a two-dimensional subspace of R. We can describe a vector in this plane by its spatial (D)coordinates; for example, vector x 5

More information

Some probability and statistics

Some probability and statistics Appendix A Some probability and statistics A Probabilities, random variables and their distribution We summarize a few of the basic concepts of random variables, usually denoted by capital letters, X,Y,

More information

RECOMMENDATION ITU-R BO.786 *

RECOMMENDATION ITU-R BO.786 * Rec. ITU-R BO.786 RECOMMENDATION ITU-R BO.786 * MUSE ** system for HDTV broadcasting-satellite services (Question ITU-R /) (992) The ITU Radiocommunication Assembly, considering a) that the MUSE system

More information

8 MIMO II: capacity and multiplexing

8 MIMO II: capacity and multiplexing CHAPTER 8 MIMO II: capacity and multiplexing architectures In this chapter, we will look at the capacity of MIMO fading channels and discuss transceiver architectures that extract the promised multiplexing

More information

5. Orthogonal matrices

5. Orthogonal matrices L Vandenberghe EE133A (Spring 2016) 5 Orthogonal matrices matrices with orthonormal columns orthogonal matrices tall matrices with orthonormal columns complex matrices with orthonormal columns 5-1 Orthonormal

More information

Lecture Notes 1. Brief Review of Basic Probability

Lecture Notes 1. Brief Review of Basic Probability Probability Review Lecture Notes Brief Review of Basic Probability I assume you know basic probability. Chapters -3 are a review. I will assume you have read and understood Chapters -3. Here is a very

More information

Reading.. IMAGE COMPRESSION- I IMAGE COMPRESSION. Image compression. Data Redundancy. Lossy vs Lossless Compression. Chapter 8.

Reading.. IMAGE COMPRESSION- I IMAGE COMPRESSION. Image compression. Data Redundancy. Lossy vs Lossless Compression. Chapter 8. Reading.. IMAGE COMPRESSION- I Week VIII Feb 25 Chapter 8 Sections 8.1, 8.2 8.3 (selected topics) 8.4 (Huffman, run-length, loss-less predictive) 8.5 (lossy predictive, transform coding basics) 8.6 Image

More information

15.062 Data Mining: Algorithms and Applications Matrix Math Review

15.062 Data Mining: Algorithms and Applications Matrix Math Review .6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

More information

160 CHAPTER 4. VECTOR SPACES

160 CHAPTER 4. VECTOR SPACES 160 CHAPTER 4. VECTOR SPACES 4. Rank and Nullity In this section, we look at relationships between the row space, column space, null space of a matrix and its transpose. We will derive fundamental results

More information

I. GROUPS: BASIC DEFINITIONS AND EXAMPLES

I. GROUPS: BASIC DEFINITIONS AND EXAMPLES I GROUPS: BASIC DEFINITIONS AND EXAMPLES Definition 1: An operation on a set G is a function : G G G Definition 2: A group is a set G which is equipped with an operation and a special element e G, called

More information

Lecture 4: Partitioned Matrices and Determinants

Lecture 4: Partitioned Matrices and Determinants Lecture 4: Partitioned Matrices and Determinants 1 Elementary row operations Recall the elementary operations on the rows of a matrix, equivalent to premultiplying by an elementary matrix E: (1) multiplying

More information

Example/ an analog signal f ( t) ) is sample by f s = 5000 Hz draw the sampling signal spectrum. Calculate min. sampling frequency.

Example/ an analog signal f ( t) ) is sample by f s = 5000 Hz draw the sampling signal spectrum. Calculate min. sampling frequency. 1 2 3 4 Example/ an analog signal f ( t) = 1+ cos(4000πt ) is sample by f s = 5000 Hz draw the sampling signal spectrum. Calculate min. sampling frequency. Sol/ H(f) -7KHz -5KHz -3KHz -2KHz 0 2KHz 3KHz

More information

NOTES ON LINEAR TRANSFORMATIONS

NOTES ON LINEAR TRANSFORMATIONS NOTES ON LINEAR TRANSFORMATIONS Definition 1. Let V and W be vector spaces. A function T : V W is a linear transformation from V to W if the following two properties hold. i T v + v = T v + T v for all

More information

Matrix Algebra. Some Basic Matrix Laws. Before reading the text or the following notes glance at the following list of basic matrix algebra laws.

Matrix Algebra. Some Basic Matrix Laws. Before reading the text or the following notes glance at the following list of basic matrix algebra laws. Matrix Algebra A. Doerr Before reading the text or the following notes glance at the following list of basic matrix algebra laws. Some Basic Matrix Laws Assume the orders of the matrices are such that

More information

MATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set.

MATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set. MATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set. Vector space A vector space is a set V equipped with two operations, addition V V (x,y) x + y V and scalar

More information

Econometrics Simple Linear Regression

Econometrics Simple Linear Regression Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight

More information

PUTNAM TRAINING POLYNOMIALS. Exercises 1. Find a polynomial with integral coefficients whose zeros include 2 + 5.

PUTNAM TRAINING POLYNOMIALS. Exercises 1. Find a polynomial with integral coefficients whose zeros include 2 + 5. PUTNAM TRAINING POLYNOMIALS (Last updated: November 17, 2015) Remark. This is a list of exercises on polynomials. Miguel A. Lerma Exercises 1. Find a polynomial with integral coefficients whose zeros include

More information

Introduction: Overview of Kernel Methods

Introduction: Overview of Kernel Methods Introduction: Overview of Kernel Methods Statistical Data Analysis with Positive Definite Kernels Kenji Fukumizu Institute of Statistical Mathematics, ROIS Department of Statistical Science, Graduate University

More information

Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components

Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components The eigenvalues and eigenvectors of a square matrix play a key role in some important operations in statistics. In particular, they

More information

Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

More information

Trend and Seasonal Components

Trend and Seasonal Components Chapter 2 Trend and Seasonal Components If the plot of a TS reveals an increase of the seasonal and noise fluctuations with the level of the process then some transformation may be necessary before doing

More information

Lecture L3 - Vectors, Matrices and Coordinate Transformations

Lecture L3 - Vectors, Matrices and Coordinate Transformations S. Widnall 16.07 Dynamics Fall 2009 Lecture notes based on J. Peraire Version 2.0 Lecture L3 - Vectors, Matrices and Coordinate Transformations By using vectors and defining appropriate operations between

More information

Unified Lecture # 4 Vectors

Unified Lecture # 4 Vectors Fall 2005 Unified Lecture # 4 Vectors These notes were written by J. Peraire as a review of vectors for Dynamics 16.07. They have been adapted for Unified Engineering by R. Radovitzky. References [1] Feynmann,

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby

More information

CS229 Lecture notes. Andrew Ng

CS229 Lecture notes. Andrew Ng CS229 Lecture notes Andrew Ng Part X Factor analysis Whenwehavedatax (i) R n thatcomesfromamixtureofseveral Gaussians, the EM algorithm can be applied to fit a mixture model. In this setting, we usually

More information

MIMO CHANNEL CAPACITY

MIMO CHANNEL CAPACITY MIMO CHANNEL CAPACITY Ochi Laboratory Nguyen Dang Khoa (D1) 1 Contents Introduction Review of information theory Fixed MIMO channel Fading MIMO channel Summary and Conclusions 2 1. Introduction The use

More information

Linear Codes. Chapter 3. 3.1 Basics

Linear Codes. Chapter 3. 3.1 Basics Chapter 3 Linear Codes In order to define codes that we can encode and decode efficiently, we add more structure to the codespace. We shall be mainly interested in linear codes. A linear code of length

More information

Name: Section Registered In:

Name: Section Registered In: Name: Section Registered In: Math 125 Exam 3 Version 1 April 24, 2006 60 total points possible 1. (5pts) Use Cramer s Rule to solve 3x + 4y = 30 x 2y = 8. Be sure to show enough detail that shows you are

More information

Experiment 3 MULTIMEDIA SIGNAL COMPRESSION: SPEECH AND AUDIO

Experiment 3 MULTIMEDIA SIGNAL COMPRESSION: SPEECH AND AUDIO Experiment 3 MULTIMEDIA SIGNAL COMPRESSION: SPEECH AND AUDIO I Introduction A key technology that enables distributing speech and audio signals without mass storage media or transmission bandwidth is compression,

More information

Linear Algebra Review. Vectors

Linear Algebra Review. Vectors Linear Algebra Review By Tim K. Marks UCSD Borrows heavily from: Jana Kosecka kosecka@cs.gmu.edu http://cs.gmu.edu/~kosecka/cs682.html Virginia de Sa Cogsci 8F Linear Algebra review UCSD Vectors The length

More information

JPEG Image Compression by Using DCT

JPEG Image Compression by Using DCT International Journal of Computer Sciences and Engineering Open Access Research Paper Volume-4, Issue-4 E-ISSN: 2347-2693 JPEG Image Compression by Using DCT Sarika P. Bagal 1* and Vishal B. Raskar 2 1*

More information

December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B. KITCHENS

December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B. KITCHENS December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B KITCHENS The equation 1 Lines in two-dimensional space (1) 2x y = 3 describes a line in two-dimensional space The coefficients of x and y in the equation

More information

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

More information

CM0340 SOLNS. Do not turn this page over until instructed to do so by the Senior Invigilator.

CM0340 SOLNS. Do not turn this page over until instructed to do so by the Senior Invigilator. CARDIFF UNIVERSITY EXAMINATION PAPER Academic Year: 2008/2009 Examination Period: Examination Paper Number: Examination Paper Title: SOLUTIONS Duration: Autumn CM0340 SOLNS Multimedia 2 hours Do not turn

More information

Using row reduction to calculate the inverse and the determinant of a square matrix

Using row reduction to calculate the inverse and the determinant of a square matrix Using row reduction to calculate the inverse and the determinant of a square matrix Notes for MATH 0290 Honors by Prof. Anna Vainchtein 1 Inverse of a square matrix An n n square matrix A is called invertible

More information

Communication on the Grassmann Manifold: A Geometric Approach to the Noncoherent Multiple-Antenna Channel

Communication on the Grassmann Manifold: A Geometric Approach to the Noncoherent Multiple-Antenna Channel IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 48, NO. 2, FEBRUARY 2002 359 Communication on the Grassmann Manifold: A Geometric Approach to the Noncoherent Multiple-Antenna Channel Lizhong Zheng, Student

More information

Mathematics Course 111: Algebra I Part IV: Vector Spaces

Mathematics Course 111: Algebra I Part IV: Vector Spaces Mathematics Course 111: Algebra I Part IV: Vector Spaces D. R. Wilkins Academic Year 1996-7 9 Vector Spaces A vector space over some field K is an algebraic structure consisting of a set V on which are

More information

Inner products on R n, and more

Inner products on R n, and more Inner products on R n, and more Peyam Ryan Tabrizian Friday, April 12th, 2013 1 Introduction You might be wondering: Are there inner products on R n that are not the usual dot product x y = x 1 y 1 + +

More information

Principles of Digital Communication

Principles of Digital Communication Principles of Digital Communication Robert G. Gallager January 5, 2008 ii Preface: introduction and objectives The digital communication industry is an enormous and rapidly growing industry, roughly comparable

More information

1 VECTOR SPACES AND SUBSPACES

1 VECTOR SPACES AND SUBSPACES 1 VECTOR SPACES AND SUBSPACES What is a vector? Many are familiar with the concept of a vector as: Something which has magnitude and direction. an ordered pair or triple. a description for quantities such

More information

Chapter 3: The Multiple Linear Regression Model

Chapter 3: The Multiple Linear Regression Model Chapter 3: The Multiple Linear Regression Model Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans November 23, 2013 Christophe Hurlin (University of Orléans) Advanced Econometrics

More information

MA106 Linear Algebra lecture notes

MA106 Linear Algebra lecture notes MA106 Linear Algebra lecture notes Lecturers: Martin Bright and Daan Krammer Warwick, January 2011 Contents 1 Number systems and fields 3 1.1 Axioms for number systems......................... 3 2 Vector

More information

Continued Fractions and the Euclidean Algorithm

Continued Fractions and the Euclidean Algorithm Continued Fractions and the Euclidean Algorithm Lecture notes prepared for MATH 326, Spring 997 Department of Mathematics and Statistics University at Albany William F Hammond Table of Contents Introduction

More information

MULTIVARIATE PROBABILITY DISTRIBUTIONS

MULTIVARIATE PROBABILITY DISTRIBUTIONS MULTIVARIATE PROBABILITY DISTRIBUTIONS. PRELIMINARIES.. Example. Consider an experiment that consists of tossing a die and a coin at the same time. We can consider a number of random variables defined

More information

4.5 Linear Dependence and Linear Independence

4.5 Linear Dependence and Linear Independence 4.5 Linear Dependence and Linear Independence 267 32. {v 1, v 2 }, where v 1, v 2 are collinear vectors in R 3. 33. Prove that if S and S are subsets of a vector space V such that S is a subset of S, then

More information

October 3rd, 2012. Linear Algebra & Properties of the Covariance Matrix

October 3rd, 2012. Linear Algebra & Properties of the Covariance Matrix Linear Algebra & Properties of the Covariance Matrix October 3rd, 2012 Estimation of r and C Let rn 1, rn, t..., rn T be the historical return rates on the n th asset. rn 1 rṇ 2 r n =. r T n n = 1, 2,...,

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

Orthogonal Diagonalization of Symmetric Matrices

Orthogonal Diagonalization of Symmetric Matrices MATH10212 Linear Algebra Brief lecture notes 57 Gram Schmidt Process enables us to find an orthogonal basis of a subspace. Let u 1,..., u k be a basis of a subspace V of R n. We begin the process of finding

More information

By choosing to view this document, you agree to all provisions of the copyright laws protecting it.

By choosing to view this document, you agree to all provisions of the copyright laws protecting it. This material is posted here with permission of the IEEE Such permission of the IEEE does not in any way imply IEEE endorsement of any of Helsinki University of Technology's products or services Internal

More information

Discrete Convolution and the Discrete Fourier Transform

Discrete Convolution and the Discrete Fourier Transform Discrete Convolution and the Discrete Fourier Transform Discrete Convolution First of all we need to introduce what we might call the wraparound convention Because the complex numbers w j e i πj N have

More information

Linear Maps. Isaiah Lankham, Bruno Nachtergaele, Anne Schilling (February 5, 2007)

Linear Maps. Isaiah Lankham, Bruno Nachtergaele, Anne Schilling (February 5, 2007) MAT067 University of California, Davis Winter 2007 Linear Maps Isaiah Lankham, Bruno Nachtergaele, Anne Schilling (February 5, 2007) As we have discussed in the lecture on What is Linear Algebra? one of

More information

a 11 x 1 + a 12 x 2 + + a 1n x n = b 1 a 21 x 1 + a 22 x 2 + + a 2n x n = b 2.

a 11 x 1 + a 12 x 2 + + a 1n x n = b 1 a 21 x 1 + a 22 x 2 + + a 2n x n = b 2. Chapter 1 LINEAR EQUATIONS 1.1 Introduction to linear equations A linear equation in n unknowns x 1, x,, x n is an equation of the form a 1 x 1 + a x + + a n x n = b, where a 1, a,..., a n, b are given

More information

6. Cholesky factorization

6. Cholesky factorization 6. Cholesky factorization EE103 (Fall 2011-12) triangular matrices forward and backward substitution the Cholesky factorization solving Ax = b with A positive definite inverse of a positive definite matrix

More information

Video Coding Basics. Yao Wang Polytechnic University, Brooklyn, NY11201 yao@vision.poly.edu

Video Coding Basics. Yao Wang Polytechnic University, Brooklyn, NY11201 yao@vision.poly.edu Video Coding Basics Yao Wang Polytechnic University, Brooklyn, NY11201 yao@vision.poly.edu Outline Motivation for video coding Basic ideas in video coding Block diagram of a typical video codec Different

More information

Question 2: How do you solve a matrix equation using the matrix inverse?

Question 2: How do you solve a matrix equation using the matrix inverse? Question : How do you solve a matrix equation using the matrix inverse? In the previous question, we wrote systems of equations as a matrix equation AX B. In this format, the matrix A contains the coefficients

More information

MATH 304 Linear Algebra Lecture 20: Inner product spaces. Orthogonal sets.

MATH 304 Linear Algebra Lecture 20: Inner product spaces. Orthogonal sets. MATH 304 Linear Algebra Lecture 20: Inner product spaces. Orthogonal sets. Norm The notion of norm generalizes the notion of length of a vector in R n. Definition. Let V be a vector space. A function α

More information

[1] Diagonal factorization

[1] Diagonal factorization 8.03 LA.6: Diagonalization and Orthogonal Matrices [ Diagonal factorization [2 Solving systems of first order differential equations [3 Symmetric and Orthonormal Matrices [ Diagonal factorization Recall:

More information

TTT4110 Information and Signal Theory Solution to exam

TTT4110 Information and Signal Theory Solution to exam Norwegian University of Science and Technology Department of Electronics and Telecommunications TTT4 Information and Signal Theory Solution to exam Problem I (a The frequency response is found by taking

More information

2x + y = 3. Since the second equation is precisely the same as the first equation, it is enough to find x and y satisfying the system

2x + y = 3. Since the second equation is precisely the same as the first equation, it is enough to find x and y satisfying the system 1. Systems of linear equations We are interested in the solutions to systems of linear equations. A linear equation is of the form 3x 5y + 2z + w = 3. The key thing is that we don t multiply the variables

More information

CS3220 Lecture Notes: QR factorization and orthogonal transformations

CS3220 Lecture Notes: QR factorization and orthogonal transformations CS3220 Lecture Notes: QR factorization and orthogonal transformations Steve Marschner Cornell University 11 March 2009 In this lecture I ll talk about orthogonal matrices and their properties, discuss

More information

Section 6.1 - Inner Products and Norms

Section 6.1 - Inner Products and Norms Section 6.1 - Inner Products and Norms Definition. Let V be a vector space over F {R, C}. An inner product on V is a function that assigns, to every ordered pair of vectors x and y in V, a scalar in F,

More information

7 Gaussian Elimination and LU Factorization

7 Gaussian Elimination and LU Factorization 7 Gaussian Elimination and LU Factorization In this final section on matrix factorization methods for solving Ax = b we want to take a closer look at Gaussian elimination (probably the best known method

More information

A note on companion matrices

A note on companion matrices Linear Algebra and its Applications 372 (2003) 325 33 www.elsevier.com/locate/laa A note on companion matrices Miroslav Fiedler Academy of Sciences of the Czech Republic Institute of Computer Science Pod

More information

Direct Methods for Solving Linear Systems. Matrix Factorization

Direct Methods for Solving Linear Systems. Matrix Factorization Direct Methods for Solving Linear Systems Matrix Factorization Numerical Analysis (9th Edition) R L Burden & J D Faires Beamer Presentation Slides prepared by John Carroll Dublin City University c 2011

More information

Least-Squares Intersection of Lines

Least-Squares Intersection of Lines Least-Squares Intersection of Lines Johannes Traa - UIUC 2013 This write-up derives the least-squares solution for the intersection of lines. In the general case, a set of lines will not intersect at a

More information

Notes on Symmetric Matrices

Notes on Symmetric Matrices CPSC 536N: Randomized Algorithms 2011-12 Term 2 Notes on Symmetric Matrices Prof. Nick Harvey University of British Columbia 1 Symmetric Matrices We review some basic results concerning symmetric matrices.

More information

Spatial Statistics Chapter 3 Basics of areal data and areal data modeling

Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Recall areal data also known as lattice data are data Y (s), s D where D is a discrete index set. This usually corresponds to data

More information

T ( a i x i ) = a i T (x i ).

T ( a i x i ) = a i T (x i ). Chapter 2 Defn 1. (p. 65) Let V and W be vector spaces (over F ). We call a function T : V W a linear transformation form V to W if, for all x, y V and c F, we have (a) T (x + y) = T (x) + T (y) and (b)

More information

1 Determinants and the Solvability of Linear Systems

1 Determinants and the Solvability of Linear Systems 1 Determinants and the Solvability of Linear Systems In the last section we learned how to use Gaussian elimination to solve linear systems of n equations in n unknowns The section completely side-stepped

More information

Factorization Theorems

Factorization Theorems Chapter 7 Factorization Theorems This chapter highlights a few of the many factorization theorems for matrices While some factorization results are relatively direct, others are iterative While some factorization

More information

Solving Systems of Linear Equations

Solving Systems of Linear Equations LECTURE 5 Solving Systems of Linear Equations Recall that we introduced the notion of matrices as a way of standardizing the expression of systems of linear equations In today s lecture I shall show how

More information

Lecture 3: Coordinate Systems and Transformations

Lecture 3: Coordinate Systems and Transformations Lecture 3: Coordinate Systems and Transformations Topics: 1. Coordinate systems and frames 2. Change of frames 3. Affine transformations 4. Rotation, translation, scaling, and shear 5. Rotation about an

More information