ORTHOGONALITY CRITERION FOR OPTIMALITY
MINIMUM MEAN-SQUARED ERROR LINEAR ESTIMATION

Consider the task of linearly estimating a random variable Y from random variables X_1,...,X_N with minimum mean squared error (MSE). That is, we wish to choose coefficients a_1,...,a_N such that the estimate

$$\hat{Y} = \sum_{i=1}^{N} a_i X_i$$

has the smallest possible MSE:

$$E(Y-\hat{Y})^2 = E\Big(Y - \sum_{i=1}^{N} a_i X_i\Big)^2 .$$

Optimal linear estimators. We now show how to choose the a_i's to minimize the MSE.

Assume that Y and the X_i's have zero mean. If not, we could subtract their means to obtain zero-mean random variables, apply optimal linear estimation, and then add the mean of Y to the estimate of Y - EY. Equivalently, we could optimize an estimator of the form \sum_{i=1}^{N} a_i X_i + b, which is called affine.

We take X = (X_1,...,X_N)^t and a = (a_1,...,a_N)^t to be column vectors. In this vector notation,

$$\hat{Y} = a^t X .$$

We assume that the N x N covariance matrix K = [E X_i X_j] of X is known, as is the vector of covariances r = (E X_1 Y, ..., E X_N Y)^t.

ORTHOGONALITY CRITERION FOR OPTIMALITY

To find the optimal choice of a we will prove the following.

Orthogonality criterion for linear estimators: a is the vector of optimal linear estimator coefficients if and only if

$$E\big[(Y - a^t X)\,X_j\big] = 0, \qquad j = 1,\dots,N,$$

i.e. if and only if the estimation error is orthogonal to every observation. This equation is called the orthogonality criterion, or orthogonality principle.

Proof. "If" part: Suppose a satisfies the criterion and b is an arbitrary vector of coefficients. Then

$$
\begin{aligned}
E(Y - b^t X)^2 &= E\big((Y - a^t X) + (a^t X - b^t X)\big)^2 \\
&= E(Y - a^t X)^2 + 2\,E\big[(Y - a^t X)(a^t X - b^t X)\big] + E(a^t X - b^t X)^2 \\
&= E(Y - a^t X)^2 + 2\,E\Big[(Y - a^t X)\sum_{j=1}^{N}(a_j - b_j)X_j\Big] + E(a^t X - b^t X)^2 \\
&= E(Y - a^t X)^2 + 2\sum_{j=1}^{N}(a_j - b_j)\,E\big[(Y - a^t X)X_j\big] + E(a^t X - b^t X)^2 \\
&= E(Y - a^t X)^2 + E(a^t X - b^t X)^2 && \text{because } E[(Y - a^t X)X_j] = 0 \\
&\ge E(Y - a^t X)^2 && \text{because } E(a^t X - b^t X)^2 \ge 0 .
\end{aligned}
$$
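As a quick numerical sanity check (not part of the original notes), the following NumPy sketch estimates K and r from simulated zero-mean data, solves for a, and verifies that the resulting error is approximately orthogonal to every X_j and that perturbing the coefficients only increases the MSE. The data model, sample size, and perturbation are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Zero-mean observations X (N = 3) and a target Y with a linear dependence plus noise.
n_samples, N = 100_000, 3
X = rng.standard_normal((n_samples, N)) @ np.array([[1.0, 0.5, 0.2],
                                                    [0.0, 1.0, 0.3],
                                                    [0.0, 0.0, 1.0]])
Y = X @ np.array([0.8, -0.4, 0.25]) + 0.5 * rng.standard_normal(n_samples)

# Sample estimates of K = E[X X^t] and r = E[X Y].
K = X.T @ X / n_samples
r = X.T @ Y / n_samples

a = np.linalg.solve(K, r)          # optimal coefficients a = K^{-1} r
err = Y - X @ a                    # estimation error Y - a^t X

# Orthogonality: E[(Y - a^t X) X_j] should be ~0 for every j.
print("E[error * X_j]:", X.T @ err / n_samples)

# Any perturbation b != a gives a larger (sample) MSE.
b = a + np.array([0.1, 0.0, -0.05])
print("MSE(a):", np.mean(err**2), " MSE(b):", np.mean((Y - X @ b)**2))
```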
Since this holds for every choice of b, a must be optimal. In other words, if a satisfies the orthogonality criterion, it is optimal.

"Only if" part: Suppose b is optimal and a satisfies the orthogonality criterion. Then, as shown above,

$$E(Y - b^t X)^2 = E(Y - a^t X)^2 + E(a^t X - b^t X)^2 .$$

Since b is optimal, E(Y - b^t X)^2 <= E(Y - a^t X)^2. It follows from these two relations that E(a^t X - b^t X)^2 = 0, i.e. b^t X = a^t X with probability one (and b = a whenever K is nonsingular). Hence E[(Y - b^t X)X_j] = E[(Y - a^t X)X_j] = 0 for every j. That is, if b is optimal, it satisfies the orthogonality criterion.

Corollary: Any linear combination of the X_i's is orthogonal to the error of an optimal estimator. That is, if b_1,...,b_k are arbitrary numbers and a is the vector of coefficients of an optimal estimator of Y based on X, then

$$E\Big[(Y - a^t X)\sum_{j=1}^{k} b_j X_j\Big] = 0 .$$

Proof: With b_1,...,b_k arbitrary and a optimal,

$$E\Big[(Y - a^t X)\sum_{j=1}^{k} b_j X_j\Big] = \sum_{j=1}^{k} b_j\, E\big[(Y - a^t X)X_j\big] = 0$$

by the orthogonality principle.
USING THE ORTHOGONALITY PRINCIPLE TO FIND THE OPTIMAL ESTIMATOR

From the orthogonality principle we know that if a is the vector of coefficients of the optimal linear estimator, then

$$E\big[(Y - a^t X)X_j\big] = 0, \qquad j = 1,\dots,N .$$

This implies

$$E[Y X_j] = E[(a^t X)X_j] = \sum_{i=1}^{N} a_i\, E[X_i X_j], \qquad j = 1,\dots,N,$$

or, collecting these N equations in matrix form,

$$r = \begin{pmatrix} E[Y X_1] \\ \vdots \\ E[Y X_N] \end{pmatrix}
= \begin{pmatrix} E[X_1 X_1] & \cdots & E[X_N X_1] \\ \vdots & & \vdots \\ E[X_1 X_N] & \cdots & E[X_N X_N] \end{pmatrix}
\begin{pmatrix} a_1 \\ \vdots \\ a_N \end{pmatrix} = K\,a,$$

where K is the N x N covariance matrix of X and r = E[XY] = (E[Y X_1],...,E[Y X_N])^t.

If K is invertible, i.e. nonsingular, then

$$a = K^{-1} r$$

is the vector of optimal linear prediction coefficients.

If K is not invertible, i.e. singular, then it can be shown that at least one of the X_i's is a linear combination of the others. Such X_i's can be removed from X, because any estimator that used them could also be computed from the others. With these X_i's eliminated, the new covariance matrix K' is nonsingular and we can find the optimal a by inverting K'.

The MSE of the optimal estimator:

$$\text{MSE} = E Y^2 - r^t K^{-1} r .$$

Derivation:

$$
\begin{aligned}
\text{MSE} &= E(Y - a^t X)^2 = E\big[(Y - a^t X)Y\big] + E\big[(Y - a^t X)(-a^t X)\big] \\
&= E\big[(Y - a^t X)Y\big] \qquad \text{(the second term vanishes by the corollary to the orthogonality principle)} \\
&= E Y^2 - a^t E[XY] = E Y^2 - a^t r = E Y^2 - r^t K^{-1} r,
\end{aligned}
$$

where the last step uses r = Ka for the optimal estimator, so that a^t = (K^{-1} r)^t = r^t K^{-1} since K and K^{-1} are symmetric.
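A brief NumPy sketch of these formulas (my own illustration, not from the notes): it solves the normal equations, checks the closed-form MSE against an empirical estimate, and shows one common way of handling a singular K (a redundant observation) via least squares. All model parameters are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 200_000, 3

X = rng.standard_normal((n, N))
Y = X @ np.array([1.0, -0.5, 0.2]) + rng.standard_normal(n)

K = X.T @ X / n                      # sample covariance matrix of X (zero-mean)
r = X.T @ Y / n                      # sample cross-covariances E[X_j Y]

a = np.linalg.solve(K, r)            # a = K^{-1} r (K is nonsingular here)
mse_formula = np.mean(Y**2) - r @ a  # E Y^2 - r^t K^{-1} r
mse_empirical = np.mean((Y - X @ a) ** 2)
print(mse_formula, mse_empirical)    # should agree closely

# If K were singular (one X_i a linear combination of the others),
# a least-squares solve still gives a valid coefficient vector; the MSE is unchanged.
X_sing = np.hstack([X, X[:, :1] + X[:, 1:2]])     # redundant 4th observation
a_sing, *_ = np.linalg.lstsq(X_sing, Y, rcond=None)
print(np.mean((Y - X_sing @ a_sing) ** 2))
```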
OPTIMAL LINEAR PREDICTION

Let us now apply what we've learned to the task of designing the optimal predictor of X_{N+1} from X_1,...,X_N.

Assume that X is a zero-mean, wide-sense stationary random process with autocorrelation function R_X(k) = E[X_i X_{i+k}]. Linear estimation theory then shows that the Nth-order linear predictor of X_i based on X_{i-1},...,X_{i-N} that minimizes the mean squared prediction error has coefficients a = (a_1,...,a_N)^t given by

$$a = K^{-1} r,$$

where K is the N x N correlation/covariance matrix of X_1,...,X_N, i.e. K_{ij} = E[X_i X_j] = R_X(|i-j|), K^{-1} is its inverse, and r = (R_X(1),...,R_X(N))^t.

It can be shown that if K is not invertible, then the random process X is deterministic in the sense that there are coefficients b_1,...,b_{N-1} such that X_N = b_1 X_1 + ... + b_{N-1} X_{N-1}.

When K is invertible, the resulting minimum mean squared prediction error is

$$M_N = \sigma^2 - a^t r = \sigma^2 - r^t K^{-1} r$$

(the Nth-order prediction error, with σ^2 = R_X(0)), and the prediction gain is

$$G_N = 10 \log_{10} \frac{\sigma^2}{M_N}.$$

BEHAVIOR OF THE Nth-ORDER PREDICTION ERRORS

Clearly, M_N decreases with N to some limit, as illustrated below, and the prediction gains G_N increase with N to some limit.

[Figure: mean squared error of the best Nth-order linear predictor versus N, decreasing toward its limit.]

As mentioned in the transform coding section, it can also be shown that

$$M_N = \frac{|K^{(N+1)}|}{|K^{(N)}|},$$

where K^{(N)} and K^{(N+1)} denote, respectively, the N x N and (N+1) x (N+1) covariance matrices of X, and |K| denotes the determinant of the matrix K. This is derived in supplementary notes.
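To make the a = K^{-1} r recipe concrete, here is a small sketch (my own, not from the notes) that builds the Toeplitz matrix K from an assumed autocorrelation R(k) = ρ^{|k|}, solves for the predictor, and checks both M_N = σ^2 - r^t K^{-1} r and the determinant identity M_N = |K^{(N+1)}|/|K^{(N)}|. The choices ρ = 0.9 and σ_X^2 = 1 are illustrative, and SciPy's toeplitz helper is used for convenience.

```python
import numpy as np
from scipy.linalg import toeplitz

rho, sigma2 = 0.9, 1.0
R = lambda k: sigma2 * rho ** abs(k)            # assumed autocorrelation

def predictor(N):
    """Optimal Nth-order linear predictor coefficients and its MSE M_N."""
    K = toeplitz([R(k) for k in range(N)])       # K_ij = R(|i-j|)
    r = np.array([R(k) for k in range(1, N + 1)])  # r = (R(1),...,R(N))^t
    a = np.linalg.solve(K, r)
    M_N = sigma2 - r @ a                         # sigma^2 - r^t K^{-1} r
    return a, M_N

for N in (1, 2, 5):
    a, M_N = predictor(N)
    # Determinant identity: M_N = |K^(N+1)| / |K^(N)|
    KN  = toeplitz([R(k) for k in range(N)])
    KN1 = toeplitz([R(k) for k in range(N + 1)])
    print(N, M_N, np.linalg.det(KN1) / np.linalg.det(KN))
```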
EXAMPLE: FIRST-ORDER AUTOREGRESSIVE (AR) GAUSSIAN SOURCE

Assume X is stationary and

$$X_i = \rho\, X_{i-1} + Z_i \qquad (*)$$

where -1 < ρ < 1, the Z_i's are IID Gaussian with zero mean and variance σ_Z^2, and Z_i is independent of X_j for j < i. (ρ = .9 is a typical value.) Then

1) E X_i = 0

2) σ_X^2 = E X_i^2 = σ_Z^2 / (1 - ρ^2)   (= 5.26 σ_Z^2 if ρ = .9)

3) R_X(k) = autocorrelation function = E[X_i X_{i+k}] = σ_X^2 ρ^{|k|}, where ρ is the correlation coefficient:

$$\rho = \frac{\mathrm{cov}(X_i, X_{i+1})}{\sigma_X^2} = \frac{R_X(1)}{\sigma_X^2}$$

4) The best predictor of X_i based on X_{i-1}, X_{i-2},..., X_{i-N} is

$$\tilde{X}_i = \rho\, X_{i-1}, \qquad \text{with } M_N = \sigma_Z^2 .$$

Derivation: It is not easy to evaluate a = K^{-1} r directly. Instead, we may use the orthogonality principle to check that X̃_i is optimal. Recall that the orthogonality principle says X̃_i is optimal if and only if E[(X_i - X̃_i) X_{i-j}] = 0 for j = 1,...,N. We find

$$E\big[(X_i - \rho X_{i-1}) X_{i-j}\big] = E\big[(\rho X_{i-1} + Z_i - \rho X_{i-1}) X_{i-j}\big] = E[Z_i X_{i-j}] = E[Z_i]\, E[X_{i-j}] = 0,$$

since Z_i is independent of past X's. It follows from the orthogonality principle that X̃_i = ρ X_{i-1} is optimal. The mean squared prediction error is

$$M_N = E(X_i - \rho X_{i-1})^2 = E(\rho X_{i-1} + Z_i - \rho X_{i-1})^2 = E Z_i^2 = \sigma_Z^2 .$$

Therefore, for N >= 1, the prediction gain (which is the gain of DPCM over SQ) is

$$G_N = 10 \log_{10} \frac{\sigma_X^2}{M_N} = 10 \log_{10} \frac{1}{1-\rho^2} \qquad (= 7.2\ \text{dB if } \rho = .9).$$
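The following short simulation (my own check, with an arbitrary seed and sample size) generates an AR(1) source with ρ = 0.9 and confirms empirically that σ_X^2 ≈ 5.26 σ_Z^2, that the predictor ρX_{i-1} leaves an error of variance ≈ σ_Z^2, and that the prediction gain is ≈ 7.2 dB.

```python
import numpy as np

rng = np.random.default_rng(2)
rho, sigma_z, n = 0.9, 1.0, 500_000

# Simulate the AR(1) source X_i = rho*X_{i-1} + Z_i.
Z = sigma_z * rng.standard_normal(n)
X = np.empty(n)
X[0] = Z[0] / np.sqrt(1 - rho**2)     # start in the stationary distribution
for i in range(1, n):
    X[i] = rho * X[i - 1] + Z[i]

var_x = X.var()
pred_err = X[1:] - rho * X[:-1]       # error of the predictor X~_i = rho*X_{i-1}
M = np.mean(pred_err**2)              # should be close to sigma_z^2

gain_db = 10 * np.log10(var_x / M)
print(var_x, M, gain_db)              # ~5.26, ~1.0, ~7.2 dB
```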
[Figure: prediction gain G_N (dB) versus correlation coefficient ρ.]

TYPICAL PREDICTION GAINS FOR SPEECH

For each of several speakers, R_X(k) was empirically estimated and the prediction gains G_N were computed for various N's. The range of G_N values found is shown in the figure below (Jayant & Noll, p. 271). The speech was lowpass filtered, probably to 3 kHz.

[Figure: range of measured prediction gains G_N for speech versus predictor order N.]
TYPICAL PREDICTION GAINS FOR IMAGES

Prediction gains in interframe image coding, from Jayant and Noll, Fig. 6.8:

B — prediction from the pixel immediately to the left: X̃ = .965 X_B

C — prediction from the corresponding pixel in the previous frame: X̃ = .977 X_C

BC — prediction from both pixels: X̃ = .379 X_B + … X_C

BCD — prediction from the two aforementioned pixels plus the pixel to the left of the corresponding pixel in the previous frame: X̃ = .746 X_B + … X_C + … X_D

For larger values of N, the predictor is based on N pixels from the present and the previous frame.

ASYMPTOTIC LIMIT OF M_N

For a wide-sense stationary random process, it can be shown (see the Transform Coding lectures) that as N → ∞, M_N decreases to

$$Q = \lim_{N\to\infty} M_N = \text{"one-step prediction error"} = \exp\left\{ \frac{1}{2\pi} \int_{-\pi}^{\pi} \ln S_X(\omega)\, d\omega \right\},$$

where S_X(ω) is the power spectral density of X, i.e.

$$S_X(\omega) = \text{discrete-time Fourier transform of } R_X(k) = \sum_{k=-\infty}^{\infty} R_X(k)\, e^{-jk\omega} .$$

For an Nth-order AR process, Q = M_N = M_{N+1} = ....
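As an illustration (not from the notes), the spectral formula for Q can be evaluated numerically for the first-order AR example of the previous pages; since that source is a first-order AR process, the result should equal σ_Z^2 = M_1. The parameter values and grid size below are arbitrary.

```python
import numpy as np

# Evaluate Q = exp( (1/2pi) * integral of ln S_X(w) dw ) for the AR(1) example.
rho, sigma_z2 = 0.9, 1.0
w = np.linspace(-np.pi, np.pi, 200_001)

# Power spectral density of AR(1): S_X(w) = sigma_Z^2 / |1 - rho e^{-jw}|^2
S = sigma_z2 / np.abs(1 - rho * np.exp(-1j * w)) ** 2

# Average of ln S over [-pi, pi] approximates (1/2pi) * integral of ln S dw.
Q = np.exp(np.log(S).mean())
print(Q)   # ~1.0 = sigma_Z^2, as expected for a first-order AR process
```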
ASYMPTOTIC OPTA OF DPCM

In view of the fact that the asymptotic limit of M_N is Q, we have the following. When R is large and the pdf of Ũ is similar to a scaled version of that of X, e.g. when X is Gaussian, the asymptotic OPTA function of DPCM, optimized over N, is

$$\delta_{dpcm}(R) \cong \frac{1}{12}\, Q\, \alpha_X\, 2^{-2R} \cong \frac{Q}{\sigma^2}\, \delta_{sq}(R),$$

or equivalently, in dB,

$$S_{dpcm}(R) \cong S_{sq}(R) + G,$$

where G = 10 log_10(σ^2/Q) dB is the prediction gain. That is, the gain of DPCM over SQ approximately equals the prediction gain.

THE ORIGINS OF THE "DPCM" NAME

The letters "PCM" come from "Pulse Code Modulation", a modulation technique for transmitting analog sources such as speech. In PCM, an analog source, such as speech, is sampled; the samples are quantized; fixed-length binary codewords are produced; and each bit of each binary codeword determines which of two specific pulses is sent (e.g. one pulse might be the negative of the other, or the pulses might be sinusoids with different frequencies). Thus PCM is a modulation technique for transmitting analog sources that competes with AM, FM and various other forms of pulse modulation. Invented in the 1940's, PCM is now viewed as a cascade of three systems: a sampler, a quantizer and a digital modulator. However, the quantizer by itself is often referred to as PCM.

DPCM was originally proposed as an improved PCM-type modulator consisting of a sampler, a DPCM quantizer as we know it, and a digital modulator. Nowadays DPCM usually refers just to the quantization part of the system.

Predictive coding: DPCM is also considered to be a kind of predictive coding.
DELTA MODULATION

Delta modulation is the special case of DPCM in which the quantizer has just two levels, +Δ and -Δ. In its most basic form, the predictor produces, simply, X̃_i = X̂_{i-1}. Fancier predictors are also used. Delta modulation is used when the source is highly correlated, for example speech coding with a high sampling rate.

SPEECH CODING

Adaptive versions of DPCM and delta modulation have been standardized for use in the telephone industry for digitizing telephone speech. The speech is bandlimited to 3 kHz before sampling at 8 kHz. However, this kind of speech coding is no longer considered state-of-the-art.

EXAMPLES OF DPCM IMAGE CODING

Example: Prediction of the present pixel X_{i,j} from its neighbors Y_{i,j-1} (left), Y_{i-1,j} (above) and Y_{i-1,j-1} (above-left):

$$\tilde{X}_{i,j} = a_1 Y_{i,j-1} + a_2 Y_{i-1,j} + a_3 Y_{i-1,j-1}$$

A sketch of this neighborhood-based predictor appears after this slide.

[Figure: rate-distortion curves for DPCM image coding, from R. Jain, Fundamentals of Digital Image Processing, p. 493.]

All but the second curve from the top represent performance predicted with the predictor matched to the stated source. For the second curve from the top, the values of a_1, a_2 and a_3 were designed from the actual measured correlations for a test set of images.

DPCM has been extensively studied for image coding, but it is not often used today; transform coding is more common.
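A minimal sketch of the three-neighbor image predictor above (not from the notes). The coefficient values are placeholders, and for simplicity the prediction is "open loop" (from original pixels); a real DPCM coder predicts from decoded pixels Y.

```python
import numpy as np

def dpcm_2d_predict(img, a1=0.5, a2=0.25, a3=0.25):
    """Three-neighbor DPCM prediction residual for a 2-D image.

    Predicts each pixel from its left, upper, and upper-left neighbors:
        pred[i,j] = a1*img[i,j-1] + a2*img[i-1,j] + a3*img[i-1,j-1]
    Coefficients here are illustrative, not values from the notes.
    """
    img = img.astype(np.float64)
    pred = np.zeros_like(img)
    pred[1:, 1:] = a1 * img[1:, :-1] + a2 * img[:-1, 1:] + a3 * img[:-1, :-1]
    residual = img - pred
    return pred, residual

# Toy usage on a smooth synthetic image: the residual variance is much smaller
# than the image variance, which is where the prediction gain comes from.
x, y = np.meshgrid(np.arange(64), np.arange(64))
img = 128 + 50 * np.sin(x / 10.0) * np.cos(y / 13.0)
_, res = dpcm_2d_predict(img)
print(img.var(), res[1:, 1:].var())
```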
DPCM IN VIDEO CODING

Though not usually called DPCM, the most commonly used video coding methods use a form of DPCM, e.g. MPEG, H.26x, HDTV, satellite "Direct TV", DVD. Prediction is done on a frame-by-frame basis, with the prediction of a frame being the decoded reconstruction of the previous frame, or a "motion-compensated" version thereof. In the latter case, the coder has a forward-adaptive component in addition to the backward adaptation.

Example (Netravali & Haskell, Digital Pictures, p. 331): interframe predictors specified by coefficients for the left pixel of the current frame, the left pixel of the previous frame, and the same pixel of the previous frame, with the resulting MSPE and prediction gain.

  left pixel,    left pixel,    same pixel,
  curr. frame    prev. frame    prev. frame    MSPE(1)    Pred. gain (dB)
  1              -1/2           1/…            …          … dB
  3/4            -1/2           3/…            …          … dB
  7/8            -5/8           3/…            …          … dB

  (1) Based on an educated guess for σ^2.

COMPARISON OF DPCM AND TRANSFORM CODING

Consider a stationary, Gaussian source and large R. For k-dimensional transform coding,

$$\delta_{tr}(k,R) \cong \frac{1}{12}\, |K^{(k)}|^{1/k}\, \alpha_G\, 2^{-2R} .$$

For DPCM with kth-order linear prediction,

$$\delta_{dpcm}(k,R) \cong \frac{1}{12}\, M_k\, \alpha_G\, 2^{-2R},$$

where M_k is the MSPE of optimal kth-order linear prediction of X_i from X_{i-k},...,X_{i-1}.

Fact A: |K^{(k)}|^{1/k} >= M_k.

Proof:

$$|K^{(k)}|^{1/k} = \Big( \sigma_X^2 \prod_{i=1}^{k-1} M_i \Big)^{1/k} \ge M_k,$$

where the first equality was proved in the transform coding notes, and the inequality holds because every factor in the geometric mean (σ_X^2 and M_1,...,M_{k-1}) is >= M_k.

It follows from this fact that DPCM with kth-order prediction is at least as good as k-dimensional transform coding.
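Fact A can be checked numerically; the sketch below (my own, not from the notes) does so for the AR(1) autocorrelation R(k) = ρ^{|k|} with σ_X^2 = 1 and ρ = 0.9, which are illustrative choices.

```python
import numpy as np
from scipy.linalg import toeplitz

rho = 0.9
R = lambda k: rho ** abs(k)          # assumed autocorrelation, sigma_X^2 = 1

def M(order):
    """MSE of the optimal order-th order one-step linear predictor."""
    K = toeplitz([R(i) for i in range(order)])
    r = np.array([R(i) for i in range(1, order + 1)])
    return 1.0 - r @ np.linalg.solve(K, r)

for k in (1, 2, 4, 8):
    Kk = toeplitz([R(i) for i in range(k)])
    geo = np.linalg.det(Kk) ** (1.0 / k)      # |K^(k)|^{1/k}
    print(k, geo, M(k), geo >= M(k) - 1e-12)  # Fact A: geo >= M_k
```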
Fact B:

$$\lim_{k\to\infty} M_k = \lim_{k\to\infty} |K^{(k)}|^{1/k} = \exp\left\{ \frac{1}{2\pi} \int_{-\pi}^{\pi} \ln S(\omega)\, d\omega \right\} = Q = \text{"one-step prediction error"}.$$

Proof: This was proved in the transform coding notes.

It follows from this fact that for Gaussian sources, large k and large R, DPCM and transform coding have the same performance, i.e.

$$\delta_{dpcm}(\infty, R) = \delta_{tr}(\infty, R).$$

However, consider the example of a first-order AR Gaussian source. Then, since M_1 = lim_{k→∞} M_k = Q,

$$\delta_{dpcm}(1,R) \cong \delta_{tr}(\infty,R).$$

Thus, in this case a simple form of DPCM attains performance as good as high-dimensional transform coding.

In the Gaussian case, the SNR gain in dB of DPCM with first-order linear prediction over k-dimensional transform coding is

$$10 \log_{10} \frac{\delta_{tr}(k,R)}{\delta_{dpcm}(1,R)} = 10 \log_{10} \frac{(1-\rho^2)^{(k-1)/k}}{1-\rho^2} = -\frac{10}{k} \log_{10}(1-\rho^2),$$

which is plotted below.

[Figure: gain (dB) of DPCM over transform coding versus k = transform code dimension / order of the DPCM linear predictor.]

Similarly, for an Nth-order AR Gaussian source,

$$\delta_{dpcm}(N,R) \cong \delta_{tr}(\infty,R).$$
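The plotted quantity is easy to reproduce; the snippet below (my own, using ρ = 0.9 as in the running example) evaluates -(10/k) log10(1 - ρ^2) for a few transform dimensions k.

```python
import numpy as np

# Gain (dB) of first-order DPCM over k-dimensional transform coding
# for a first-order AR Gaussian source with rho = 0.9.
rho = 0.9
for k in (1, 2, 4, 8, 16):
    gain_db = -(10.0 / k) * np.log10(1 - rho**2)
    print(k, round(gain_db, 2))
# k=1 gives ~7.21 dB (DPCM vs scalar quantization); the gain shrinks toward 0
# as the transform dimension k grows.
```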
Given that DPCM is more efficient for the AR Gaussian source, why is transform coding more commonly used nowadays than DPCM, for example for image coding? My feeling is that it is because transform coding can more easily be made to take perceptual criteria into account. For example, with transform image coding it is easy to exploit the fact that the eye is more sensitive to errors at low spatial frequencies than at high spatial frequencies; JPEG does this by using coarser scalar quantizers for the high-frequency coefficients than for the low-frequency coefficients.