Lecture Notes 9: Course Summary

• Random Variables
• Mean and Covariance Matrix
• Gaussians
• Vector Detection and Estimation
• Convergence and Limit Theorems
• Random Processes
• Random Processes in Linear Systems

Random Variables

• Functions of random variables
  Review: HW1 #6, Sample midterm #2
• Generation of random variables: if X ~ U[0,1] and F is a cdf, then Y = F^{-1}(X) ~ F(y) (see the sketch at the end of this section)
  Review: HW1 #9
• Conditional expectation:
  - conditional expectation is a random variable: E(g(X,Y) | Y)
  - iterated expectation: E(g(X,Y)) = E_Y( E_X(g(X,Y) | Y) )
  - E(X | Y) is the best MSE estimate of X given Y; its MSE = E(Var(X | Y))
  Review: HW2 #E10, #E11
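As a quick illustration of the inverse-CDF generation method, here is a minimal Python sketch, assuming an exponential target distribution (the choice of distribution, the rate lam, and all variable names are ours, not from the notes):

```python
import numpy as np

# Inverse transform sampling: if X ~ Unif[0,1] and F is a CDF,
# then Y = F^{-1}(X) has CDF F. Illustrative example with the
# exponential CDF F(y) = 1 - exp(-lam*y), so F^{-1}(u) = -ln(1-u)/lam.
rng = np.random.default_rng(0)
lam = 2.0
u = rng.uniform(size=100_000)
y = -np.log(1.0 - u) / lam          # Y ~ Exp(lam)

print(y.mean())                     # should be close to 1/lam = 0.5
```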
Bounds:

• Union of events: P(∪_{i=1}^n A_i) ≤ Σ_{i=1}^n P(A_i)
• Schwarz: (E(XY))² ≤ E(X²) E(Y²); equality iff X = aY
• Jensen: if g(x) is convex, then E(g(X)) ≥ g(E(X))
• Markov: if X ≥ 0 and a > 1, then P{X ≥ a E(X)} ≤ 1/a
• Chebyshev: P{|X − E(X)| ≥ a σ_X} ≤ 1/a²

Review: HW2 #8, Sample midterm #1, Midterm #1

Mean and Covariance Matrix

Consider a random vector X:
• mean vector: E(X) = µ_X
• correlation matrix: E(XX^T)
• covariance matrix: Σ_X = E( (X − µ_X)(X − µ_X)^T ) = E(XX^T) − E(X) E(X)^T
• A matrix can be a covariance (correlation) matrix iff it is real, symmetric, and nonnegative definite
• cross-covariance matrix: Σ_XY = E(XY^T) − E(X) E(Y^T)
• Coloring and whitening: Σ has a square root; Cholesky decomposition (see the sketch at the end of this section)

Review: HW4 #1, #E5
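The coloring/whitening point lends itself to a short numerical check. A minimal Python sketch, assuming an arbitrary 2 × 2 positive definite Σ chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])        # illustrative positive definite covariance

# Coloring: if Z is white (zero mean, identity covariance), then
# X = A Z has Cov(X) = A A^T = Sigma.
A = np.linalg.cholesky(Sigma)         # lower-triangular square root of Sigma
Z = rng.standard_normal((2, 100_000))
X = A @ Z                             # colored samples

# Whitening: invert the transform; W = A^{-1} X has identity covariance.
W = np.linalg.solve(A, X)

print(np.cov(X))                      # approximately Sigma
print(np.cov(W))                      # approximately the identity
```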
Gaussians

• Gaussian random variable: X ~ N(µ, σ²) has pdf
  f_X(x) = (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)}
• Gaussian random vector: X ~ N(µ, Σ) has joint pdf
  f_X(x) = (1/((2π)^{n/2} |Σ|^{1/2})) e^{−(1/2)(x−µ)^T Σ^{−1} (x−µ)},
  where |Σ| = det Σ
• Properties of Gaussian random vectors:
  1. uncorrelated implies independent
  2. a linear transformation of a GRV yields a GRV: if A is an m × n full-rank matrix with m ≤ n, then Y = AX ~ N(Aµ_X, AΣ_X A^T)
  3. marginals of a GRV are Gaussian
  4. conditionals of a GRV are Gaussian: if (Y, X) ~ N(0, [[Σ_Y, Σ_YX], [Σ_XY, Σ_X]]), then
     X | {Y = y} ~ N(Σ_XY Σ_Y^{−1} y, Σ_X − Σ_XY Σ_Y^{−1} Σ_YX)
• The best MSE estimate of a GRV given a GRV is linear (see the sketch at the end of this section)

Review: HW3 #7, HW4 #E2, Sample midterm #5
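Property 4 is easy to evaluate numerically. A minimal Python sketch of the conditional mean and covariance formulas, with made-up block covariances (all values below are illustrative assumptions):

```python
import numpy as np

# Jointly Gaussian (X, Y), zero mean, with block covariance
# [[Sigma_X, Sigma_XY], [Sigma_YX, Sigma_Y]]; the numbers are made up.
Sigma_X  = np.array([[1.0]])
Sigma_Y  = np.array([[2.0, 0.3],
                     [0.3, 1.0]])
Sigma_XY = np.array([[0.5, 0.2]])     # Cov(X, Y)
Sigma_YX = Sigma_XY.T

y = np.array([1.0, -0.5])             # observed value of Y

# X | {Y = y} ~ N(Sigma_XY Sigma_Y^{-1} y,
#                 Sigma_X - Sigma_XY Sigma_Y^{-1} Sigma_YX)
K = Sigma_XY @ np.linalg.inv(Sigma_Y)
cond_mean = K @ y                      # linear in y, as the notes state
cond_cov  = Sigma_X - K @ Sigma_YX     # does not depend on y

print(cond_mean, cond_cov)
```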
Vector Detection and Estimation

• Signal detection: MAP, ML, and minimum distance decoding rules
  Review: HW2 #2, HW3 #1, HW4 #7
• Linear estimation: the signal X and observations Y_1, ..., Y_n are zero mean; by orthogonality, the error vector X − X̂ is orthogonal to the subspace spanned by Y_1, Y_2, ..., Y_n
  [Figure: the signal X, its estimate X̂ in the subspace spanned by Y_1, ..., Y_n, and the orthogonal error vector X − X̂]
• The estimate and its MSE are
  X̂ = Σ_YX^T Σ_Y^{−1} Y
  MSE = Var(X) − Σ_YX^T Σ_Y^{−1} Σ_YX
  Review: HW3 #4, HW5 #2, Sample midterm #3
• Innovation sequence: given observations Y_1, Y_2, ..., Y_n, the innovation sequence is
  Ỹ_{i+1}(Y^i) = Y_{i+1} − Ŷ_{i+1}(Y^i)
• Prediction using the innovation sequence:
  X̂(Y^n) = Σ_{i=1}^n X̂(Ỹ_i) = Σ_{i=1}^n (Cov(X, Ỹ_i) / Var(Ỹ_i)) Ỹ_i
  MSE = Var(X) − Σ_{i=1}^n Cov²(X, Ỹ_i) / Var(Ỹ_i)
• Kalman filter: state space model; estimate the current/future state recursively as observations become available (see the sketch at the end of this section)
  Review: HW5 #3, #4, Sample final #2
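To make the Kalman filter bullet concrete, here is a minimal scalar sketch under an assumed state-space model; the parameters a, q, r and the model itself are illustrative choices, not taken from the notes:

```python
import numpy as np

# Assumed scalar model:
#   state:       X_{i+1} = a X_i + W_i,  W_i ~ N(0, q)
#   observation: Y_i     = X_i + V_i,    V_i ~ N(0, r)
a, q, r = 0.9, 0.1, 0.5
rng = np.random.default_rng(2)

n = 200
x = np.zeros(n); y = np.zeros(n)
for i in range(1, n):
    x[i] = a * x[i-1] + rng.normal(scale=np.sqrt(q))
    y[i] = x[i] + rng.normal(scale=np.sqrt(r))

xhat, p = 0.0, 1.0                     # initial estimate and its MSE
for i in range(1, n):
    xhat_pred = a * xhat               # predict the next state
    p_pred = a*a*p + q                 # predicted MSE
    k = p_pred / (p_pred + r)          # gain applied to the innovation
    xhat = xhat_pred + k * (y[i] - xhat_pred)   # update with Y_i - Yhat_i
    p = (1 - k) * p_pred               # updated MSE

print(xhat, p)
```

Note how the update step uses exactly the innovation Y_i − Ŷ_i from the bullet above: each new observation contributes only the part not predictable from the past.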
Convergence and Limit Theorems

• with probability 1 (w.p.1): P{ω : lim_{n→∞} X_n(ω) = X(ω)} = 1
• in mean square (m.s.): lim_{n→∞} E( (X_n − X)² ) = 0
• in probability: lim_{n→∞} P{|X_n − X| > ε} = 0 for every ε > 0
  - Weak law of large numbers: if X_1, X_2, X_3, ... are i.i.d. r.v.s with finite mean and variance, then the sample mean S_n = (1/n) Σ_{i=1}^n X_i → E(X) in probability
• in distribution: lim_{n→∞} F_{X_n}(x) = F_X(x)
  - Central limit theorem: if X_1, X_2, X_3, ... are i.i.d. r.v.s with finite mean E(X) and variance σ_X², then
    (1/(σ_X √n)) Σ_{i=1}^n (X_i − E(X)) → N(0, 1) in distribution
    (see the sketch at the end of this section)
  - The CLT also holds for an i.i.d. sequence of random vectors X_1, X_2, ..., X_n with finite mean µ and nonsingular covariance matrix Σ:
    (1/√n) Σ_{i=1}^n (X_i − µ) → N(0, Σ) in distribution
• Relationships: convergence w.p.1 and convergence in mean square each imply convergence in probability, which in turn implies convergence in distribution

Review: HW5 #8, #E4, HW6 #E2
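A quick empirical illustration of the CLT, using i.i.d. Unif[0,1] samples (the distribution, n, and the number of trials are our choices; any finite-variance i.i.d. sequence would do):

```python
import numpy as np

# Sums of i.i.d. Unif[0,1] (mean 1/2, variance 1/12), standardized,
# should look approximately N(0,1) for large n.
rng = np.random.default_rng(3)
n, trials = 1000, 50_000
X = rng.uniform(size=(trials, n))
Z = (X - 0.5).sum(axis=1) / (np.sqrt(1/12) * np.sqrt(n))

print(Z.mean(), Z.std())               # approximately 0 and 1
print(np.mean(Z <= 1.0))               # approximately Phi(1) = 0.841
```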
Random Processes

• IID: Bernoulli, discrete-time WGN
  Review: HW6 #6, #E6
• Markov: random walk, Poisson, Gauss-Markov
• Independent increment: random walk, Poisson
  Review: HW6 #E4
• Poisson process: N(t) such that N(0) = 0, N(t) has independent increments, and N(t_2) − N(t_1) ~ Poisson(λ(t_2 − t_1)) for all t_2 > t_1 ≥ 0 (a simulation sketch follows this section)
  - Merging, branching, infinite divisibility
  - Arrival time process, interarrival time process, random telegraph process
  Review: HW6 #7, HW7 #6
• Gaussian: all finite-order pdfs (joint pdfs of any finite set of samples) are Gaussian
  - Examples: discrete-time WGN, discrete-time Wiener process, Gauss-Markov, bandlimited WGN
• WSS and SSS processes
  Review: HW6 #5, HW7 #1
  - mean function: E(X(t))
  - autocorrelation function: R_X(t_1, t_2) = E(X(t_1) X(t_2))
    Review: HW7 #3
  - crosscorrelation function: R_XY(t_1, t_2) = E(X(t_1) Y(t_2))
    Review: HW8 #4
  - SSS: all n-th order distributions are time invariant, e.g., periodic signal with random phase, IID, WSS Gaussian
  - WSS: the mean and autocorrelation functions are time invariant, i.e., E(X(t)) is constant and R_X(t_1, t_2) = R_X(t_1 − t_2) = R_X(τ)
  - R(τ) is an autocorrelation function iff it is real, even, and nonnegative definite (equivalent to S_X(f) = F[R(τ)] ≥ 0 for all f)
  - |R_X(τ)| ≤ R_X(0) = E(X²(t)) (the average power)
  - If R_X(T) = R_X(0) for some T ≠ 0, then X(t) and R_X(τ) are periodic with period T
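The Poisson process bullet can be illustrated by simulation. A minimal sketch, using the fact that the interarrival times are i.i.d. Exp(λ); the rate and time horizon below are arbitrary choices:

```python
import numpy as np

# Simulate a Poisson process of rate lam by summing i.i.d. Exp(lam)
# interarrival times; N(t) counts the arrivals in [0, t].
rng = np.random.default_rng(4)
lam, t_max = 3.0, 1000.0

interarrivals = rng.exponential(1.0 / lam, size=int(2 * lam * t_max))
arrival_times = np.cumsum(interarrivals)
arrival_times = arrival_times[arrival_times <= t_max]

# Check the increment distribution: N(t2) - N(t1) has mean lam*(t2 - t1).
t1, t2 = 100.0, 200.0
count = np.sum((arrival_times > t1) & (arrival_times <= t2))
print(count, lam * (t2 - t1))          # count should be close to 300
```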
• Power spectral density: S_X(f) = F[R_X(τ)]; it is real, even, and ≥ 0
  - Discrete-time white noise: WSS with zero mean and S_X(f) = N for |f| < 1/2
  - Continuous-time bandlimited white noise: zero mean and S_X(f) = N/2 for |f| < B
  - White noise: WSS with zero mean and S_X(f) = N/2 for all f
  Review: HW7 #4, Examples in Lecture notes 7-15 to 7-17
• Continuity of random processes
• Mean ergodic random processes: a WSS process X(t) is mean ergodic if
  lim_{t→∞} (2/t²) ∫_0^t (t − τ) R_X(τ) dτ = (E(X(t)))²
  Review: HW7 extra #5

Random Processes in Linear Systems

• X(t) is a random process input to a linear system with impulse response h(t, τ), producing output Y(t)
• Finding the mean and autocorrelation functions of Y(t)
• If X(t), −∞ < t < ∞, is a WSS process input to an LTI system, then X(t) and Y(t) are jointly WSS with
  S_YX(f) = H(f) S_X(f) and S_Y(f) = H(−f) S_YX(f) = |H(f)|² S_X(f)
  (a numerical check follows this section)
  Review: HW8 #4, #5, #6
• Examples and applications: kT/C noise; ARMA processes; the sampling theorem
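The relation S_Y(f) = |H(f)|² S_X(f) can be checked numerically. A minimal sketch, assuming a short FIR filter of our own choosing and estimating the output PSD with an averaged periodogram (edge effects make the match only approximate):

```python
import numpy as np

# Pass discrete-time white noise (S_X(f) = N0) through an FIR filter and
# compare an averaged periodogram of the output against |H(f)|^2 * N0.
rng = np.random.default_rng(5)
N0 = 1.0                                 # white-noise PSD level
h = np.array([1.0, 0.5, 0.25])           # assumed FIR impulse response

L, blocks = 256, 400
S_est = np.zeros(L)
for _ in range(blocks):
    x = rng.normal(scale=np.sqrt(N0), size=L)
    y = np.convolve(x, h, mode="same")
    S_est += np.abs(np.fft.fft(y))**2 / L    # periodogram of this block
S_est /= blocks

H = np.fft.fft(h, L)                     # frequency response on the FFT grid
S_theory = np.abs(H)**2 * N0
print(S_est[:4])
print(S_theory[:4])                      # the two should roughly agree
```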
Infinite Smoothing

• The signal X(t) and the observation Y(τ), −∞ < τ < ∞, are zero mean and jointly WSS
• The best linear MSE estimate is of the form
  X̂(t) = h(t) * Y(t) = ∫_{−∞}^{∞} Y(τ) h(t − τ) dτ,
  where the transfer function and the MSE of the best linear MSE estimate are
  H(f) = S_XY(f) / S_Y(f)
  MSE = ∫_{−∞}^{∞} S_X(f) df − ∫_{−∞}^{∞} (|S_XY(f)|² / S_Y(f)) df = E( (X(t))² ) − ( R_XY(τ) * h(−τ) )|_{τ=0}
  (a worked special case follows this section)
  Review: HW8 #6, Sample Final #12

Applications

• Digital communications: channel models, detection
• Signal and image processing: coloring and whitening; estimation; Kalman, infinite smoothing, and Wiener filters
• Control: Kalman filter
• Circuits and devices: noise models, noise analysis (stationary versus nonstationary)
• Communication networks: Poisson and associated processes
• Monte Carlo simulation: generation of r.v.s; coloring and whitening
• Machine learning
• Many other applications in biology, energy, finance, ...
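A standard special case makes the smoothing formulas concrete. Assume, purely for illustration, that the observation is the signal in additive independent noise: Y(t) = X(t) + Z(t), with X and Z independent, zero mean, and jointly WSS. Then S_XY(f) = S_X(f) and S_Y(f) = S_X(f) + S_Z(f), so the smoother and its MSE reduce to

  H(f) = S_X(f) / (S_X(f) + S_Z(f))
  MSE = ∫_{−∞}^{∞} S_X(f) S_Z(f) / (S_X(f) + S_Z(f)) df

The filter passes frequencies where the signal PSD dominates the noise PSD and suppresses those where the noise dominates.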
Where do you go from here?

EE courses that use EE 278B:
• Communications, information theory, and coding: EE 276, 279, 359, 374, 376A,B, 379, 476
• Signal and image processing: EE 378A,B, 355, 363, 368, 372, 398A,B

More on probability, random processes, graphical models, and machine learning:
• Stat 217, 218
• Stat 310A,B,C, 317
• Stat 315A,B, 375
• CS 228, 229
• MS&E 321, 322, 351