LECTURE 4

Last time:
- Types of convergence
- Weak Law of Large Numbers
- Strong Law of Large Numbers
- Asymptotic Equipartition Property

Lecture outline:
- Stochastic processes
- Markov chains
- Entropy rate
- Random walks on graphs
- Hidden Markov models

Reading: Chapter 4.
Stochastic processes

A stochastic process is an indexed sequence of r.v.s X_0, X_1, ... characterized by the joint PMF

  p_{X_0, X_1, ..., X_n}(x_0, x_1, ..., x_n),  (x_0, x_1, ..., x_n) ∈ 𝒳^{n+1}

for n = 0, 1, ....

A stochastic process is stationary if

  p_{X_0, X_1, ..., X_n}(x_0, x_1, ..., x_n) = p_{X_l, X_{l+1}, ..., X_{l+n}}(x_0, x_1, ..., x_n)

for every shift l and all (x_0, x_1, ..., x_n) ∈ 𝒳^{n+1}.
Stochastic processes

A discrete stochastic process is a Markov chain if

  P(X_n = x_n | X_0 = x_0, ..., X_{n-1} = x_{n-1}) = P(X_n = x_n | X_{n-1} = x_{n-1})

for n = 1, 2, ... and all (x_0, x_1, ..., x_n) ∈ 𝒳^{n+1}.

We deal with time-invariant Markov chains:
- X_n, the state after n transitions, belongs to a finite set, e.g., {1, ..., m}
- X_0 is either given or random
- given the current state, the past does not matter:

  p_{i,j} = P(X_{n+1} = j | X_n = i) = P(X_{n+1} = j | X_n = i, X_{n-1}, ..., X_0)

- the Markov chain is characterized by the probability transition matrix P = [p_{i,j}]
Review of Markov chains

State occupancy probabilities, given initial state i:

  r_{i,j}(n) = P(X_n = j | X_0 = i)

Key recursion:

  r_{i,j}(n) = Σ_{k=1}^{m} r_{i,k}(n-1) p_{k,j}

With random initial state:

  P(X_n = j) = Σ_{i=1}^{m} P(X_0 = i) r_{i,j}(n)

Does r_{i,j}(n) converge to something? Does the limit depend on the initial state?
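The key recursion above is just repeated multiplication by the transition matrix, so r_{i,j}(n) is the (i, j) entry of P^n. A minimal numerical sketch (the 3-state matrix P below is a made-up example, not from the lecture):

```python
import numpy as np

# Hypothetical 3-state transition matrix P = [p_ij]; each row sums to 1.
P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.2, 0.2, 0.6],
])

def occupancy(P, i, n):
    """r_{i,j}(n) = P(X_n = j | X_0 = i), computed via the key recursion
    r_{i,j}(n) = sum_k r_{i,k}(n-1) * p_{k,j}."""
    r = np.zeros(P.shape[0])
    r[i] = 1.0                 # r_{i,j}(0) = 1 if j = i, else 0
    for _ in range(n):
        r = r @ P              # one application of the recursion
    return r
```

Running the recursion for large n shows the convergence question being asked: here `occupancy(P, 0, 200)` and `occupancy(P, 1, 200)` agree to machine precision, hinting that the limit does not depend on the initial state.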
Review of Markov chains

Recurrent and transient states:
- State i is recurrent if, starting from i and from wherever you can go, there is a way of returning to i. If not recurrent, it is called transient.
- A recurrent class is a collection of recurrent states that communicate with each other and with no other state.
- A recurrent state i is periodic if there is an integer d > 1 such that r_{i,i}(k) = 0 whenever k is not an integer multiple of d.

Assume a single class of recurrent states, aperiodic. Then

  lim_{n→∞} r_{i,j}(n) = π_j,

where π_j does not depend on the initial conditions:

  lim_{n→∞} P(X_n = j | X_0) = π_j
π_1, ..., π_m can be found as the unique solution of the balance equations

  π_j = Σ_k π_k p_{k,j}

together with

  Σ_j π_j = 1
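The balance equations are linear, so π can be computed directly: π = πP with Σ_j π_j = 1 is the system π(I − P) = 0 plus one normalization row. A sketch using the same hypothetical 3-state chain as above:

```python
import numpy as np

# Hypothetical 3-state chain; any single-recurrent-class aperiodic P works.
P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.2, 0.2, 0.6],
])

# Balance equations: pi_j = sum_k pi_k p_{k,j}  <=>  pi (I - P) = 0,
# together with sum_j pi_j = 1.  Stack the normalization row onto the
# transposed homogeneous system and solve by least squares.
m = P.shape[0]
A = np.vstack([np.eye(m) - P.T, np.ones(m)])
b = np.zeros(m + 1)
b[-1] = 1.0
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
```

The resulting π agrees with the limit of the n-step occupancy probabilities r_{i,j}(n) for any starting state i.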
Entropy rate

The entropy rate of a stochastic process is, if it exists,

  lim_{n→∞} (1/n) H(X_1, ..., X_n)

For a stationary stochastic process, the entropy rate exists and is equal to

  lim_{n→∞} H(X_n | X_{n-1}, ..., X_1)

since conditioning decreases entropy and, by stationarity,

  H(X_{n+1} | X_n, ..., X_1) ≤ H(X_{n+1} | X_n, ..., X_2) = H(X_n | X_{n-1}, ..., X_1),

so H(X_n | X_{n-1}, ..., X_1) reaches a limit (a decreasing non-negative sequence converges).

Chain rule:

  (1/n) H(X_1, ..., X_n) = (1/n) Σ_{i=1}^{n} H(X_i | X_{i-1}, ..., X_1)

Since the elements in the sum on the RHS reach a limit, their running average (the LHS) converges to that same limit.
Entropy rate

Markov chain entropy rate:

  lim_{n→∞} H(X_n | X_{n-1}, ..., X_1) = lim_{n→∞} H(X_n | X_{n-1}) = H(X_2 | X_1) = -Σ_{i,j} π_i p_{i,j} log p_{i,j}

(The first equality uses the Markov property; the last two use stationarity, with the chain started from its stationary distribution π.)
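A small numerical sketch of this formula, using a hypothetical two-state chain (flip probabilities a and b are made-up values, not from the lecture):

```python
import numpy as np

# Hypothetical two-state chain: state 1 -> 2 w.p. a, state 2 -> 1 w.p. b.
a, b = 0.2, 0.5
P = np.array([[1 - a, a],
              [b, 1 - b]])

# Stationary distribution of this two-state chain: pi = (b, a) / (a + b).
pi = np.array([b, a]) / (a + b)

# Entropy rate H(X_2 | X_1) = -sum_{i,j} pi_i p_{i,j} log p_{i,j}  (in bits)
rate = -sum(pi[i] * P[i, j] * np.log2(P[i, j])
            for i in range(2) for j in range(2) if P[i, j] > 0)
```

For this two-state chain the formula reduces to a π-weighted average of binary entropies, (b·H_b(a) + a·H_b(b)) / (a + b), which the computed `rate` matches.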
Random walk on graph

Consider an undirected graph G = (N, E, W), where N, E, W are the nodes, edges, and weights. With each edge there is an associated edge weight W_{i,j}, with

  W_{i,j} = W_{j,i}
  W_i = Σ_j W_{i,j}
  W = Σ_{i,j: j>i} W_{i,j}
  2W = Σ_i W_i
Random walk on graph

We call a random walk the Markov chain in which the states are the nodes of the graph and

  p_{i,j} = W_{i,j} / W_i,    π_i = W_i / (2W)

Check: Σ_i π_i = 1 and

  Σ_i π_i p_{i,j} = Σ_i (W_i / 2W)(W_{i,j} / W_i) = Σ_i W_{i,j} / 2W = W_j / 2W = π_j
Random walk on graph

  H(X_2 | X_1) = -Σ_i π_i Σ_j p_{i,j} log p_{i,j}
             = -Σ_i (W_i / 2W) Σ_j (W_{i,j} / W_i) log(W_{i,j} / W_i)
             = -Σ_{i,j} (W_{i,j} / 2W) log(W_{i,j} / W_i)
             = -Σ_{i,j} (W_{i,j} / 2W) log(W_{i,j} / 2W) + Σ_{i,j} (W_{i,j} / 2W) log(W_i / 2W)
             = H(..., W_{i,j}/2W, ...) - H(..., W_i/2W, ...)

The entropy rate is the difference of two entropies: the entropy of the edge distribution {W_{i,j}/2W} minus the entropy of the node distribution {W_i/2W}.

Note: time reversibility holds for a Markov chain that can be represented as a random walk on an undirected weighted graph.
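The stationarity check and the difference-of-entropies identity can both be verified numerically. A sketch on a hypothetical 4-node weighted graph (the weight matrix W below is a made-up example):

```python
import numpy as np

# Hypothetical symmetric weight matrix of a 4-node undirected graph
# (W[i, j] = W[j, i]; zero means no edge).
W = np.array([
    [0, 2, 1, 0],
    [2, 0, 3, 1],
    [1, 3, 0, 2],
    [0, 1, 2, 0],
], dtype=float)

Wi = W.sum(axis=1)        # W_i = sum_j W_{i,j}
twoW = Wi.sum()           # 2W = sum_i W_i
P = W / Wi[:, None]       # random-walk transitions p_{i,j} = W_{i,j} / W_i
pi = Wi / twoW            # claimed stationary distribution pi_i = W_i / 2W

def H(p):
    """Entropy in bits of a distribution given as an array; ignores zeros."""
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

# Entropy rate as a difference of two entropies:
# H(edge distribution W_{i,j}/2W) - H(node distribution W_i/2W)
rate = H((W / twoW).ravel()) - H(pi)
```

Evaluating -Σ_{i,j} π_i p_{i,j} log p_{i,j} directly on this graph gives the same number as `rate`, and π P = π confirms that π_i = W_i/2W is stationary.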
Hidden Markov models

Consider an ALOHA wireless model:
- M users share the same radio channel to transmit packets to a base station.
- During each time slot, a packet arrives to a user's queue with probability p, independently of the other M − 1 users.
- At the beginning of each time slot, if a user has at least one packet in its queue, it transmits a packet with probability q, independently of all other users.
- If two packets collide at the receiver, they are not successfully transmitted and remain in their respective queues.
Hidden Markov models

Let X_i = (N_i[1], N_i[2], ..., N_i[M]) denote the random vector at time i, where N_i[m] is the number of packets in user m's queue at time i. X_i is a Markov chain.

Consider the random vector Y_i = (Z_i[1], Z_i[2], ..., Z_i[M]), where Z_i[m] = 1 if user m transmits during time slot i and Z_i[m] = 0 otherwise.

Is Y_i Markov?
Hidden Markov processes

Let X_1, X_2, ... be a stationary Markov chain and let Y_i = φ(X_i) be a process each term of which is a function of the corresponding state in the Markov chain. Y_1, Y_2, ... form a hidden Markov chain, which is not always a Markov chain but is still stationary. What is its entropy rate?
Hidden Markov processes

We suspect that the effect of the initial information should decay:

  H(Y_n | Y_{n-1}, ..., Y_1) − H(Y_n | Y_{n-1}, ..., Y_1, X_1) = I(X_1; Y_n | Y_{n-1}, ..., Y_1)

should go to 0. Indeed,

  lim_{n→∞} I(X_1; Y_1, ..., Y_n) = lim_{n→∞} Σ_{i=1}^{n} I(X_1; Y_i | Y_{i-1}, ..., Y_1) = Σ_{i=1}^{∞} I(X_1; Y_i | Y_{i-1}, ..., Y_1)

Since we have an infinite sum whose terms are non-negative and which is upper bounded by H(X_1), the terms must tend to 0.
MIT OpenCourseWare http://ocw.mit.edu 6.441 Information Theory Spring 2010 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.