page 5 SOR() Chapter Markov Chains. Introduction and Definitions Consider a sequence of consecutive times ( or trials or stages): n =,,,... Suppose that at each time a probabilistic experiment is performed, the outcome of which determines the state of the system at that time. For convenience, denote (or label) the states of the system by {,,,...}, a discrete (finite or countably infinite) state space: at each time these states are mutually exclusive and exhaustive. Denote the event the system is in state k at time n by X n = k. For a given value of n, X n is a random variable and has a probability distribution {P(X n = k) : k =,,,...} termed the absolute probability distribution at time n. We refer to the sequence {X n } as a discrete-time stochastic process. Example Consider a sequence of independent Bernoulli trials, each with the same probability of success. Let X n be the number of successes obtained after n trials (n =,,...). We might find Trial n 3 5 6 7 8... Outcome S S S F S S F F... X n 3 3 5 5 5... (Here we ignore n = or define X = ). This is a particular realization of the stochastic process {X n }. We know that P(X n = k) is binomial. The distinguishing features of a stochastic process {X n } are the relationships between X, X, X,... Suppose we have observed the system at times,,..., n and we wish to make a probability statement about the state of the system at time n. known State i i i i n? Time n n The most general model has the state of the system at time n being dependent on the entire past history of the system. We would then be interested in conditional probabilities P(X n = i n X = i, X = i,..., X n = i n ).
page 5 SOR() The simplest model assumes that the states at different times are independent events, so that P(X n = i n X = i, X = i,..., X n = i n ) = P(X n = i n ), n =,,,... : i =,,,..., etc. The next simplest (Markov) model introduces the simplest form of dependence. Definition A sequence of r.v.s X, X, X,... is said to be a (discrete) Markov chain if P(X n = j X = i, X = i,..., X n = i) = P(X n = j X n = i) for all possible n, i, j, i,..., i n. (.) Hence in a Markov chain we do not require knowledge of what happened at times,,..., (n ) but only what happened at time (n ) in order to make a conditional probability statement about the state of the system at time n. We describe the system as making a transition from state i to state j at time n if X n = i and X n = j with (one-step) transition probability P(X n = j X n = i). We shall only consider systems for which the transition probabilities are independent of time, so that P(X n = j X n = i) = p ij for n and all i, j. (.) This is the stationarity assumption and {X n : n =,,...} is then termed a (time)-homogeneous Markov chain. The transition probabilities {p ij } form the transition probability matrix P : The {p ij } have the properties p p p... p p p... P = p p p............................ and all j p ij, all i, j p ij =, all i (.3) (note that i i transitions are possible). Such a matrix is termed a stochastic matrix. It is often convenient to show P on a state (or transition) diagram: each vertex (or node) in the diagram corresponds to a state and each arrow to a non-zero p ij. For example,..3.5 P =..6.5.. is represented by.5...3.6...
page 5 SOR() Knowledge of P and the initial distribution {P(X = k), k =,,,...} enables us, at least in theory, to calculate all probabilities of interest, e.g. absolute probabilities at time n, P(X n = k), k =,,,... and conditional probabilities (or m-step transition probabilities) P(X m+n = j X n = i), m. Notes: (i) Some systems may be more appropriately defined for n =,,... and/or a state space which is a subset of {,,,...}: the above discussion is readily modified to take account of such variations. (ii) If the system is known to be in state l at time, the initial distribution is P(X = l) = P(X = k) =, for k l.. Some simple examples Example. An occupancy problem. Balls are distributed, one after the other, at random among cells. Let X n be the number of empty cells remaining after the nth ball has been distributed. The stages are : n =,, 3,... The state space is: {,,, 3}. The possible transitions are: Transition Conditional X n X n Probability 3 3 3 3 Other transitions are impossible (i.e. have zero probabilities).
page 53 SOR() The probability distribution over states at stage n can be found from a knowledge of the state at stage (n ), the information from earlier stages not being required. Hence {X n } is a Markov chain. Furthermore, the transition probabilities are not functions of n, so the chain is homogeneous. The transition probability matrix is and the transition diagram is P = 3 3 Example. 3 Random walk with absorbing barriers. A random walk is the path traced out by the motion of a particle which takes repeatedly a step of one unit in some direction, the direction being randomly chosen. Consider a one-dimensional random walk on the x-axis, where there are absorbing barriers at x = and x = M (a positive integer), with 3 P(particle moves one unit to the right) = p P(particle moves one unit to the left) = p = q, except that if the particle is at x = or x = M it stays there. The steps or times are: n =,,,... Let X n = position of particle after step (or at time) n. The state space is: {,,,..., M}. The transition probabilities are homogeneous and given by: all other p ij being zero. {X n } is a homogeneous Markov chain with p i,i+ = p, p i,i = q; i =,,..., M p =, p M,M =,..... q p..... q p............ P =................... q p...
page 5 SOR() and the transition diagram is p p p p M- M q q q q Example.3 Discrete-time queue Customers arrive for service and take their place in a waiting line. During each time period a single customer is served, provided there is at least one customer waiting. During a service period new customers may arrive: the number of customers arriving in a time period is denoted by Y, with probability distribution P(Y = k) = a k, k =,,,... We assume that the numbers of arrivals in different periods are independent. Here the times (each marking the end of its respective period) are:,,,... Let X n = number of customers waiting at time n. The state space is: {,,,...} The situation at time n can be pictured as follows: n X n = i or X n = n X n = i + k = j X n = k time k arrivals in (n, n], prob. a k : k =,,... X n is a homogeneous Markov chain (why?) with a a a a 3.... a a a a 3.... a a a.... P = a a.................
page 55 SOR().3 Calculation of probabilities The absolute probabilities at time n can be calculated from those at time (n ) by invoking the law of total probability. Conditioning on the state at time n, we have: P(X n = j) = i = i P(X n = j X n = i)p(x n = i) p ij P(X n = i) (since {X n = i : i =,,,...} are mutually exclusive and exhaustive events). In matrix notation: p (n) = p (n ) P, n (.) where p (n) = (P(X n = ), P(X n = ),...) (row vector). Repeated application of (.) gives p (n) = p (r) P n r, r n (.5) and in particular i.e. P(X n = j) = i p (n) = p () P n (.6) p (n) ij P(X = i) (.7) where p (n) ij denotes the (i, j) element of P n. To obtain the conditional probabilities (or n-step transition probabilities), we condition on the initial state: P(X n = j) = P(X n = j X = i)p(x = i). (.8) i Then, comparing with (.7), we deduce that Also, because the Markov chain is homogeneous, So P(X n = j X = i) = p (n) ij. (.9) P(X r+n = j X r = i) = p (n) ij. P(X t = j X s = i) = p (t s) ij, t > s. (.) The main labour in actual calculations is the evaluation of P n. If the elements of P are numerical, P n may be computed by a suitable numerical method when the number of states is finite: if they are algebraic, there are special methods for some forms of P.
page 56 SOR(). Classification of States [Note: this is an extensive subject, and only a limited account is given here.] Suppose the system is in state k at time n =. Let f kk = P(X n = k for some n X = k), (.) i.e. f kk is the probability that the system returns at some time to the state k. The state k is called persistent or recurrent if f kk =, i.e. a return to k is certain. Otherwise (f kk < ) the state k is called transient (there is a positive probability that k is never re-entered). If p kk =, state k is termed an absorbing state. The state k is periodic with period t k > if and p (n) kk > when n is a multiple of t k p (n) kk = otherwise. Thus t k = gcd{n : p (n) kk > }: e.g. if p (n) kk > only for n =, 8,,.., then t k =. State k is termed aperiodic if no such t k > exists. State j is accessible (or reachable) from state i (i j) if p (n) ij > for some n >. If i j and j i, then states i and j are said to communicate (i j). It can be shown that if i j then states i and j (i) are both transient or both recurrent; (ii) have the same period. A set C of states is called irreducible if i j for all i, j C, so all the states in an irreducible set have the same period and are either all transient or all recurrent. A set C of states is said to be closed if no state outside C is accessible from any state in C, i.e. p ij = for all i C, j C. (Thus, an absorbing state is a closed set with just one state.) It can be shown that the entire state space can be uniquely partitioned as follows: T C C (.) where T is the set of transient states and C, C,... are irreducible closed sets of recurrent states (some of the C i may be absorbing states). Quite often, the entire state space is irreducible, so the terms irreducible, aperiodic etc. can be applied to the Markov chain as a whole. An irreducible chain contains at most one closed set of states. In a finite chain, it is impossible for all states to be transient: if the chain is irreducible, the states are recurrent.
page 57 SOR() Let s consider some examples. Example. (i) State space S = {,, }, with P =. The possible direct transitions are:, ;, ;,. So i j for all i, j S. S is the only closed set, and the Markov chain is irreducible, which in turn implies that all 3 states are recurrent...... Also p =, p () >, p(3) >,..., p(n) >. So state is aperiodic, which implies that all states are aperiodic. (ii) State space S = {,,, 3}, with P =. The possible direct transitions are:, 3 : : : 3, so again i j for all i, j S. S is the only closed set, the Markov chain is irreducible, and all states are recurrent...... Also p =, p () 3 =, p(3) = >... Thus state is periodic with period t = 3, and so all states have period 3. (iii) State space S = {,,, 3, }, with P =. The possible direct transitions are:, ;, ;, 3; 3, 3;,,. We conclude that {, } is a closed set, irreducible and aperiodic, and its states are recurrent; similarly for {, 3}: state is transient and aperiodic. Thus S = T C C, where T = {}, C = {, }, C = {, 3}.
page 58 SOR().5 The Limiting Distribution Markov s Theorem Consider a finite, aperiodic, irreducible Markov chain with states,,..., M. Then p (n) ij π j as n, for all i, j; (.3a) or π π...... π M P n π π...... π M............ as n. (.3b) π π...... π M The limiting probabilities {π j : j =,..., M} are the unique solution of the equations M π j = π i p ij, j =,..., M (.a) satisfying the normalisation condition i= M π i =. i= The equations (.a) may be written compactly as (.b) where Also π = πp, (.5) π = (π, π,..., π M ). p (n) = p () P n π as n. (.6) Example.5 Consider the Markov chain (i) in Example. above. Since it is finite, aperiodic and irreducible, there is a unique limiting distribution π which satisfies π = πp, i.e. π = π p + π p + π p, i.e. π = π + π π = π p + π p + π p, i.e. π = π + π π = π p + π p + π p, i.e. π = π + π together with the normalisation condition π + π + π = The best general approach to solving such equations is to set one π i =, deduce the other values and then normalise at the end: note that one of the equations in (.a) is redundant and can be used as a check at the end. So here, set π = : then the first two equations in (.a) yield π = π =. Since i π i = 3, normalisation yields π = π = π = 3.
page 59 SOR().6 Absorption in a finite Markov chain Consider a finite Markov chain consisting of a set T of transient states and a set A of absorbing states (termed an absorbing Markov chain). Let A k denote the event of absorption in state k and f ik the probability of A k when starting from the transient state i. A k T p ik i p ij f jk j Conditioning on the first transition (first step analysis) as indicated on the diagram, and using the law of total probability, we have P(A k X = i) = j A T P(A k X = i, X = j)p(x = j X = i) =.P(X = k X = i)+ j T P(A k X = j)p(x = j X = i), i.e. f ik = p ik + j T p ij f jk. (.7) Also, let T A denote the time to absorption (in any k A), and let µ i be the mean time to absorption starting from state i. Then, again conditioning on the first transition, we obtain µ i = E(T A X = i) = j A T E(T A X = i, X = j)p(x = j X = i) =.p ij + j A j T E(T A X = j)p ij = p ij + j A j T { + E(T A X = j)}p ij, i.e. µ i = + j T p ij µ j. (.8) (Note that in both (.7) and (.8) the summation j T includes j = i.)
page 6 SOR().6. Absorbing random walk and Gambler s Ruin Consider the random walk with absorbing barriers (at, M) introduced in Example. above. The states {,..., M } are transient. Then f im = p im + j T p ij f jm, i =,..., M i.e. f M = pf M f im = p i,i f i,m + p i,i+ f i+,m = qf i,m + pf i+,m, i =,..., M f M,M = p + qf M,M For convenience, define f M =, f MM =. Then f im = qf i,m + pf i+,m, i =,..., M. (.9) If we define d im = f im f i,m, i =,..., M (.) these difference equations can be written as simple -step recursions: It follows that Now so pd i+,m = qd im, i =,..., M. (.) d im = (q/p) i d M, i =,..., M. (.) i d jm = (f M f M ) + (f M f M ) + + (f im f i,m ) j= = f im f M = f im. f im = i (q/p) j d M j= = { + (q/p) + (q/p) + + (q/p) i }f M (q/p) i = (q/p) f M, if p. if M, if p = Using the fact that f MM =, we deduce that (q/p) f M = (q/p) M, if p M, if p = and so (q/p) i f im = (q/p) M, if p i M, if p = By a similar argument (or appealing to symmetry): (p/q) M i f i = (p/q) M, if q M i M, if q =,. (.3). (.)
page 6 SOR() We deduce that f i + f im = i.e. absorption at either or M is certain to occur sooner or later. Note that, as M, f im f i { (q/p) i, if p >, if p : { (q/p) i, if p >, if p. (.5) Similarly we have with µ = µ M =. The solution is µ i = + pµ i+ + qµ i, i M (.6) ( M ) (q/p)i µ i = p q (q/p) M i if p i(m i) if p =. (.7) We have in fact solved the famous Gambler s Ruin problem. Two players, A and B, start with a and (M a) respectively (so that their total capital is M. A coin is flipped repeatedly, giving heads with probability p and tails with probability q = p. Each time heads occurs, B gives to A, otherwise A gives to B. The game continues until one or other player runs out of money. After each flip the state of the system is A s current capital, and it is clear that this executes a random walk precisely as discussed above. Then (i) f am (f a ) is the probability that A (B) wins: (ii) µ a is the expected number of flips of the coin before one of the players becomes bankrupt and the game ends. From the result (.5) above, we see that if a gambler is playing against an infinitely rich adversary, then if p > there is a positive probability that the gambler s fortune will increase indefinitely, while if p the gambler is certain to go broke sooner or later.