Solved Problems. Chapter 14. 14.1 Probability review



Chapter 14

Solved Problems

14.1 Probability review

Problem 14.1. Let X and Y be two N_0-valued random variables such that X = Y + Z, where Z is a Bernoulli random variable with parameter p \in (0, 1), independent of Y. Only one of the following statements is true. Which one?

(a) X + Z and Y + Z are independent
(b) X has to be 2N_0 = \{0, 2, 4, 6, \dots\}-valued
(c) The support of Y is a subset of the support of X
(d) E[(X + Y)Z] = E[X + Y] E[Z]
(e) none of the above

The correct answer is (c).

(a) False. Simply take Y = 0, so that Y + Z = Z and X + Z = 2Z.
(b) False. Take Y = 0.
(c) True. For m in the support of Y (so that P[Y = m] > 0), we have

P[X = m] \geq P[Y = m, Z = 0] = P[Y = m] P[Z = 0] = P[Y = m](1 - p) > 0.

Therefore, m is in the support of X.
(d) False. Take Y = 0.
(e) False.

Problem 14.2. A fair die is tossed and its outcome is denoted by X, i.e.,

X \sim \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 1/6 & 1/6 & 1/6 & 1/6 & 1/6 & 1/6 \end{pmatrix}.

After that, X independent fair coins are tossed and the number of heads obtained is denoted by Y. Compute:

1. P[Y = 4].
2. P[X = 5 | Y = 4].
3. E[Y].
4. E[XY].

1. For k = 1, \dots, 6, conditionally on X = k, Y has the binomial distribution with parameters k and 1/2. Therefore,

P[Y = i | X = k] = \binom{k}{i} 2^{-k} for 0 \leq i \leq k, and 0 for i > k,

and so, by the law of total probability,

P[Y = 4] = \sum_{k=4}^{6} P[Y = 4 | X = k] P[X = k] = \frac{1}{6}\left( \binom{4}{4} 2^{-4} + \binom{5}{4} 2^{-5} + \binom{6}{4} 2^{-6} \right) = \frac{29}{384}.   (14.1)

2. By the (idea behind the) Bayes formula,

P[X = 5 | Y = 4] = \frac{P[X = 5, Y = 4]}{P[Y = 4]} = \frac{P[Y = 4 | X = 5] P[X = 5]}{P[Y = 4]} = \frac{\binom{5}{4} 2^{-5} \cdot \frac{1}{6}}{\frac{29}{384}} = \frac{10}{29}.

3. Since E[Y | X = k] = k/2 (the expectation of a binomial with n = k and p = 1/2), the law of total probability implies that

E[Y] = \sum_{k=1}^{6} E[Y | X = k] P[X = k] = \frac{1}{6} \sum_{k=1}^{6} \frac{k}{2} = \frac{21}{12} = \frac{7}{4}.

4. By the same reasoning,

E[XY] = \sum_{k=1}^{6} E[XY | X = k] P[X = k] = \sum_{k=1}^{6} E[kY | X = k] P[X = k] = \frac{1}{6} \sum_{k=1}^{6} \frac{k^2}{2} = \frac{91}{12}.

Last Updated: December 24, 2010. Intro to Stochastic Processes: Lecture Notes
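The four values just computed can be re-derived mechanically by conditioning on X = k; the following short Python check uses exact rational arithmetic (math.comb requires Python 3.8+):

```python
from fractions import Fraction as F
from math import comb

# P[X = k] = 1/6 for k = 1..6; given X = k, Y ~ Binomial(k, 1/2)
pY4 = sum(F(1, 6) * F(comb(k, 4), 2 ** k) for k in range(4, 7))

# Bayes formula for P[X = 5 | Y = 4]
pX5_given_Y4 = (F(1, 6) * F(comb(5, 4), 2 ** 5)) / pY4

# E[Y] and E[XY] by conditioning on X = k
EY = sum(F(1, 6) * F(k, 2) for k in range(1, 7))
EXY = sum(F(1, 6) * k * F(k, 2) for k in range(1, 7))

print(pY4, pX5_given_Y4, EY, EXY)  # 29/384 10/29 7/4 91/12
```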

Problem 14.3.

1. An urn contains 1 red ball and 10 blue balls. Other than their color, the balls are indistinguishable, so if one is to draw a ball from the urn without peeking, all the balls will be equally likely to be selected. If we draw 5 balls from the urn at once and without peeking, what is the probability that this collection of 5 balls contains the red ball?
2. We roll two fair dice. What is the probability that the sum of the outcomes equals exactly 7?
3. Assume that A and B are disjoint events, i.e., assume that A \cap B = \emptyset. Moreover, let P[A] = a > 0 and P[B] = b > 0. Calculate P[A \cup B] and P[A \cap B], using the values a and b.

1. P["the red ball is selected"] = \frac{\binom{10}{4}}{\binom{11}{5}} = \frac{5}{11}.

2. There are 36 possible outcomes (pairs of numbers) of the above roll. Out of those, the following have the sum equal to 7: (1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1). Since the dice are fair, all outcomes are equally likely, so the probability is \frac{6}{36} = \frac{1}{6}.

3. According to the axioms of probability,

P[A \cup B] = P[A] + P[B] = a + b,   P[A \cap B] = P[\emptyset] = 0.

Problem 14.4.

1. Consider an experiment which consists of 2 independent coin tosses. Let the random variable X denote the number of heads appearing. Write down the probability mass function of X.
2. There are 20 balls in an urn, numbered 1 through 20. You randomly select 3 of those balls. Let the random variable Y denote the maximum of the three numbers on the extracted balls. Find the probability mass function of Y. You should simplify your answer to a fraction that does not involve binomial coefficients. Then calculate P[Y \geq 17].
3. A fair die is tossed 7 times. We say that a toss is a success if a 5 or 6 appears; otherwise it's a failure. What is the distribution of the random variable X representing the number of successes out of the 7 tosses? What is the probability that there are exactly 3 successes? What is the probability that there are no successes?
4. The number of misprints per page of text is commonly modeled by a Poisson distribution. It is given that the parameter of this distribution is \lambda = 0.6 for a particular book. Find the probability that there are exactly 2 misprints on a given page in the book. How about the probability that there are 2 or more misprints?
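The urn and dice probabilities computed above (5/11 and 1/6) involve small enough sample spaces that they can also be confirmed by enumerating every outcome; a quick check in Python:

```python
from fractions import Fraction as F
from itertools import combinations, product

# Urn: ball 0 is red, balls 1..10 are blue; draw 5 of the 11 at once
draws = list(combinations(range(11), 5))
p_red = F(sum(1 for d in draws if 0 in d), len(draws))

# Two fair dice: probability that the sum is exactly 7
rolls = list(product(range(1, 7), repeat=2))
p_seven = F(sum(1 for a, b in rolls if a + b == 7), len(rolls))

print(p_red, p_seven)  # 5/11 1/6
```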

1. p_0 = P[\{(T, T)\}] = \frac{1}{4}, p_1 = P[\{(T, H), (H, T)\}] = \frac{1}{2}, p_2 = P[\{(H, H)\}] = \frac{1}{4}, and p_k = 0 for all other k.

2. The random variable Y can take the values in the set \{3, 4, \dots, 20\}. For any i in this set, the triplet resulting in Y attaining the value i must consist of the ball numbered i and a pair of balls with lower numbers. So,

p_i = P[Y = i] = \frac{\binom{i-1}{2}}{\binom{20}{3}} = \frac{(i-1)(i-2)}{2280}.

Since the balls are numbered 1 through 20, we have P[Y \geq 17] = P[Y = 17] + P[Y = 18] + P[Y = 19] + P[Y = 20], so

P[Y \geq 17] = \frac{16 \cdot 15}{2280} + \frac{17 \cdot 16}{2280} + \frac{18 \cdot 17}{2280} + \frac{19 \cdot 18}{2280} = \frac{240 + 272 + 306 + 342}{2280} = \frac{1160}{2280} = \frac{29}{57}.

3. X has a binomial distribution with parameters n = 7 and p = 1/3, i.e., X \sim b(7, 1/3). Therefore,

P[X = 3] = \binom{7}{3} \left( \frac{1}{3} \right)^3 \left( \frac{2}{3} \right)^4 = \frac{560}{2187},   P[X = 0] = \left( \frac{2}{3} \right)^7 = \frac{128}{2187}.

4. Let X denote the random variable which stands for the number of misprints on a given page. Then

P[X = 2] = \frac{0.6^2}{2!} e^{-0.6} \approx 0.0988,

P[X \geq 2] = 1 - P[X < 2] = 1 - (P[X = 0] + P[X = 1]) = 1 - \left( \frac{0.6^0}{0!} e^{-0.6} + \frac{0.6^1}{1!} e^{-0.6} \right) = 1 - 1.6 e^{-0.6} \approx 0.12.
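The closed-form pmf of the maximum and the numerical answers above can be verified directly, again by exact enumeration where the state space allows it:

```python
from fractions import Fraction as F
from itertools import combinations
from math import comb, exp

# Maximum of 3 balls drawn from {1,...,20}: enumeration vs. (i-1)(i-2)/2280
draws = list(combinations(range(1, 21), 3))
for i in range(3, 21):
    enumerated = F(sum(1 for d in draws if max(d) == i), len(draws))
    assert enumerated == F((i - 1) * (i - 2), 2280)
p_tail = sum(F((i - 1) * (i - 2), 2280) for i in range(17, 21))  # P[Y >= 17]

# Successes in 7 die tosses: X ~ b(7, 1/3)
p_three = F(comb(7, 3)) * F(1, 3) ** 3 * F(2, 3) ** 4
p_none = F(2, 3) ** 7

# Poisson misprints with lambda = 0.6
p_two_misprints = 0.6 ** 2 / 2 * exp(-0.6)
p_two_or_more = 1 - 1.6 * exp(-0.6)
```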

Problem 14.5. Let X and Y be two Bernoulli random variables with the same parameter p = 1/2. Can the support of their sum be equal to \{0, 1\}? How about the case where p is not necessarily equal to 1/2? Note that no particular dependence structure between X and Y is assumed.

Let p_{ij}, i = 0, 1, j = 0, 1, be defined by p_{ij} = P[X = i, Y = j]. These four numbers effectively specify the full dependence structure of X and Y (in other words, they completely determine the distribution of the random vector (X, Y)). Since we are requiring that both X and Y be Bernoulli with parameter p, we must have

p = P[X = 1] = P[X = 1, Y = 0] + P[X = 1, Y = 1] = p_{10} + p_{11}.   (14.2)

Similarly, we must have

1 - p = p_{00} + p_{01},   (14.3)
p = p_{01} + p_{11},   (14.4)
1 - p = p_{00} + p_{10}.   (14.5)

Suppose now that the support of X + Y equals \{0, 1\}. Then p_{00} > 0 and p_{01} + p_{10} > 0, but p_{11} = 0 (why?). Then, the relation (14.2) implies that p_{10} = p. Similarly, p_{01} = p by relation (14.4). Relations (14.3) and (14.5) tell us that p_{00} = 1 - 2p. When p = 1/2, this implies that p_{00} = 0, a contradiction with the fact that 0 is in the support of X + Y. When p < 1/2, there is still hope. We construct X and Y as follows: let X be a Bernoulli random variable with parameter p. Then, we define Y depending on the value of X. If X = 1, we set Y = 0. If X = 0, we set Y = 0 with probability \frac{1-2p}{1-p} and Y = 1 with probability \frac{p}{1-p}. How do we know that Y is Bernoulli with parameter p? We use the law of total probability:

P[Y = 0] = P[Y = 0 | X = 0] P[X = 0] + P[Y = 0 | X = 1] P[X = 1] = \frac{1-2p}{1-p}(1 - p) + p = 1 - p.

Similarly,

P[Y = 1] = P[Y = 1 | X = 0] P[X = 0] + P[Y = 1 | X = 1] P[X = 1] = \frac{p}{1-p}(1 - p) + 0 = p.

14.2 Random Walks

Problem 14.6. Let \{X_n\}_{n \in N_0} be a symmetric simple random walk. For n \in N, the average of the random walk on the interval [0, n] is defined by

A_n = \frac{1}{n} \sum_{k=1}^{n} X_k.

1. Is \{A_n\}_{n \in N_0} a simple random walk (not necessarily symmetric)? Explain carefully using the definition.
2. Compute the covariance Cov(X_k, X_l) = E[(X_k - E[X_k])(X_l - E[X_l])], for k \leq l \in N.

3. Compute the variance of A_n, for n \in N.

(Note: You may need to use the following identities:

\sum_{k=1}^{n} k = \frac{n(n+1)}{2} and \sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}, for n \in N.)

1. No, it is not. The distribution of A_2 is

A_2 \sim \begin{pmatrix} -3/2 & -1/2 & 1/2 & 3/2 \\ 1/4 & 1/4 & 1/4 & 1/4 \end{pmatrix}.

In particular, its support is \{-3/2, -1/2, 1/2, 3/2\}. For n = 1, A_1 = X_1, which has the support \{-1, 1\}. Therefore, the support of the difference A_2 - A_1 cannot be \{-1, 1\} (indeed, the difference must be able to take fractional values). This is in contradiction with the definition of the simple random walk, which states that all increments must have distributions supported by \{-1, 1\}.

2. We write X_k = \sum_{i=1}^{k} \xi_i and X_l = \sum_{j=1}^{l} \xi_j, where \{\xi_n\}_{n \in N} are the (iid) increments of \{X_n\}_{n \in N_0}. Since E[X_k] = E[X_l] = 0, we have

Cov(X_k, X_l) = E[X_k X_l] = E\left[ \left( \sum_{i=1}^{k} \xi_i \right) \left( \sum_{j=1}^{l} \xi_j \right) \right].

When the sum above is expanded, the terms of the form E[\xi_i \xi_j], for i \neq j, will disappear because \xi_i and \xi_j are independent for i \neq j. The only terms left are those of the form E[\xi_i \xi_i]. Their value is 1, and there are k of them (remember, k \leq l). Therefore,

Cov(X_k, X_l) = k.

3. Let B_n = \sum_{k=1}^{n} X_k = n A_n. We know that E[A_n] = 0, and that Var[A_n] = Var[\frac{B_n}{n}] = \frac{1}{n^2} Var[B_n], so it suffices to compute

Var[B_n] = E[B_n^2] = E\left[ \left( \sum_{k=1}^{n} X_k \right) \left( \sum_{l=1}^{n} X_l \right) \right].

We expand the sum on the right and group together equal (k = l) and different (k \neq l) indices to get

Var[B_n] = \sum_{k=1}^{n} Var[X_k] + 2 \sum_{1 \leq k < l \leq n} Cov(X_k, X_l) = \sum_{k=1}^{n} k + 2 \sum_{1 \leq k < l \leq n} k.

For each k = 1, 2, \dots, n, there are exactly n - k values of l such that k < l \leq n. Therefore, \sum_{1 \leq k < l \leq n} k = \sum_{k=1}^{n-1} k(n - k), and so

Var[B_n] = \sum_{k=1}^{n} k + 2 \sum_{k=1}^{n-1} k(n - k) = \frac{n(n+1)}{2} + 2\left( n \cdot \frac{n(n-1)}{2} - \frac{(n-1)n(2n-1)}{6} \right) = \frac{n(n+1)(2n+1)}{6}.

Therefore,

Var[A_n] = \frac{1}{n^2} Var[B_n] = \frac{(n+1)(2n+1)}{6n}.

Another way to approach this computation uses the representation X_k = \sum_{i=1}^{k} \xi_i, so

Var[B_n] = E\left[ \left( \sum_{k=1}^{n} X_k \right)^2 \right] = E\left[ \left( \sum_{k=1}^{n} \sum_{i=1}^{k} \xi_i \right)^2 \right] = E\left[ \left( \sum_{i=1}^{n} (n - i + 1) \xi_i \right)^2 \right] = \sum_{i=1}^{n} (n - i + 1)^2 = \sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6},

where we use the fact that E[\xi_i \xi_j] = 0 for i \neq j.

Problem 14.7. Let \{X_n\}_{n \in N_0} be a simple random walk. The area A_n under the random walk on the interval [0, n] is defined by

A_n = \sum_{k=1}^{n-1} X_k + \frac{1}{2} X_n.

(Note: Draw a picture of a path of a random walk and convince yourself that the expression above is really the area under the graph.) Compute the expectation and variance of A_n.

Problem 14.8. Let \{X_n\}_{n \in N_0} be a simple symmetric random walk (X_0 = 0, p = q = 1/2). What is the probability that X_n will visit all the points in the set \{1, 2, 3, 4\} at least once during the time interval n \in \{0, \dots, 10\}? (Note: write your answer as a sum of binomial coefficients; you don't have to evaluate it any further.)

The random walk will visit all the points in \{1, 2, 3, 4\} if and only if its maximum M_{10} during \{0, \dots, 10\} is at least equal to 4. Since we start at X_0 = 0, visiting all of \{1, 2, 3, 4\} at least once is the same as visiting 4 at least once. Therefore, the required probability is P[M_{10} \geq 4]. We know (from the lecture notes) that

P[M_n = l] = P[X_n = l] + P[X_n = l + 1].

Therefore,

P[M_{10} \geq 4] = \sum_{l=4}^{10} \left( P[X_{10} = l] + P[X_{10} = l + 1] \right) = P[X_{10} = 4] + 2 P[X_{10} = 6] + 2 P[X_{10} = 8] + 2 P[X_{10} = 10] = 2^{-10} \left( \binom{10}{7} + 2\binom{10}{8} + 2\binom{10}{9} + 2\binom{10}{10} \right).

Problem 14.9. Let \{X_n\}_{n \in N_0} be a simple symmetric random walk (X_0 = 0, p = q = 1/2). What is the probability that X_n will not visit the point 5 during the time interval n \in \{0, \dots, 10\}?

The random walk will not visit the point 5 if and only if its maximum M_{10} during \{0, \dots, 10\} is at most equal to 4. Therefore, the required probability is P[M_{10} \leq 4]. We know (from the lecture notes) that

P[M_n = l] = P[X_n = l] + P[X_n = l + 1].

Therefore,

P[M_{10} \leq 4] = \sum_{l=0}^{4} \left( P[X_{10} = l] + P[X_{10} = l + 1] \right) = P[X_{10} = 0] + 2 P[X_{10} = 2] + 2 P[X_{10} = 4] = 2^{-10} \left( \binom{10}{5} + 2\binom{10}{6} + 2\binom{10}{7} \right).

Problem 14.10. A fair coin is tossed repeatedly and the record of the outcomes is kept. Tossing stops the moment the total number of heads obtained so far exceeds the total number of tails by 3. For example, a possible sequence of tosses could look like HHTTTHTHHTHH. What is the probability that the length of such a sequence is at most 10?

Let X_n, n \in N_0, be the number of heads minus the number of tails obtained so far. Then \{X_n\}_{n \in N_0} is a simple symmetric random walk, and we stop tossing the coin when X hits 3 for the first time. This will happen during the first 10 tosses if and only if M_{10} \geq 3, where M_n denotes the (running) maximum of X. According to the reflection principle,

P[M_{10} \geq 3] = P[X_{10} \geq 3] + P[X_{10} \geq 4] = 2 \left( P[X_{10} = 4] + P[X_{10} = 6] + P[X_{10} = 8] + P[X_{10} = 10] \right) = 2^{-9} \left[ \binom{10}{7} + \binom{10}{8} + \binom{10}{9} + \binom{10}{10} \right] = \frac{11}{32}.

14.3 Generating functions

Problem 14.11. If P(s) is the generating function of the random variable X, then the generating function of 2X + 1 is

(a) P(s + 1)
(b) P(s) + 1
(c) P(2s + 1)
(d) sP(s^2)
(e) none of the above

The correct answer is (d): using the representation P_Y(s) = E[s^Y], we get

P_{2X+1}(s) = E[s^{2X+1}] = E[s (s^2)^X] = s E[(s^2)^X] = s P_X(s^2).

Alternatively, we can use the fact that

P[2X + 1 = k] = 0 for k \in \{0, 2, 4, \dots\}, and P[2X + 1 = k] = p_{(k-1)/2} for k \in \{1, 3, 5, \dots\},

where p_n = P[X = n], n \in N_0, so that

P_{2X+1}(s) = 0 + p_0 s + 0 s^2 + p_1 s^3 + 0 s^4 + p_2 s^5 + \dots = s P_X(s^2).

Problem 14.12. Let P(s) be the generating function of the sequence (p_0, p_1, \dots) and Q(s) the generating function of the sequence (q_0, q_1, \dots). If the sequence \{r_n\}_{n \in N_0} is defined by

r_n = 0 for n \leq 1, and r_n = \sum_{k=1}^{n-1} p_k q_{n-1-k} for n > 1,

then its generating function is given by (Note: Careful! \{r_n\}_{n \in N_0} is not exactly the convolution of \{p_n\}_{n \in N_0} and \{q_n\}_{n \in N_0}.)

(a) P(s)Q(s) - p_0 q_0
(b) (P(s) - p_0)(Q(s) - q_0)
(c) s(P(s) - p_0)Q(s)
(d) sP(s)(Q(s) - q_0)
(e) s(P(s) - p_0)(Q(s) - q_0)

The correct answer is (c). Let R(s) be the generating function of the sequence \{r_n\}_{n \in N_0} and let \hat{r} be the convolution \hat{r}_n = \sum_{k=0}^{n} p_k q_{n-k} of the sequences \{p_n\}_{n \in N_0} and \{q_n\}_{n \in N_0}, so that the generating function of \hat{r} is P(s)Q(s). For n \geq 1,

\hat{r}_n = \sum_{k=0}^{n} p_k q_{n-k} = r_{n+1} + p_0 q_n.

Therefore, for s \in (0, 1),

P(s)Q(s) = \sum_{n=0}^{\infty} \hat{r}_n s^n = \hat{r}_0 + \sum_{n=1}^{\infty} \hat{r}_n s^n = p_0 q_0 + \sum_{n=1}^{\infty} (r_{n+1} + p_0 q_n) s^n = p_0 q_0 + \frac{1}{s}\left( R(s) - r_1 s - r_0 \right) + p_0 \sum_{n=1}^{\infty} q_n s^n = p_0 Q(s) + \frac{1}{s} R(s).

Therefore, R(s) = s (P(s) - p_0) Q(s).

Problem 14.13. A fair coin and a fair 6-sided die are thrown repeatedly until the first time a 6 appears on the die. Let X be the number of heads obtained (we are including the heads that may have occurred together with the first 6) in the count. The generating function of X is

(a) \frac{s}{6 - 5s}
(b) \frac{1 + s}{6 - 5s}
(c) \frac{1 + s}{7 - 5s}
(d) \frac{1 + s}{7 - 4s}
(e) None of the above

The correct answer is (c). X can be seen as a random sum of Bernoulli (\{0, 1\}-valued, parameter 1/2) random variables, with the number of summands given by G, the number of throws it takes to get the first 6. G is clearly geometrically distributed with parameter 1/6; its generating function is P_G(s) = \frac{s}{6 - 5s}, so the generating function of the whole random sum X is

P_X(s) = P_G\left( \frac{1}{2}(1 + s) \right) = \frac{1 + s}{7 - 5s}.

Problem 14.14. The generating function of the N_0-valued random variable X is given by

P_X(s) = 1 - \sqrt{1 - s}.

1. Compute p_0 = P[X = 0].
2. Compute p_1 = P[X = 1].
3. Does E[X] exist? If so, find its value; if not, explain why not.

1. p_0 = P_X(0), so p_0 = 0.

2. p_1 = P'_X(0), and

P'_X(s) = \frac{1}{2\sqrt{1 - s}},

so p_1 = \frac{1}{2}.

3. If E[X] existed, it would be equal to P'_X(1). However,

\lim_{s \nearrow 1} P'_X(s) = +\infty,

so E[X] (and, equivalently, P'_X(1)) does not exist.

Problem 14.15. Let N be a random time, independent of \{\xi_n\}_{n \in N}, where \{\xi_n\}_{n \in N} is a sequence of mutually independent Bernoulli (\{0, 1\}-valued) random variables with parameter p_B \in (0, 1). Suppose that N has a geometric distribution g(p_g) with parameter p_g \in (0, 1). Compute the distribution of the random sum

Y = \sum_{k=1}^{N} \xi_k.

(Note: You can think of Y as a binomial random variable with random n.)

Independence between N and \{\xi_n\}_{n \in N} allows us to use the fact that the generating function P_Y(s) of Y is given by P_Y(s) = P_N(P_B(s)), where P_N(s) = \frac{p_g}{1 - q_g s} is the generating function of N (geometric distribution) and P_B(s) = q_B + p_B s is the generating function of each \xi_k (Bernoulli distribution). Therefore,

P_Y(s) = \frac{p_g}{1 - q_g (q_B + p_B s)} = \frac{\frac{p_g}{1 - q_g q_B}}{1 - \frac{q_g p_B}{1 - q_g q_B} s} = \frac{p_Y}{1 - q_Y s},

where p_Y = \frac{p_g}{1 - q_g q_B} and q_Y = 1 - p_Y. P_Y can be recognized as the generating function of a geometric random variable with parameter p_Y.

Problem 14.16. Six fair gold coins are tossed, and the total number of tails is recorded; let's call this number N. Then, a set of three fair silver coins is tossed N times. Let X be the total number of times at least two heads are observed (among the N tosses of the set of silver coins). (Note: A typical outcome of such a procedure would be the following: out of the six gold coins, 4 were tails and 2 were heads. Therefore N = 4, and the 4 tosses of the set of three silver coins may look something like \{HHT, THT, TTT, HTH\}, so that X = 2 in this state of the world.) Find the generating function and the pmf of X. You don't have to evaluate binomial coefficients.

Let H_k, k \in N, be the number of heads on the k-th toss of the set of three silver coins. The distribution of H_k is binomial, so P[H_k \geq 2] = P[H_k = 2] + P[H_k = 3] = 3/8 + 1/8 = 1/2. Let \xi_k be the indicator

\xi_k = 1_{\{H_k \geq 2\}} = \begin{cases} 1, & H_k \geq 2, \\ 0, & H_k < 2. \end{cases}

The generating function P_\xi(s) of each \xi_k is the generating function of a Bernoulli random variable with p = 1/2, i.e., P_\xi(s) = \frac{1}{2}(1 + s). The random variable N has a binomial distribution with parameters n = 6 and p = 1/2. Therefore, P_N(s) = \left( \frac{1}{2}(1 + s) \right)^6. By a theorem from the notes, the generating function of the value of the random sum X is given by

P_X(s) = P_N(P_\xi(s)) = \left( \frac{1}{2}\left( 1 + \frac{1}{2}(1 + s) \right) \right)^6 = \left( \frac{3}{4} + \frac{1}{4} s \right)^6.

Therefore, X is a binomial random variable with parameters n = 6 and p = 1/4, i.e.,

p_k = P[X = k] = \binom{6}{k} \left( \frac{1}{4} \right)^k \left( \frac{3}{4} \right)^{6-k}, k = 0, \dots, 6.

Problem 14.17. Tony Soprano collects his cut from the local garbage-management companies. During a typical day he can visit a geometrically distributed number of companies with parameter p = 0.1. According to many years' worth of statistics gathered by his consigliere Silvio Dante, the amount he collects from the i-th company is random, with the following distribution:

X_i \sim \begin{pmatrix} \$1000 & \$2000 & \$3000 \\ 0.2 & 0.4 & 0.4 \end{pmatrix}

The amounts collected from different companies are independent of each other, and of the number of companies visited.

1. Find the (generating function of) the distribution of the amount of money S that Tony will collect on a given day.
2. Compute E[S] and P[S > 0].

1. The number N of companies visited has the generating function

P_N(s) = \frac{0.1}{1 - 0.9 s},

and the generating function of the amount X_i collected from each one is

P_X(s) = 0.2 s^{1000} + 0.4 s^{2000} + 0.4 s^{3000}.

The total amount collected is given by the random sum S = \sum_{i=1}^{N} X_i, so the generating function of S is given by

P_S(s) = P_N(P_X(s)) = \frac{0.1}{1 - 0.9 (0.2 s^{1000} + 0.4 s^{2000} + 0.4 s^{3000})}.

2. The expectation of S is given by

E[S] = P'_S(1) = P'_N(P_X(1)) P'_X(1) = E[N] \cdot E[X_1] = 9 \cdot \$2200 = \$19800.

To compute P[S > 0], we note that S = 0 if and only if no collections have been made, i.e., if the value of N is equal to 0. Since N is geometric with parameter p = 0.1,

P[S > 0] = 1 - P[S = 0] = 1 - P[N = 0] = 1 - p = 0.9.

Problem 14.18. Let X be an N_0-valued random variable, and let P(s) be its generating function. Then the generating function of 4X is given by

1. (P(s))^4,
2. P(P(P(P(s)))),
3. P(s^4),
4. 4P(s),
5. none of the above.

(3)

Problem 14.19. Let X and Y be two N_0-valued random variables, let P_X(s) and P_Y(s) be their generating functions, and let Z = X - Y, V = X + Y, W = XY. Then

1. P_X(s) = P_Z(s) P_Y(s),
2. P_X(s) P_Y(s) = P_Z(s),
3. P_W(s) = P_X(P_Y(s)),
4. P_Z(s) P_V(s) = P_X(s) P_Y(s),
5. none of the above.

(5)

Problem 14.20. Let X be an N_0-valued random variable, let P(s) be its generating function, and let Q(s) = P(s)/(1 - s). Then

1. Q(s) is a generating function of a random variable,
2. Q(s) is a generating function of a non-decreasing sequence of non-negative numbers,
3. Q(s) is a concave function on (0, 1),
4. Q(0) = 1,
5. none of the above.

(2)
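Two of the generating-function answers above lend themselves to numerical confirmation: the Taylor coefficients of (1+s)/(7-5s) should match the distribution of heads before the first 6 computed directly, and the compound-sum expectation should equal E[N] E[X_1]. A sketch of both checks (the truncation bound gmax=400 is an arbitrary choice that makes the geometric tail negligible):

```python
from math import comb, isclose

# Coefficients of (1+s)/(7-5s) = (1/7)(1+s) * sum_n (5s/7)^n
def coefficient(k):
    c = (5 / 7) ** k / 7
    if k >= 1:
        c += (5 / 7) ** (k - 1) / 7
    return c

# Direct: P[X = k] = sum_g P[G = g] P[Binomial(g, 1/2) = k],
# with G geometric on {1, 2, ...} with parameter 1/6
def direct(k, gmax=400):
    return sum((5 / 6) ** (g - 1) / 6 * comb(g, k) / 2 ** g
               for g in range(max(k, 1), gmax))

ok_gf = all(isclose(coefficient(k), direct(k), abs_tol=1e-10) for k in range(8))

# Compound sum: E[S] = E[N] * E[X_1] with N geometric on {0, 1, 2, ...}
EN = 0.9 / 0.1
EX = 0.2 * 1000 + 0.4 * 2000 + 0.4 * 3000
ES = EN * EX
```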

14.4 Random walks - advanced methods

Problem 14.21. Let \{X_n\}_{n \in N_0} be a simple random walk with p = P[X_1 = 1] \in (0, 1). Then

(a) X will visit every position l \in \{\dots, -2, -1, 0, 1, 2, \dots\} sooner or later, with probability 1.
(b) E[X_n^2] < E[X_{n+1}^2], for all n \in N_0.
(c) E[X_n] < E[X_{n+1}], for all n \in N_0.
(d) X_n has a binomial distribution.
(e) None of the above.

The correct answer is (b).

(a) False. When p < 1/2, the probability of ever hitting the level l = 1 is less than 1.
(b) True. X_{n+1} = X_n + \xi_{n+1}, so that, by the independence of X_n and \xi_{n+1}, we have

E[X_{n+1}^2] = E[X_n^2] + 2 E[X_n] E[\xi_{n+1}] + E[\xi_{n+1}^2] > E[X_n^2],

because 2 E[X_n] E[\xi_{n+1}] = 2n(p - q)(p - q) \geq 0 and E[\xi_{n+1}^2] = 1 > 0.
(c) False. E[X_{n+1}] - E[X_n] = p - q = 2p - 1, which could be negative if p < 1/2.
(d) False. The support of X_n is \{-n, -n+2, \dots, n-2, n\}, while the support of any binomial distribution has the form \{0, 1, \dots, N\}, for some N \in N.
(e) False.

Problem 14.22. Throughout the problem, we let \tau_k be the first hitting time of the level k, for k \in Z = \{\dots, -2, -1, 0, 1, 2, \dots\}, for a simple random walk \{X_n\}_{n \in N_0} with p = P[X_1 = 1] \in (0, 1). Furthermore, let P_k be the generating function of the distribution of \tau_k.

1. When p = 1/2, we have

(a) P[\tau_1 < \infty] = 1, E[\tau_1] < \infty,
(b) P[\tau_1 < \infty] = 0, E[\tau_1] < \infty,
(c) P[\tau_1 < \infty] = 1, E[\tau_1] = \infty,
(d) P[\tau_1 < \infty] \in (0, 1), E[\tau_1] = \infty,
(e) none of the above.

2. Either one of the following 4 random times is not a stopping time for a simple random walk \{X_n\}_{n \in N_0}, or they all are. Choose the one which is not in the first case, or choose (e) if you think they all are.

(a) the first hitting time of the level 4,

(b) the first time n such that X_n - X_{n-1} \neq X_{n-1} - X_{n-2},
(c) the first time the walk hits the level 3 or the first time the walk sinks below the level -5, whichever happens first,
(d) the second time the walk crosses the level 5 or the third time the walk crosses the level -2, whichever happens last,
(e) none of the above.

3. We know that P_1 (the generating function of the distribution of the first hitting time of the level 1) satisfies the following equation:

P_1(s) = ps + qs P_1(s)^2.

The generating function P_{-1} of the first hitting time \tau_{-1} of the level -1 then satisfies

(a) P_{-1}(s) = ps + qs P_{-1}(s)^2,
(b) P_{-1}(s) = qs + ps P_{-1}(s)^2,
(c) P_{-1}(s) = ps - qs P_{-1}(s)^2,
(d) P_{-1}(s) = ps + qs P_1(s)^2,
(e) none of the above.

4. Let P_2 be the generating function of the first hitting time of the level 2. Then

(a) P_2(s) = ps + qs P_2(s)^4,
(b) P_2(s) = ps + qs P_2(s)^2,
(c) P_2(s) = 2 P_1(s),
(d) P_2(s) = P_1(s)^2,
(e) none of the above.

1. The correct answer is (c); see the table at the very end of Lecture 7.

2. The correct answer is (e); first, second, etc., hitting times are stopping times, and so are their minima and maxima. Note that for two stopping times T_1 and T_2, the one that happens first is min(T_1, T_2) and the one that happens last is max(T_1, T_2).

3. The correct answer is (b); hitting -1 is the same as hitting 1, if you switch the roles of p and q.

4. The correct answer is (d); to hit 2, you must first hit 1, and then hit 2 starting from 1. The times between these two are independent, and distributed like \tau_1.
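The random-walk facts used in this section are all checkable by brute force, since a ten-step walk has only 2^10 paths. The snippet below re-verifies the variance formula for the path sums B_n, the three maximum probabilities computed above, and the fact that the closed form P_1(s) = (1 - sqrt(1 - 4pqs^2))/(2qs) (the standard solution of the quadratic, stated here as an assumption) satisfies P_1(s) = ps + qs P_1(s)^2:

```python
from fractions import Fraction as F
from itertools import product, accumulate
from math import comb, sqrt, isclose

# Var[B_n] = n(n+1)(2n+1)/6 for B_n = X_1 + ... + X_n, symmetric walk
for n in range(1, 7):
    EB2 = F(sum(sum(accumulate(s)) ** 2 for s in product((-1, 1), repeat=n)), 2 ** n)
    assert EB2 == F(n * (n + 1) * (2 * n + 1), 6)

# Maximum of the 10-step symmetric walk
paths = list(product((-1, 1), repeat=10))
def p_max(cond):
    return F(sum(1 for s in paths if cond(max(accumulate(s)))), 2 ** 10)

assert p_max(lambda m: m >= 4) == F(comb(10, 7) + 2 * (comb(10, 8) + comb(10, 9) + comb(10, 10)), 2 ** 10)
assert p_max(lambda m: m <= 4) == F(comb(10, 5) + 2 * (comb(10, 6) + comb(10, 7)), 2 ** 10)
assert p_max(lambda m: m >= 3) == F(11, 32)

# First-passage generating function solves P = ps + qsP^2
for p in (0.3, 0.5, 0.7):
    q = 1 - p
    for s in (0.2, 0.5, 0.9):
        P1 = (1 - sqrt(1 - 4 * p * q * s * s)) / (2 * q * s)
        assert isclose(P1, p * s + q * s * P1 ** 2, rel_tol=1e-12)

print("all random-walk checks passed")
```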

14.5 Branching processes

Problem 14.23. Let \{Z_n\}_{n \in N_0} be a simple branching process which starts from one individual. Each individual has exactly three children, each of whom survives until reproductive age with probability 0 < p < 1, and dies before he/she is able to reproduce with probability q = 1 - p, independently of his/her siblings. The children that reach reproductive age reproduce according to the same rule.

1. Write down the generating function for the offspring distribution.
2. For what values of p will the population go extinct with probability 1? (Hint: You don't need to compute much. Just find the derivative P'(1) and remember the picture from class.)

1. The number of children who reach reproductive age is binomially distributed with parameters 3 and p. Therefore,

P(s) = (ps + q)^3.

2. We know that the extinction happens with probability 1 if and only if the graphs of the functions s and P(s) meet only at s = 1, for s \in [0, 1]. They will meet once again somewhere on [0, 1) if and only if P'(1) > 1. Therefore, the extinction happens with certainty if

P'(1) = 3p(p + q)^2 = 3p \leq 1, i.e., if p \leq 1/3.

Problem 14.24. Let \{Z_n\}_{n \in N_0} be a branching process with offspring distribution (p_0, p_1, \dots). We set P(s) = \sum_{n=0}^{\infty} p_n s^n. The extinction is inevitable if

(a) p_0 > 0
(b) P(1/2) > 1/4
(c) P'(1) < 1
(d) P'(1) \geq 1
(e) None of the above

The correct answer is (c).

(a) False. Take p_0 = 1/3, p_1 = 0, p_2 = 2/3, and p_n = 0, n > 2. Then the equation P(s) = s reads \frac{1}{3} + \frac{2}{3} s^2 = s, and s = 1/2 is its smallest solution. Therefore, the probability of extinction is 1/2.
(b) False. Take p_0 = 0, p_1 = 1/2, p_2 = 1/2, and p_n = 0, n > 2. Then the equation P(s) = s reads \frac{1}{2} s + \frac{1}{2} s^2 = s, and s = 0 is its smallest solution. Therefore, the probability of extinction is 0, but P(1/2) = 1/4 + 1/8 > 1/4.

CHAPTER 4. (c) True. The condition P () <, together with the convexity of the function P imply that the graph of P lies strictly above the 45-degree line for s (0, ). Therefore, the only solution of the equation P (s) = s is s =. (d) False. Take the same example as in (a), so that P () = 4/ >. On the other hand, the extinction equation P (s) = s reads / + /s = s. Its smallest solution if s = /. (e) False. Problem 4.5. Bacteria reproduce by cell division. In a unit of time, a bacterium will either die (with probability 4 ), stay the same (with probability 4 ), or split into parts (with probability ). The population starts with 00 bacteria at time n = 0.. Write down the expression for the generating function of the distribution of the size of the population at time n N 0. (Note: you can use n-fold composition of functions.). Compute the extinction probability for the population.. Let m n be the largest possible number of bacteria at time n. Find m n and compute the probability that there are exactly m n bacteria in the population at time n, n N 0. 4. Given that there are 000 bacteria in the population at time 50, what is the expected number of bacteria at time 5?. The generating function for the offspring distribution is P (s) = 4 + 4 s + s. We can think of the population Z n at time n, as the sum of 00 independent populations, each one produced by one of 00 bacteria in the initial population. The generating function for the number of offspring of each bacterium is P (n) (s) = P (P (... P (s)... )) } {{ } n Ps Therefore, since sums of independent variables correspond to products in the world of generating functions, the generating function of Z n is given by [ 00 P Zn (s) = P (s)] (n).. As in the practice problem, the extinction probability of the entire population (which starts from 00) is the 00-th power of the extinction probability p of the population which starts from a single bacterium. To compute p, we write down the extinction equation s = 4 + 4 s + s. 
Its solutions are s = 1 and s = 1/2, so p = 1/2. Therefore, the extinction probability for the entire population is (1/2)^100.
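The computation above can be checked numerically: iterating s ← P(s) from s = 0 produces exactly the probabilities P[Z_n = 0] for a single ancestor, and these increase to the smallest fixed point of P on [0, 1]. A minimal sketch (the helper names are mine, not from the notes):

```python
def compose(P, n, s):
    """n-fold composition P(P(...P(s)...)), evaluated at s."""
    for _ in range(n):
        s = P(s)
    return s

# Offspring law of one bacterium: P(s) = 1/4 + s/4 + s^2/2.
P = lambda s: 0.25 + 0.25 * s + 0.5 * s * s

# P[Z_n = 0] = P^(n)(0) increases to the single-ancestor extinction
# probability, which should be 1/2.
p_single = compose(P, 200, 0.0)

# The whole population of 100 bacteria dies out with probability (1/2)^100.
p_total = p_single ** 100
```

The convergence is geometric (with rate P'(1/2) = 3/4 near the fixed point), so 200 iterations are far more than enough.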

CHAPTER 4.

3. The maximum size of the population at time n will occur if each of the initial 100 bacteria splits into 2 each time, so m_n = 100 · 2^n. For that to happen, there must be 100 splits to produce the first generation, 200 for the second, 400 for the third, ..., and 2^{n−1} · 100 for the n-th generation. Each one of those splits happens with probability 1/2 and is independent of the other splits in its generation, given the previous generations. It is clear that for Z_n = m_n, we must have Z_{n−1} = m_{n−1}, ..., Z_1 = m_1, so

P[Z_n = m_n] = P[Z_n = m_n, Z_{n−1} = m_{n−1}, ..., Z_1 = m_1, Z_0 = m_0]
= P[Z_n = m_n | Z_{n−1} = m_{n−1}, ..., Z_0 = m_0] P[Z_{n−1} = m_{n−1}, ..., Z_0 = m_0]
= ...
= P[Z_n = m_n | Z_{n−1} = m_{n−1}] P[Z_{n−1} = m_{n−1} | Z_{n−2} = m_{n−2}] ... P[Z_0 = m_0]
= (1/2)^{m_{n−1}} (1/2)^{m_{n−2}} ... (1/2)^{m_0} = (1/2)^{100(1 + 2 + ... + 2^{n−1})} = (1/2)^{100(2^n − 1)}.

4. The expected number of offspring of each of the 1000 bacteria is given by P'(1) = 5/4. Therefore, the total expected number of bacteria at time 51 is 1000 · 5/4 = 1250.

Problem 4.26. A branching process starts from 10 individuals, and each reproduces according to the probability distribution (p_0, p_1, p_2, ...), where p_0 = 1/4, p_1 = 1/4, p_2 = 1/2, p_n = 0, for n > 2. The extinction probability for the whole population is equal to

(a) 1
(b) 1/2
(c) 1/10
(d) 1/100
(e) 1/1024

The correct answer is (e). The extinction probability for the population starting from each of the 10 initial individuals is given by the smallest solution of 1/4 + (1/4)s + (1/2)s^2 = s, which is s = 1/2. Therefore, the extinction probability for the whole population is (1/2)^10 = 1/1024.

Problem 4.27. A (solitaire) game starts with N = 3 silver dollars in the pot. At each turn the number of silver dollars in the pot is counted (call it K) and the following procedure is repeated K times: a die is thrown, and according to the outcome the following four things can happen:

If the outcome is 1 or 2, the player takes 1 silver dollar from the pot.

If the outcome is 3, nothing happens.
If the outcome is 4, the player puts 1 extra silver dollar in the pot (you can assume that the player has an unlimited supply of silver dollars).

If the outcome is 5 or 6, the player puts 2 extra silver dollars in the pot.

CHAPTER 4.

If there are no silver dollars in the pot, the game stops.

1. Compute the expected number of silver dollars in the pot after turn n ∈ N.

2. Compute the probability that the game will stop eventually.

3. Let m_n be the maximal possible number of silver dollars in the pot after the n-th turn. What is the probability that the actual number of silver dollars in the pot after n turns is equal to m_n − 1?

The number of silver dollars in the pot follows a branching process {Z_n}_{n ∈ N_0} with Z_0 = 3 and offspring distribution whose generating function P(s) is given by

P(s) = 1/3 + (1/6)s + (1/6)s^2 + (1/3)s^3.

1. The generating function P_{Z_n} of Z_n is [P(P(... P(s) ...))]^3, so

E[Z_n] = P'_{Z_n}(1) = 3 (P'(1))^n = 3 (1/6 + 1/3 + 1)^n = 3 (3/2)^n = 3^{n+1}/2^n.

2. The probability p that the game will stop is the extinction probability of the process. We solve the extinction equation P(s) = s to get the extinction probability corresponding to the case Z_0 = 1, where there is 1 silver dollar in the pot:

P(s) = s  ⟺  1/3 + (1/6)s + (1/6)s^2 + (1/3)s^3 = s  ⟺  (s − 1)(s + 2)(2s − 1) = 0.

The smallest solution in [0, 1] of the equation above is 1/2. To get the true (corresponding to Z_0 = 3) extinction probability we raise it to the third power to get p = 1/8.

3. The maximal number m_n of silver dollars in the pot is achieved if we roll 5 or 6 each time. In that case, there will be 3 rolls in the first turn, 9 = 3^2 in the second, 27 = 3^3 in the third, etc. That means that after n turns, there will be at most m_n = 3^{n+1} silver dollars in the pot. In order to have exactly m_n − 1 silver dollars in the pot after n turns, we must keep getting 5s and 6s throughout the first n − 1 turns. The probability of that is

(1/3)^{3 + 3^2 + ... + 3^{n−1}} = (1/3)^{(3^n − 3)/2}.

After that, during turn n we must get a 5 or a 6 in 3^n − 1 throws and a 4 in the single remaining throw. There are 3^n possible ways to choose the order of the throw which produces the single 4, so the latter probability is

3^n (1/3)^{3^n − 1} (1/6) = (1/2) 3^n (1/3)^{3^n}.

We multiply the two probabilities to get

P[Z_n = m_n − 1] = (1/2) 3^n (1/3)^{(3^{n+1} − 3)/2}.

Problem 4.28.
It is a well-known fact(oid) that armadillos always have identical quadruplets (four offspring). Each of the 4 little armadillos has a 1/3 chance of becoming a doctor, a lawyer or a scientist, independently of its siblings. A doctor armadillo will reproduce further with probability 1/2, a lawyer with probability 1/2 and a scientist with probability 1/4, again, independently of everything else. If it reproduces at all, an armadillo reproduces only once in its life, and then leaves the armadillo scene. (For the purposes of this problem assume that armadillos reproduce asexually.) Let us call the armadillos who have offspring fertile.

CHAPTER 4.

1. What is the distribution of the number of fertile offspring? Write down its generating function.

2. What is the generating function for the number of great-grandchildren an armadillo will have? What is its expectation? (Note: do not expand powers of sums.)

3. Let the armadillo population be modeled by a branching process, and let's suppose that it starts from exactly one individual at time 0. Is it certain that the population will go extinct sooner or later?

1. Each armadillo is fertile with probability p, where

p = P[Fertile] = P[Fertile | Lawyer] P[Lawyer] + P[Fertile | Doctor] P[Doctor] + P[Fertile | Scientist] P[Scientist]
  = (1/2)(1/3) + (1/2)(1/3) + (1/4)(1/3) = 5/12.

Therefore, the number of fertile offspring is binomial with n = 4 and p = 5/12. The generating function of this distribution is P(s) = (7/12 + (5/12)s)^4.

2. To get the number of great-grandchildren, we first compute the generating function of the number of fertile grandchildren. This is simply given by the composition of P with itself, i.e., P(P(s)). Finally, the number of great-grandchildren is the number of fertile grandchildren multiplied by 4. Therefore, its generating function is given by

Q(s) = P(P(s^4)) = (7/12 + (5/12)(7/12 + (5/12)s^4)^4)^4.

To compute the expectation, we need to evaluate Q'(1):

Q'(s) = (P(P(s^4)))' = P'(P(s^4)) P'(s^4) 4s^3,  and  P'(s) = 4 (5/12) (7/12 + (5/12)s)^3,

so that Q'(1) = 4 P'(1) P'(1) = 4 (5/3)^2 = 100/9.

3. We need to consider the population of fertile armadillos. Its offspring distribution has generating function P(s) = (7/12 + (5/12)s)^4, so the population will go extinct with certainty if and only if the extinction probability is 1, i.e., if s = 1 is the smallest solution of the extinction equation s = P(s). We know, however, that P'(1) = 4 · 5/12 = 5/3 > 1, so there exists a positive solution to P(s) = s which is smaller than 1. Therefore, it is not certain that the population will become extinct sooner or later.

Problem 4.29. Branching in alternating environments.
Suppose that a branching process {Z_n}_{n ∈ N_0} is constructed in the following way: it starts with one individual. The individuals in odd generations reproduce according to an offspring distribution with generating function P_odd(s), and those in even generations according to an offspring distribution with generating function P_even(s). All independence assumptions are the same as in the classical case.

CHAPTER 4.

1. Find an expression for the generating function P_{Z_n} of Z_n.

2. Derive the extinction equation.

1. Since n = 0 is an even generation, the distribution of the number of offspring of the initial individual is given by P_even(s). Each of the Z_1 individuals in the generation n = 1 reproduces according to the generating function P_odd, so

P_{Z_2}(s) = P_even(P_odd(s)).

Similarly, P_{Z_3}(s) = P_even(P_odd(P_even(s))), and, in general, for n ∈ N_0,

P_{Z_{2n}}(s) = P_even(P_odd(P_even(... P_odd(s) ...)))   (2n P's), and
P_{Z_{2n+1}}(s) = P_even(P_odd(P_even(... P_even(s) ...)))   (2n + 1 P's).   (4.6)

2. To compute the extinction probability, we must evaluate the limit p = P[E] = lim_n P[Z_n = 0] = lim_n P_{Z_n}(0). Since the limit exists (see the derivation in the classical case in the notes), we know that any subsequence also converges towards the same limit. In particular, p = lim_n x_n, where x_n = P[Z_{2n} = 0] = P_{Z_{2n}}(0). By the expression for P_{Z_{2n}} above, we know that x_{n+1} = P_even(P_odd(x_n)), and so

p = lim_n x_{n+1} = lim_n P_even(P_odd(x_n)) = P_even(P_odd(lim_n x_n)) = P_even(P_odd(p)),

which identifies

p = P_even(P_odd(p))   (4.7)

as the extinction equation. I will leave it up to you to figure out whether p is characterized as the smallest positive solution to (4.7).

Problem 4.30. In a branching process, the offspring distribution is given by its generating function

P(s) = as^2 + bs + c,

where a, b, c > 0.

(i) Find the extinction probability for this branching process.

CHAPTER 4.

(ii) Give a condition for sure extinction.

(i) By Proposition 7.5 in the lecture notes, the extinction probability is the smallest non-negative solution to P(x) = x. So, in this case:

ax^2 + bx + c = x  ⟺  ax^2 + (b − 1)x + c = 0.

The possible solutions are

x_{1,2} = (1/(2a)) ( (1 − b) ± sqrt((b − 1)^2 − 4ac) ).

Recall that it is necessary that P(1) = a + b + c = 1. So,

x_{1,2} = (1/(2a)) ( (1 − b) ± sqrt((a + c)^2 − 4ac) ) = (1/(2a)) ( (1 − b) ± sqrt((a − c)^2) ) = (1/(2a)) ( a + c ± |a − c| ).

Now, we distinguish two cases. If a > c, then

x_{1,2} = (1/(2a)) ( a + c ± (a − c) ).

So,

x_1 = (1/(2a)) (a + c + a − c) = 1  and  x_2 = (1/(2a)) (a + c − a + c) = c/a < 1.

The solution we are looking for is c/a.

If a ≤ c, then

x_{1,2} = (1/(2a)) ( a + c ± (c − a) ).

So,

x_1 = (1/(2a)) (a + c + c − a) = c/a ≥ 1  and  x_2 = (1/(2a)) (a + c − c + a) = a/a = 1.

The solution we are looking for is 1, i.e., in this case there is sure extinction.
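The two cases can be checked numerically; a small sketch (the function name and the sample values of a, b, c are mine):

```python
def extinction_quadratic(a, b, c):
    """Smallest non-negative root of a*s^2 + (b - 1)*s + c = 0, i.e. the
    extinction probability for the offspring law p_2 = a, p_1 = b, p_0 = c
    (assumes a + b + c = 1)."""
    disc = (b - 1) ** 2 - 4 * a * c
    r1 = ((1 - b) - disc ** 0.5) / (2 * a)
    r2 = ((1 - b) + disc ** 0.5) / (2 * a)
    return min(r for r in (r1, r2) if r >= 0)

p_super = extinction_quadratic(0.5, 0.3, 0.2)   # a > c: expect c/a = 0.4
p_sub = extinction_quadratic(0.2, 0.3, 0.5)     # a <= c: expect sure extinction, 1
```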

CHAPTER 4.

(ii) The discussion above immediately yields the answer to question (ii): the necessary and sufficient condition for certain extinction is a ≤ c.

Problem 4.31. The purpose of this problem is to describe a class of offspring distributions (pmfs) for which an expression for P_{Z_n}(s) can be obtained. An N_0-valued distribution is said to be of fractional-linear type if its generating function P has the following form

P(s) = (as + b) / (1 − cs),   (4.8)

for some constants a, b, c ≥ 0. In order for P to be a generating function of a probability distribution we must have P(1) = 1, i.e., a + b + c = 1, which will be assumed throughout the problem.

1. What (familiar) distributions correspond to the following special cases (in each case identify the distribution and its parameters):

(a) c = 0
(b) a = 0

2. Let A be the following matrix

A = [ a   b ]
    [ −c  1 ],

and let

A^n = [ a^(n)  b^(n) ]
      [ c^(n)  d^(n) ]

be its n-th power (using matrix multiplication, of course). Using mathematical induction prove that

P(P(... P(s) ...)) = (a^(n) s + b^(n)) / (c^(n) s + d^(n))   (n P's).   (4.9)

3. Take a = 0 and b = c = 1/2. Show inductively that

A^n = (1/2)^n [ −(n − 1)   n     ]
              [ −n         n + 1 ].   (4.10)

Use that to write down the generating function of Z_n in the linear-fractional form (4.8).

4. Find the extinction probability as a function of a, b and c in the general case (don't forget to use the identity a + b + c = 1!).

Note: It seems that not everybody is familiar with the principle of mathematical induction. It is a logical device you can use if you have a conjecture and want to prove that it is true for all n ∈ N. Your conjecture will typically look like: "For each n ∈ N, the statement P(n) holds,"

CHAPTER 4.

where P(n) is some assertion which depends on n. (For example, P(n) could be

1 + 2 + 3 + ... + n = n(n + 1)/2.   (4.11)

) The principle of mathematical induction says that you can prove that your statement is true for all n ∈ N by doing the following two things:

1. Prove the statement for n = 1, i.e., prove P(1). (induction basis)

2. Prove that the implication P(n) ⇒ P(n + 1) always holds, i.e., prove the statement for n + 1 if you are, additionally, allowed to use the statement for n as a hypothesis. (inductive step)

As an example, let us prove the statement (4.11) above. For n = 1, P(1) reads 1 = 1 · 2/2, which is evidently true. Supposing that P(n) holds, i.e., that

1 + 2 + ... + n = n(n + 1)/2,

we can add n + 1 to both sides to get

1 + 2 + ... + n + (n + 1) = n(n + 1)/2 + (n + 1) = (n(n + 1) + 2(n + 1))/2 = (n + 1)(n + 2)/2,

which is exactly P(n + 1). Therefore, we managed to prove P(n + 1) using P(n) as a crutch. The principle of mathematical induction says that this is enough to be able to conclude that P(n) holds for each n, i.e., that (4.11) is a true statement for all n ∈ N.

Back to the solution of the problem:

1. (a) When c = 0, P(s) = as + b: a Bernoulli distribution with success probability a (= 1 − b).

(b) For a = 0, P(s) = b/(1 − cs): a geometric distribution with success probability b (= 1 − c).

2. For n = 1, a^(1) = a, b^(1) = b, c^(1) = −c and d^(1) = 1, so clearly

P(s) = (a^(1) s + b^(1)) / (c^(1) s + d^(1)).

Suppose that the equality (4.9) holds for some n ∈ N. Then

[ a^(n+1)  b^(n+1) ]           [ a^(n)  b^(n) ] [ a   b ]   [ a a^(n) − c b^(n)   b a^(n) + b^(n) ]
[ c^(n+1)  d^(n+1) ] = A^n A = [ c^(n)  d^(n) ] [ −c  1 ] = [ a c^(n) − c d^(n)   b c^(n) + d^(n) ].

On the other hand, by the inductive assumption,

P(P(... P(s) ...))   (n + 1 P's)
= (a^(n) P(s) + b^(n)) / (c^(n) P(s) + d^(n))
= (a^(n) (as + b)/(1 − cs) + b^(n)) / (c^(n) (as + b)/(1 − cs) + d^(n))
= ((a a^(n) − c b^(n)) s + b a^(n) + b^(n)) / ((a c^(n) − c d^(n)) s + b c^(n) + d^(n)).

Therefore, (4.9) also holds for n + 1. By induction, (4.9) holds for all n ∈ N.

CHAPTER 4.

3. Here

A = [ 0     1/2 ] = (1/2) [ 0   1 ] = (1/2)^1 [ −(1 − 1)   1     ]
    [ −1/2  1   ]         [ −1  2 ]           [ −1         1 + 1 ],

so the statement holds for n = 1 (induction basis). Suppose that (4.10) holds for some n. Then

A^{n+1} = A^n A = (1/2)^n [ −(n − 1)   n     ] (1/2) [ 0   1 ]
                          [ −n         n + 1 ]       [ −1  2 ]
        = (1/2)^{n+1} [ −n         n + 1 ]
                      [ −(n + 1)   n + 2 ],

which is exactly (4.10) for n + 1 (inductive step). Thus, (4.10) holds for all n ∈ N. By the previous part, the generating function of Z_n is given by

P_{Z_n}(s) = (−(n − 1)s + n) / (−ns + (n + 1)).

We divide the numerator and the denominator by n + 1 to get the above expression into the form dictated by (4.8):

P_{Z_n}(s) = ( ((1 − n)/(n + 1)) s + n/(n + 1) ) / ( 1 − (n/(n + 1)) s ).

Note that a = (1 − n)/(n + 1), b = n/(n + 1) and c = n/(n + 1), so that a + b + c = (1 − n + n + n)/(n + 1) = 1, as required.

4. For the extinction probability we need to find the smallest solution of

s = (as + b)/(1 − cs)

in [0, 1]. The equation above transforms into a quadratic equation after we multiply both sides by 1 − cs:

s − cs^2 = as + b,  i.e.,  cs^2 + (a − 1)s + b = 0.   (4.12)

We know that s = 1 is a solution and that a − 1 = −(b + c), so we can factor (4.12) as

cs^2 + (a − 1)s + b = (s − 1)(cs − b).

If c = 0, the only solution is s = 1. If c ≠ 0, the solutions are 1 and b/c. Therefore, the extinction probability P[E] is given by

P[E] = 1, if c = 0;  min(1, b/c), otherwise.
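Both the matrix representation (4.9) and the closed form for the case a = 0, b = c = 1/2 can be cross-checked numerically; a sketch (the helper names are mine):

```python
def mat_mul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_pow(A, n):
    R = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        R = mat_mul(R, A)
    return R

P = lambda s: 0.5 / (1 - 0.5 * s)      # fractional-linear law, a = 0, b = c = 1/2
A = [[0.0, 0.5], [-0.5, 1.0]]          # the associated matrix

n, s = 7, 0.3
(an, bn), (cn, dn) = mat_pow(A, n)

# n-fold composition, three ways: via the matrix entries, by direct
# iteration, and via the closed form P_{Z_n}(s) = (n - (n-1)s)/((n+1) - ns).
via_matrix = (an * s + bn) / (cn * s + dn)
via_composition = s
for _ in range(n):
    via_composition = P(via_composition)
closed_form = (n - (n - 1) * s) / ((n + 1) - n * s)
```

All three evaluations agree up to floating-point error, which is exactly the content of parts 2 and 3.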

CHAPTER 4.

Problem 4.32. (Too hard to appear on an exam, but see if you can understand its solution. Don't worry if you can't.) For a branching process {Z_n}_{n ∈ N_0}, denote by S the total number of individuals that ever lived, i.e., set

S = sum_{n=0}^∞ Z_n = 1 + sum_{n=1}^∞ Z_n.

(i) Assume that the offspring distribution has the generating function given by

P(s) = p + qs.

Find the generating function P_S in this case.

(ii) Assume that the offspring distribution has the generating function given by

P(s) = p/(1 − qs).

Find P_S in this case.

(iii) Find the general expression for E[S] and calculate this expectation in the special cases (i) and (ii).

In general, the r.v. S satisfies the relationship

P_S(s) = E[s^{1 + sum_{n=1}^∞ Z_n}]
       = sum_{k=0}^∞ E[s^{1 + sum_{n=1}^∞ Z_n} | Z_1 = k] P[Z_1 = k]
       = sum_{k=0}^∞ s E[s^{sum_{n=1}^∞ Z_n} | Z_1 = k] P[Z_1 = k].

When Z_1 = k, the expression sum_{n=1}^∞ Z_n counts the total number of individuals in k separate and independent branching processes - one for each of the k members of the generation at time n = 1. Since this random variable is the sum of k independent random variables, each of which has the same distribution as S (why?), we have

E[s^{sum_{n=1}^∞ Z_n} | Z_1 = k] = [P_S(s)]^k.

Consequently, P_S is a solution of the following equation

P_S(s) = s sum_{k=0}^∞ [P_S(s)]^k P[Z_1 = k] = s P(P_S(s)).

(i) In this case,

P_S(s) = s(p + q P_S(s))  ⟹  P_S(s) = sp/(1 − sq).
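The claimed solution in case (i) can be checked directly against the functional equation P_S(s) = s P(P_S(s)); a sketch (the sample value p = 0.3 is mine):

```python
p, q = 0.3, 0.7                       # offspring law P(s) = p + q s
P = lambda s: p + q * s
P_S = lambda s: s * p / (1 - s * q)   # claimed total-progeny generating function

# P_S solves the recursion P_S(s) = s * P(P_S(s)) ...
for s in (0.0, 0.25, 0.5, 0.75, 1.0):
    assert abs(P_S(s) - s * P(P_S(s))) < 1e-12

# ... and its derivative at s = 1 matches E[S] = 1/(1 - mu) with mu = q.
h = 1e-6
dS = (P_S(1 + h) - P_S(1 - h)) / (2 * h)
```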

CHAPTER 4.

(ii) Here, P_S(s) must satisfy P_S(s)(1 − q P_S(s)) − sp = 0, i.e., q P_S(s)^2 − P_S(s) + sp = 0. Solving the quadratic, we get as the only sensible solution (the one that has P_S(0) = 0):

P_S(s) = (1 − sqrt(1 − 4spq)) / (2q).

Note that P_S(1) < 1 if p < q. Does that surprise you? It should not: it is possible that S = +∞.

(iii) Using our main recursion for P_S(s), we get

P'_S(s) = (s P(P_S(s)))' = P(P_S(s)) + s P'(P_S(s)) P'_S(s).

So, for s = 1,

E[S] = P'_S(1) = P(P_S(1)) + P'(P_S(1)) P'_S(1) = 1 + µ E[S].

If µ ≥ 1, E[S] = +∞. If µ < 1,

E[S] = 1/(1 − µ).

4.6 Markov chains - classification of states

Problem 4.33. Let {X_n}_{n ∈ N_0} be a simple symmetric random walk, and let {Y_n}_{n ∈ N_0} be a random process whose value at time n ∈ N_0 is equal to the amount of time (number of steps, including possibly n) the process {X_n}_{n ∈ N_0} has spent above 0, i.e., in the set {1, 2, ...}. Then

(a) {Y_n}_{n ∈ N_0} is a Markov process
(b) Y_n is a function of X_n for each n ∈ N
(c) X_n is a function of Y_n for each n ∈ N
(d) Y_n is a stopping time for each n ∈ N_0
(e) None of the above

The correct answer is (e).

CHAPTER 4.

(a) False. The event {Y_3 = 1} corresponds to exactly two possible paths of the random walk - {X_1 = 1, X_2 = 0, X_3 = −1} and {X_1 = −1, X_2 = 0, X_3 = 1} - each occurring with probability 1/8. In the first case, there is no way that Y_4 = 2, and in the second one, Y_4 = 2 if and only if, additionally, X_4 = 2. Therefore,

P[Y_4 = 2 | Y_3 = 1] = P[Y_4 = 2 and Y_3 = 1] / P[Y_3 = 1] = 4 P[X_1 = −1, X_2 = 0, X_3 = 1, X_4 = 2] = 1/4.

On the other hand,

P[Y_4 = 2 | Y_3 = 1, Y_2 = 1, Y_1 = 1, Y_0 = 0] = P[Y_4 = 2 and X_1 = 1, X_2 = 0, X_3 = −1] / P[X_1 = 1, X_2 = 0, X_3 = −1] = 0.

(b) False. Y_n is a function of the entire past X_0, X_1, ..., X_n, but not of the individual value X_n.

(c) False. Except in trivial cases, it is impossible to know the value of X_n if you only know how many of the past n values are positive.

(d) False. The fact that, e.g., Y_10 = 1 (meaning that X hits 1 exactly once in its first 10 steps and immediately returns to 0) cannot possibly be known at time 1.

(e) True.

Problem 4.34. Suppose that all classes of a Markov chain are recurrent, and let i, j be two states such that i → j. Then

(a) for each state k, either i → k or j → k
(b) j → i
(c) p_ji > 0 or p_ij > 0
(d) sum_{n=1}^∞ p^(n)_jj < ∞
(e) None of the above

The correct answer is (b).

(a) False. Take a chain with two states 1, 2, where p_11 = p_22 = 1, and set i = j = 1, k = 2.

(b) True. Recurrent classes are closed, so i and j belong to the same class. Therefore j → i.

(c) False. Take a chain with 4 states 1, 2, 3, 4, where p_12 = p_23 = p_34 = p_41 = 1, and set i = 1, j = 3.

(d) False. That would mean that j is transient.

(e) False.
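The two conditional probabilities in part (a) of the random-walk problem above can be verified by brute-force enumeration of all 16 equally likely four-step paths:

```python
from itertools import product

paths = list(product([-1, 1], repeat=4))   # the 16 equally likely 4-step walks

def Y(path, n):
    """Time spent by the walk in {1, 2, ...} during steps 1, ..., n."""
    pos, count = 0, 0
    for step in path[:n]:
        pos += step
        if pos > 0:
            count += 1
    return count

y3_is_1 = [w for w in paths if Y(w, 3) == 1]
p_cond = sum(Y(w, 4) == 2 for w in y3_is_1) / len(y3_is_1)   # P[Y_4 = 2 | Y_3 = 1]

# Conditioning on the full history Y_1 = Y_2 = Y_3 = 1 forces X_1 = 1,
# X_2 = 0, X_3 = -1, after which Y_4 = 2 is impossible.
finer = [w for w in y3_is_1 if Y(w, 1) == 1 and Y(w, 2) == 1]
p_finer = sum(Y(w, 4) == 2 for w in finer) / len(finer)
```

The two values differ (1/4 versus 0), which is exactly the failure of the Markov property.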

CHAPTER 4.

Problem 4.35. Which of the following statements is true? Give a short explanation (or a counterexample where appropriate) for your choice. Here, {X_n}_{n ∈ N_0} is a Markov chain with state space Z.

1. {X_n^2}_{n ∈ N_0} must be a Markov chain.

2. {X_n^2}_{n ∈ N_0} does not have to be a Markov chain.

3. There exists a set T and a function f : S → T such that {f(X_n)}_{n ∈ N_0} is not a Markov chain.

4. There exists a set T and a function f : S → T which is not a bijection (one-to-one and onto) such that {f(X_n)}_{n ∈ N_0} is a Markov chain.

5. None of the above.

Problem 4.36. Two containers are filled with ping-pong balls. The red container has 100 red balls, and the blue container has 100 blue balls. In each step a container is selected; red with probability 1/2 and blue with probability 1/2. Then, a ball is selected from it - all balls in the container are equally likely to be selected - and placed in the other container. If the selected container is empty, no ball is transferred. Once there are 100 blue balls in the red container and 100 red balls in the blue container, the game stops (and we stay in that state forever). We decide to model the situation as a Markov chain.

1. What is the state space S we can use? How large is it?

2. What is the initial distribution?

3. What are the transition probabilities between states? Don't write the matrix, it is way too large; just write a general expression for p_ij, i, j ∈ S.

4. How many transient states are there?

There are many ways in which one can solve this problem. Below is just one of them.

1. In order to describe the situation being modeled, we need to keep track of the number of balls of each color in each container. Therefore, one possibility is to take the set of all quadruplets (r, b, R, B), r, b, R, B ∈ {0, 1, 2, ..., 100}; this state space would have 101^4 elements.
We know, however, that the total number of red balls, and the total number of blue balls, is always equal to 100, so the knowledge of the composition of the red (say) container is enough to reconstruct the contents of the blue container. In other words, we can use the number of balls of each color in the red container only as our state, i.e., S = {(r, b) : r, b = 0, 1, ..., 100}. This state space has 101 × 101 = 10201 elements.

CHAPTER 4.

2. The initial distribution of the chain is deterministic: P[X_0 = (100, 0)] = 1 and P[X_0 = i] = 0, for i ∈ S \ {(100, 0)}. In the vector notation, a^(0) = (0, 0, ..., 1, 0, 0, ..., 0), where the 1 is at the place corresponding to (100, 0).

3. Let us consider several separate cases, under the understanding that p_ij = 0 for all i, j not mentioned explicitly below:

(a) One of the containers is empty. In that case, we are either in (0, 0) or in (100, 100). Let us describe the situation for (0, 0) first. If we choose the red container - and that happens with probability 1/2 - we stay in (0, 0): p_(0,0),(0,0) = 1/2. If the blue container is chosen, a ball of either color will be chosen with probability 100/200 = 1/2, so p_(0,0),(1,0) = p_(0,0),(0,1) = 1/4. By the same reasoning, p_(100,100),(100,100) = 1/2 and p_(100,100),(99,100) = p_(100,100),(100,99) = 1/4.

(b) We are in the state (0, 100). By the description of the model, this is an absorbing state, so p_(0,100),(0,100) = 1.

(c) All other states. Suppose we are in the state (r, b), where (r, b) ∉ {(0, 100), (0, 0), (100, 100)}. If the red container is chosen, then the probability of getting a red ball is r/(r + b), so

p_(r,b),(r−1,b) = (1/2) r/(r + b).

Similarly,

p_(r,b),(r,b−1) = (1/2) b/(r + b).

In the blue container there are 100 − r red and 100 − b blue balls. Thus,

p_(r,b),(r+1,b) = (1/2) (100 − r)/(200 − r − b),  and  p_(r,b),(r,b+1) = (1/2) (100 − b)/(200 − r − b).

4. The state (0, 100) is clearly recurrent (it is absorbing). Let us show that all other states (10200 of them) are transient. It will be enough to show that i → (0, 100), for each i ∈ S, because (0, 100) is in a recurrent class by itself. To do that, we just need to find a path from any state (r, b) to (0, 100) through the transition graph:

(r, b) → (r − 1, b) → ... → (0, b) → (0, b + 1) → ... → (0, 100).
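The case analysis above can be encoded and sanity-checked in a few lines; a sketch (the function name is mine), verifying that the outgoing probabilities out of every state sum to one:

```python
def transitions(state):
    """Transition probabilities out of state (r, b) = (number of red and
    blue balls in the red container), as a dict {next_state: probability}."""
    r, b = state
    if state == (0, 100):                 # the absorbing state
        return {(0, 100): 1.0}
    probs = {}
    def add(s, p):
        if p > 0:
            probs[s] = probs.get(s, 0.0) + p
    if r + b == 0:                        # red container chosen but empty
        add((r, b), 0.5)
    else:                                 # red container chosen
        add((r - 1, b), 0.5 * r / (r + b))
        add((r, b - 1), 0.5 * b / (r + b))
    R, B = 100 - r, 100 - b               # contents of the blue container
    if R + B == 0:                        # blue container chosen but empty
        add((r, b), 0.5)
    else:                                 # blue container chosen
        add((r + 1, b), 0.5 * R / (R + B))
        add((r, b + 1), 0.5 * B / (R + B))
    return probs

# every state's outgoing probabilities sum to one
assert all(abs(sum(transitions((r, b)).values()) - 1.0) < 1e-12
           for r in range(101) for b in range(101))
```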

CHAPTER 4.

Problem 4.37. Let {Z_n}_{n ∈ N_0} be a branching process (with state space S = {0, 1, 2, 3, 4, ...} = N_0) with the offspring probability given by p_0 = 1/2, p_2 = 1/2. Classify the states (find classes), and describe all closed sets.

If i = 0, then i is an absorbing state (and, therefore, a class in itself). For i ∈ N, p_ij > 0 if and only if j is of the form 2k for some k ≤ i (out of i individuals, i − k die and the remaining k have two children each). In particular, no one ever communicates with an odd i ∈ N. Therefore, each odd i is a class in itself. An even i ≥ 2 communicates with all even numbers from 0 to 2i; in particular, i → i + 2 and i + 2 → i. Therefore, all positive even numbers intercommunicate and form a class.

Clearly, {0} is a closed set. Every other closed set must contain a positive even number; indeed, after one step we are in an even state, no matter where we start from, and this even number can be positive, unless the initial state was 0. Therefore, any closed set different from {0} will have to contain the set of all even numbers 2N_0 = {0, 2, 4, ...}, and 2N_0 is a closed set itself. On the other hand, if we add any collection of odd numbers to 2N_0, the resulting set will still be closed (exiting it would mean hitting an odd number outside of it). Therefore, the closed sets are exactly the sets of the form C = {0} or C = 2N_0 ∪ D, where D is any set of odd numbers.

Problem 4.38. Which of the following statements is true? Give a short explanation (or a counterexample where appropriate) for your choice. {X_n}_{n ∈ N_0} is a Markov chain with state space S.

1. p^(n)_ij ≥ f^(n)_ij for all i, j ∈ S and all n ∈ N.

2. If states i and j intercommunicate, then there exists n ∈ N such that p^(n)_ij > 0 and p^(n)_ji > 0.

3. If all rows of the transition matrix are equal, then all states belong to the same class.

4. If f_ij < 1 and f_ji < 1, then i and j do not intercommunicate.

5. If P^n → I, then all states are recurrent.
(Note: We say that a sequence {A_n}_{n ∈ N} of matrices converges to the matrix A, and we denote it by A_n → A, if (A_n)_ij → A_ij, as n → ∞, for all i, j.)

1. TRUE. p^(n)_ij is the probability that X_n = j (given that X_0 = i). On the other hand, f^(n)_ij is the probability that X_n = j and X_k ≠ j for all k = 1, ..., n − 1 (given X_0 = i). Clearly, the second event is a subset of the first, so f^(n)_ij ≤ p^(n)_ij.

2. FALSE. Consider the Markov chain which moves deterministically around the cycle 1 → 2 → 3 → 1, i.e., p_12 = p_23 = p_31 = 1.

CHAPTER 4.

All three states intercommunicate, but p^(n)_12 > 0 if and only if n is of the form 3k + 1, for k ∈ N_0. On the other hand, p^(n)_21 > 0 if and only if n = 3k + 2, for some k ∈ N_0. Thus, p^(n)_12 and p^(n)_21 are never simultaneously positive.

3. FALSE. Consider a Markov chain with the following transition matrix:

P = [ 0  1 ]
    [ 0  1 ].

Then 2 is an absorbing state and it is in a class of its own, so it is not true that all states belong to the same class.

4. FALSE. Consider a Markov chain on the states 1, 2, 3 with p_12 = p_13 = 1/2, p_21 = p_23 = 1/2 and p_33 = 1. Then the states 1 and 2 clearly intercommunicate. In the event that (starting from 1) the first transition happens to be 1 → 3, we never visit 2. Otherwise, we visit 2 at time 1. Therefore, f_12 = 1/2 < 1. Similarly, f_21 = 1/2 < 1.

5. TRUE. Suppose that there exists a transient state i ∈ S. Then sum_n p^(n)_ii < ∞, and, in particular, p^(n)_ii → 0, as n → ∞. This is a contradiction with the assumption that p^(n)_ii → 1, for all i ∈ S.
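The positivity pattern claimed in statement 2 can be checked by taking powers of the transition matrix of the deterministic cycle 1 → 2 → 3 → 1:

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P = [[0, 1, 0],        # deterministic cycle 1 -> 2 -> 3 -> 1
     [0, 0, 1],
     [1, 0, 0]]

Pn = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # identity = P^0
pattern = []
for n in range(1, 10):
    Pn = mat_mul(Pn, P)
    # record (p^(n)_12 > 0, p^(n)_21 > 0)
    pattern.append((Pn[0][1] > 0, Pn[1][0] > 0))
```

The recorded flags confirm that p^(n)_12 > 0 exactly for n ≡ 1 (mod 3), p^(n)_21 > 0 exactly for n ≡ 2 (mod 3), and the two are never positive at the same time.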

CHAPTER 4.

Problem 4.39. Consider a Markov chain whose transition graph is given below.

[Figure: transition graph on the eight states 1, ..., 8, with transition probabilities 0.25, 0.5, 0.75, etc., attached to the arrows.]

1. Identify the classes.

2. Find transient and recurrent states.

3. Find periods of all states.

4. Compute f^(n)_13, for all n ∈ N.

5. Using Mathematica, we can get an approximation of P^10, where P is the transition matrix of the chain. [The 8 × 8 matrix P^10 is displayed here; in it, the rows for the transient states 1 and 2 vanish in the first two columns, the rows for the states 3-6 are supported on the columns 3-6, and the rows for the states 7 and 8 are supported on the columns 7 and 8.] Compute the probability P[X_10 = 3], if the initial distribution (the distribution of X_0) is given by P[X_0 = 1] = 1/3 and P[X_0 = 3] = 2/3.

1. The classes are T_1 = {1}, T_2 = {2}, C_1 = {3, 4, 5, 6} and C_2 = {7, 8}.

2. The states in T_1 and T_2 are transient, and the others are recurrent.

3. The periods are all 1.

CHAPTER 4.

4. For n = 1, f^(n)_13 = 0, since you need at least two steps to go from 1 to 3. For n = 2, the chain needs to follow the path 1 → 2 → 3, so f^(2)_13 = (1/2)^2. For n larger than 2, the only possibility is for the chain to stay in the state 1 for the first n − 2 periods, jump to 2 and finish at 3, so f^(n)_13 = (1/2)^n in that case. All together, we have

f^(n)_13 = 0, for n = 1;  (1/2)^n, for n ≥ 2.

5. We know from the notes that the distribution of X_10, when represented as a vector a^(10) = (a^(10)_1, a^(10)_2, ..., a^(10)_8), satisfies a^(10) = a^(0) P^10. By the assumption, a^(0) = (1/3, 0, 2/3, 0, 0, 0, 0, 0), so

P[X_10 = 3] = a^(10)_3 = (1/3)(P^10)_{13} + (2/3)(P^10)_{33},

where the two entries of P^10 can be read off the matrix above.

Problem 4.40. A country has m + 1 cities (m ∈ N), one of which is the capital. There is a direct railway connection between each city and the capital, but there are no tracks between any two non-capital cities. A traveler starts in the capital and takes a train to a randomly chosen non-capital city (all cities are equally likely to be chosen), spends a night there and returns the next morning and immediately boards the train to the next city according to the same rule, spends the night there, ..., etc. We assume that his choice of the city is independent of the cities visited in the past. Let {X_n}_{n ∈ N_0} be the number of visited non-capital cities up to (and including) day n, so that X_0 = 1, but X_1 could be either 1 or 2, etc.

1. Explain why {X_n}_{n ∈ N_0} is a Markov chain on the appropriate state space S and then find the transition probabilities of {X_n}_{n ∈ N_0}, i.e., write an expression for P[X_{n+1} = j | X_n = i], for i, j ∈ S.

2. Find recurrent and transient classes and periods of all states. Sketch the transition graph for m = 3.

3. Let τ_m be the first time the traveler has visited all m non-capital cities, i.e., τ_m = min{n ∈ N_0 : X_n = m}. What is the distribution of τ_m, for m = 1 and m = 2?

4. Compute E[τ_m] for general m ∈ N. (Note: you don't need to use any heavy machinery. In particular, no knowledge of the absorption and reward techniques is necessary.)

1. The natural state space for {X_n}_{n ∈ N_0} is S = {1, 2, ..., m}.
It is clear that P[X_{n+1} = j | X_n = i] = 0, unless j = i or j = i + 1. If we start from the state i, the process will remain in i if the traveler visits one of the already-visited cities, and move to i + 1 if the visited city has never been visited before. Thanks to the uniform distribution in the choice of the next city, the probability that a never-visited city will be selected is (m − i)/m, and it does not depend on the (names of the) cities already visited, or on the times of their first visits; it only depends on

CHAPTER 4.

their number. Consequently, the extra information about X_1, X_2, ..., X_{n−1} will not change the probability of visiting j in any way, which is exactly what the Markov property is all about. Therefore, {X_n}_{n ∈ N_0} is Markov and its transition probabilities are given by

p_ij = P[X_{n+1} = j | X_n = i] = 0, if j ∉ {i, i + 1};  (m − i)/m, if j = i + 1;  i/m, if j = i.

(Note: the situation would not be nearly as nice if the distribution of the choice of the next city were non-uniform. In that case, the list of the (names of the) already-visited cities would matter, and it is not clear that the described process has the Markov property (does it?).)

2. Once the chain moves to the next state, it can never go back. Therefore, it cannot happen that i ↔ j unless i = j, and so every state is a class of its own. The class {m} is clearly absorbing and, therefore, recurrent. All the other classes are transient since it is possible (and, in fact, certain) that the chain will move on to the next state and never come back. As for periods, all of them are 1, since 1 is in the return set of each state.

[Figure: transition graph for m = 3, with loop probabilities 1/3, 2/3 and forward probabilities 2/3, 1/3.]

3. For m = 1, τ_m = 0, so its distribution is deterministic and concentrated on 0. The case m = 2 is only slightly more complicated. After having visited his first city, the visitor has a probability of 1/2 of visiting it again, on each consecutive day. After a geometrically distributed number of days, he will visit another city and τ_2 will be realized. Therefore, the distribution {p_n}_{n ∈ N_0} of τ_2 is given by

p_0 = 0, p_1 = 1/2, p_2 = (1/2)^2, p_3 = (1/2)^3, ...

4. For m > 1, we can write τ_m as

τ_m = τ_1 + (τ_2 − τ_1) + ... + (τ_m − τ_{m−1}),

so that

E[τ_m] = E[τ_1] + E[τ_2 − τ_1] + ... + E[τ_m − τ_{m−1}].

We know that τ_1 = 0 and, for k = 1, 2, ..., m − 1, the difference τ_{k+1} − τ_k denotes the waiting time before a never-before-visited city is visited, given that the number of already-visited cities is k. This random variable is geometric with success probability (m − k)/m, so its expectation is given by

E[τ_{k+1} − τ_k] = m/(m − k).
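The geometric-step picture can be confirmed by simulation; a sketch (the function name, the seed, and the choice m = 5 are mine), comparing the empirical mean of τ_5 with the value 5(1 + 1/2 + 1/3 + 1/4) = 125/12 ≈ 10.42 obtained by summing the geometric expectations above:

```python
import random

def simulate_tau(m, rng):
    """Days until the traveler has visited all m non-capital cities."""
    visited, days = 1, 0          # X_0 = 1: the first city is visited on day 0
    while visited < m:
        days += 1
        # a never-visited city is chosen with probability (m - visited)/m
        if rng.random() < (m - visited) / m:
            visited += 1
    return days

rng = random.Random(42)
m, trials = 5, 20_000
estimate = sum(simulate_tau(m, rng) for _ in range(trials)) / trials

# Exact value from the geometric step expectations: sum of m/(m - k).
exact = sum(m / (m - k) for k in range(1, m))
```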

CHAPTER 4.

Therefore,

E[τ_m] = sum_{k=1}^{m−1} m/(m − k) = m (1 + 1/2 + 1/3 + ... + 1/(m − 1)).

(Note: When m is large, the partial harmonic sum H_m = 1 + 1/2 + ... + 1/m behaves like log m, so that, asymptotically, E[τ_m] behaves like m log m.)

4.7 Markov chains - absorption and reward

Problem 4.41. In a Markov chain with a finite number of states, the fundamental matrix is given by

F = [ 3    4   ]
    [ 1/4  1/4 ].

The initial distribution of the chain is uniform on all transient states. The expected value of

τ_C = min{n ∈ N_0 : X_n ∈ C},

where C denotes the set of all recurrent states, is

(a) 7
(b) 8
(c) 15/4
(d) 9
(e) None of the above.

The correct answer is (c). Using the reward g = (1, 1)^T, the vector v of expected values of τ_C, where each entry corresponds to a different transient initial state, is

v = F g = [ 3    4   ] [ 1 ]   [ 7   ]
          [ 1/4  1/4 ] [ 1 ] = [ 1/2 ].

Finally, the initial distribution puts equal probabilities on the two transient states, so, by the law of total probability,

E[τ_C] = (1/2) · 7 + (1/2) · (1/2) = 15/4.

Problem 4.42. A fair 6-sided die is rolled repeatedly, and for n ∈ N, the outcome of the n-th roll is denoted by Y_n (it is assumed that {Y_n}_{n ∈ N} are independent of each other). For n ∈ N_0, let X_n be the remainder (taken in the set {0, 1, 2, 3, 4}) left after the sum sum_{k=1}^n Y_k is divided by 5, i.e., X_0 = 0, and

X_n = sum_{k=1}^n Y_k (mod 5), for n ∈ N,

making {X_n}_{n ∈ N_0} a Markov chain on the state space {0, 1, 2, 3, 4} (no need to prove this fact).

1. Write down the transition matrix of the chain.

2. Classify the states, separate recurrent from transient ones, and compute the period of each state.

3. Compute the expected number of rolls before the first time {X_n}_{n in N_0} visits the state 3, i.e., compute E[tau_3], where tau_3 = min{n in N_0 : X_n = 3}.
4. Compute E[ sum_{k=0}^{tau_3} X_k ].
(Note: for parts (3) and (4), you can phrase your answer as (B^{-1} C)_{ij}, where the matrices B and C, as well as the row i and the column j, have to be given explicitly. You don't need to evaluate B^{-1}.)

1. The outcomes 1, 2, 3, 4, 5, 6 leave remainders 1, 2, 3, 4, 0, 1 when divided by 5, so each roll increases the current state by 1 (mod 5) with probability 2/6 and by 0, 2, 3 or 4 (mod 5) with probability 1/6 each. The transition matrix P of the chain is therefore given by

P = (1/6) x
[ 1 2 1 1 1 ]
[ 1 1 2 1 1 ]
[ 1 1 1 2 1 ]
[ 1 1 1 1 2 ]
[ 2 1 1 1 1 ]

2. Since p_ij > 0 for all i, j in S, all the states belong to the same class, and, because there is at least one recurrent state in a finite-state-space Markov chain and because recurrence is a class property, all states are recurrent. Finally, 1 is in the return set of every state, so the period of each state is 1.

3. In order to be able to use absorption-and-reward computations, we turn the state 3 into an absorbing state, i.e., modify the transition matrix by replacing the row of the state 3 with (0, 0, 0, 1, 0). Now, the first hitting time of the state 3 corresponds exactly to the absorption time, i.e., the time tau_C at which the chain hits its first recurrent state (note that for the new transition probabilities the state 3 is recurrent and all the other states are transient). In the canonical form (states ordered 3; 0, 1, 2, 4), the new transition matrix looks like

(1/6) x
[ 6 | 0 0 0 0 ]
[ 1 | 1 2 1 1 ]
[ 1 | 1 1 2 1 ]
[ 2 | 1 1 1 1 ]
[ 1 | 2 1 1 1 ]
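The absorption set-up can also be carried out numerically; an illustrative Python sketch (the notes use Mathematica) that solves (I - Q)v = (1, ..., 1)^T exactly for the expected hitting time of the state 3:

```python
from fractions import Fraction

# One roll of a fair six-sided die advances the state by Y mod 5:
# the increment is 1 with probability 2/6 (rolls 1 and 6) and
# 0, 2, 3, 4 with probability 1/6 each (rolls 5, 2, 3, 4).
inc = {0: Fraction(1, 6), 1: Fraction(2, 6), 2: Fraction(1, 6),
       3: Fraction(1, 6), 4: Fraction(1, 6)}

T = [0, 1, 2, 4]          # transient states once 3 is made absorbing
n = len(T)

# Expected hitting times of 3 solve (I - Q) v = (1, ..., 1)^T.
A = [[(Fraction(1) if i == j else Fraction(0)) - inc[(T[j] - T[i]) % 5]
      for j in range(n)] for i in range(n)]
b = [Fraction(1)] * n

# Gaussian elimination with exact rational arithmetic.
for col in range(n):
    piv = next(r for r in range(col, n) if A[r][col] != 0)
    A[col], A[piv] = A[piv], A[col]
    b[col], b[piv] = b[piv], b[col]
    for r in range(n):
        if r != col and A[r][col] != 0:
            f = A[r][col] / A[col][col]
            A[r] = [x - f * y for x, y in zip(A[r], A[col])]
            b[r] -= f * b[col]

v = [b[i] / A[i][i] for i in range(n)]
print(v[0])               # E[tau_3] starting from X_0 = 0
```

Replacing the right-hand side (1, 1, 1, 1) by the reward vector (0, 1, 2, 4) gives the sum in part (4) in the same way.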

Therefore, Q is the 4x4 block of the canonical form corresponding to the transient states 0, 1, 2, 4, and F = (I - Q)^{-1}. The expected time until absorption is given by an absorption-and-reward computation with the reward vector g_1 = (1, 1, 1, 1)^T, so that, since we are starting from the state 0, we get

E[tau_3] = (F g_1)_1.

4. If the sum went from 0 to tau_3 - 1 (up to, but not including, the state where we get absorbed), it would correspond exactly to an absorption-and-reward computation with reward g_2 given by the values of the transient states, g_2 = (0, 1, 2, 4)^T. Therefore, E[ sum_{k=0}^{tau_3 - 1} X_k ] = (F g_2)_1. To get the final answer, we need to add the value X_{tau_3} (which is clearly equal to 3) to the sum:

E[ sum_{k=0}^{tau_3} X_k ] = (F g_2)_1 + 3.

Problem 4.43. In a Gambler's ruin situation with a = 3 (S = {0, 1, 2, 3}) and success probability p in (0, 1), compute the probability that the gambler will go bankrupt before her wealth reaches a, if her initial wealth is X_0 = 1. You must use the matrix technique (the one in the Absorption and Reward lecture with U, Q, etc.). Remember that for a, b, c, d in R with ad - bc != 0, we have

[ a b ]^{-1}        1      [  d  -b ]
[ c d ]      =  --------- [ -c   a ].
                 ad - bc

The states 0 and 3 are absorbing, and all the others are transient. Therefore C_1 = {0}, C_2 = {3} and T = {1, 2}. Writing q = 1 - p, the transition matrix P in the canonical form (the rows and columns represent the states in the order 0, 3, 1, 2) is

P =
[ 1 0 | 0 0 ]
[ 0 1 | 0 0 ]
[ q 0 | 0 p ]
[ 0 p | q 0 ]

so that

Q = [ 0 p ]     R = [ q 0 ]
    [ q 0 ],        [ 0 p ].

The matrix I - Q is a 2x2 matrix, so it is easy to invert:

(I - Q)^{-1} = 1/(1 - pq) [ 1 p ]
                          [ q 1 ].

So

U = (I - Q)^{-1} R = 1/(1 - pq) [ 1 p ] [ q 0 ]  =  1/(1 - pq) [ q   p^2 ]
                                [ q 1 ] [ 0 p ]                [ q^2  p  ].

Therefore, for the initial wealth 1, the probability of going bankrupt before getting rich is q/(1 - pq), where q = 1 - p.

Problem 4.44. Let {X_n}_{n in N_0} be a Markov chain with a transition matrix of the form

P =
[ p_11 p_12 0    ]
[ p_21 p_22 p_23 ]
[ 0    0    1    ],

with p_12, p_21, p_23 > 0. Suppose that the chain starts from the state 1.

1. What is the expected time that will pass before the chain first hits 3?
2. What is the expected number of visits to the state 2 before 3 is hit?
3. Would your answers to (1) and (2) change if we replaced the values in the first row of P by any other values (as long as P remains a stochastic matrix)? Would 1 and 2 still be transient states?
4. Use the idea of part (3) to answer the following question. What is the expected number of visits to the state 2 before a Markov chain with transition matrix

P' =
[ 7/10  .    .  ]
[  .    .    .  ]
[  .   4/5   .  ]

hits the state 3 for the first time (the initial state is still 1)? Remember this trick for the final.

The states 1 and 2 are transient and 3 is recurrent, so the canonical decomposition is {3} u {1, 2} and the canonical form of the transition matrix is

[ 1    | 0    0    ]
[ 0    | p_11 p_12 ]
[ p_23 | p_21 p_22 ].

The matrices Q and R are given by

Q = [ p_11 p_12 ]     R = [ 0    ]
    [ p_21 p_22 ],        [ p_23 ],

and the fundamental matrix is F = (I - Q)^{-1}.

1. The reward function g(1) = g(2) = 1, i.e., g = (1, 1)^T, will give us the expected time until absorption: v = F g. Since the initial state is i = 1, the expected time before we first hit 3 is v_1.

2. Here we use the reward function

g(i) = { 0, i = 1,
       { 1, i = 2,

to get v = F (0, 1)^T, so the answer is v_1 = F_12.

3. The way the question is posed, the answer is "Maybe". What I intended to ask, though, is the same question about the third row. In that case, the answer is "No, they would not": these numbers only affect what the chain does after it hits 3 for the first time, and that is irrelevant for calculations about the events which happen prior to that. And no: if the third row gave positive probability to 1 or 2, all states would be recurrent. The moral of the story is that the absorption calculations can be used even in settings where all states are recurrent. You simply need to adjust the probabilities, as shown in the following part of the problem.

4. We make the state 3 absorbing (and the states 1 and 2 transient) by keeping the first two rows of P' and replacing its third row with (0, 0, 1). The new chain behaves exactly like the old one until it hits 3 for the first time. Now we find the canonical decomposition, compute the matrix F = (I - Q)^{-1} as above, and the expected number of visits to 2 (with initial state 1) is F_12.

Problem 4.45. A math professor has 4 umbrellas. He keeps some of them at home and some in the office. Every morning, when he leaves home, he checks the weather and takes an umbrella with him if it rains. In case all the umbrellas are in the office, he gets wet. The same procedure is repeated in the afternoon when he leaves the office to go home. The professor lives in a tropical region, so the chance of rain in the afternoon is higher than in the morning; it is 1/5 in the afternoon and 1/10 in the morning. Whether it rains or not is independent of whether it rained the last time he checked. On day 0, there are 2 umbrellas at home, and 2 in the office.
What is the expected number of days that will pass before the professor gets wet (remember, there are two trips each day)? What is the probability that the first time he gets wet it is on his way home from the office?

We model the situation by a Markov chain whose state space S is given by

S = {(p, u) : p in {H, O}, u in {0, 1, 2, 3, 4, w}},

where the first coordinate denotes the current position of the professor and the second the number of umbrellas at home (then we automatically know how many umbrellas there are at the office). The second coordinate w stands for "wet", and the state (H, w) means that the professor left home without an umbrella during a rain (got wet). The transitions between the states are simple to figure out. For example, from the state (H, 2) we either move to (O, 2) (with probability 9/10) or to (O, 1) (with probability 1/10), and from (O, 4) we move to (O, w) with probability 1/5 and to (H, 4) with probability 4/5. The states (H, w) and (O, w) are made absorbing, and so all the other states are transient. The first question can be reformulated as a reward problem with reward g = 1, and the second one is about absorption probabilities.

We use Mathematica to solve it: the chain is built from the list of transitions with BuildChain, the transition matrix P = TransitionMatrix[...] is split into the blocks Q (transient-to-transient) and R (transient-to-absorbing), F = Inverse[IdentityMatrix[10] - Q] is computed, and F.R and F.G are evaluated at the initial state (2, H). The answers are: the absorption probabilities are approximately (0.011, 0.989), so the probability of getting wet for the first time on the way home from the office is about 99%; the expected number of trips before getting wet is about 19.57, i.e., about 10 days.

Problem 4.46. A zoologist, Dr.
Gurkensaft, claims to have trained Basil the Rat so that it can avoid being shocked and find food, even in highly confusing situations. Another scientist, Dr. Hasenpfeffer, does not agree. She says that Basil is stupid and cannot tell the difference between food and an electrical shocker until it gets very close to either of them. The two decide to see who is right by performing the following experiment. Basil is put in a compartment of a maze that looks like this:

[maze diagram: five numbered compartments connected by passages, with a Food chamber at one end and a Shock chamber at the other]

Dr. Gurkensaft's hypothesis is that, once in a compartment with k exits, Basil will prefer the exits

that lead him closer to the food. Dr. Hasenpfeffer's claim is that every time there are k exits from a compartment, Basil chooses each one with probability 1/k. After repeating the experiment 100 times, Basil got shocked before getting to food 52 times, and he reached food before being shocked 48 times. Compute the theoretical probabilities of being shocked before getting to food, under the assumption that Basil is stupid (all exits are equally likely). Compare those to the observed data. Whose side is the evidence on?

The task here is to calculate the absorption probabilities for a Markov chain which represents the maze. Using Mathematica, the transitions of the maze are assembled into the transition matrix P (states ordered 1, 2, 3, 4, 5, Food, Shock), the blocks Q (transitions among the compartments) and R (transitions into Food and Shock) are extracted, and F.R = (I - Q)^{-1} R is evaluated numerically. Starting from the initial compartment, the result is approximately (0.417, 0.583). Therefore, the probability of getting to food before being shocked is about 42%. This is somewhat lower than the observed 48%, and even though there may be some truth to Dr. Gurkensaft's claims, these two numbers are not very different.

Note: for those of you who know a bit of statistics, you can easily show that we cannot reject the null hypothesis that Basil is stupid (in the precise sense described in the problem), even at the 90% significance level. In fact, the (one-sided) p-value (using a binomial test) is a bit larger than 0.1. That means that a truly stupid rat would appear smarter than Basil 10% of the time by pure chance.

4.8 Markov chains - stationary and limiting distributions

Problem 4.47. Let {X_n}_{n in N_0} be a Markov chain with a 3x3 transition matrix P on the state space {1, 2, 3}.

1. Find all stationary distributions.

2. The chain starts from the state i = 1. What is the expected number of steps before it returns to 1?
3. How many times, on average, does the chain visit 2 between two consecutive visits to 1?
4. Each time the chain visits the state 1, $1 is added to an account, $2 for the state 2, and nothing in the state 3. Estimate the amount of money on the account after 10000 transitions. You may assume that the law of large numbers for Markov chains provides an adequate approximation.

1. Stationary distributions pi = (pi_1, pi_2, pi_3) satisfy pi P = pi. We also know that pi_1 + pi_2 + pi_3 = 1, and that the matrix P is stochastic. Therefore, one of the three equations in pi P = pi is a linear combination of the other two, and can be excluded from consideration (this is always the case in problems with stationary distributions). The remaining equations yield pi_2 = (1/2) pi_1 and pi_3 = (3/4) pi_1. It remains to find pi_1 such that pi_1 + pi_2 + pi_3 = 1, i.e., pi_1 (1 + 1/2 + 3/4) = 1, i.e., pi_1 = (9/4)^{-1} = 4/9. Therefore, pi = (4/9, 2/9, 3/9) is the only stationary distribution.

2. The expected number of steps between two consecutive returns to a state i (in an irreducible finite chain) is given by

E_i[tau_i(1)] = 1/pi_i.

Therefore, E_1[tau_1(1)] = 1/pi_1 = 9/4.

3. The expected number of visits to the state j between two consecutive visits to the state i (in an irreducible finite chain) is given by

E_i[ sum_{n=0}^{tau_i(1)-1} 1_{X_n = j} ] = pi_j / pi_i.

Therefore, our chain will visit the state 2 on average pi_2/pi_1 = 0.5 times between two consecutive visits to the state 1.

4. The chain in question is irreducible and finite, so the law of large numbers applies:

lim_{N -> infinity} (1/N) sum_{n=0}^{N-1} f(X_n) = sum_i f(i) pi_i.
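The law of large numbers just stated is easy to see in a quick simulation; a minimal Python sketch with a hypothetical two-state chain (the matrix below is an illustration, not the chain from the problem):

```python
import random

random.seed(1)

# A hypothetical two-state chain with P = [[1/2, 1/2], [1/4, 3/4]],
# whose stationary distribution is pi = (1/3, 2/3).
P = [[0.5, 0.5], [0.25, 0.75]]
f = [1.0, 0.0]            # reward: 1 in state 0, nothing in state 1

N = 200_000
x, total = 0, 0.0
for _ in range(N):
    total += f[x]
    x = 0 if random.random() < P[x][0] else 1

print(total / N)          # close to f(0) pi_0 + f(1) pi_1 = 1/3
```

The long-run average of the rewards settles near sum_i f(i) pi_i regardless of the starting state, which is exactly what part (4) exploits.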

In our case f(1) = 1, f(2) = 2 and f(3) = 0, so the amount of money M = sum_{n=0}^{9999} f(X_n) can be approximated as

M ~ 10000 ($1 x (4/9) + $2 x (2/9)) = 10000 x $(8/9) ~ $8,889.

Problem 4.48. Let {X_n}_{n in N_0} be a Markov chain whose transition graph is given below:

[transition graph on the states 1-7: the states 1, 2, 3 and the states 6, 7 form closed classes, while 4 and 5 are transient]

1. Find all stationary distributions.
2. Is there a limiting distribution?
3. If the chain starts from the state 5, what is the asymptotic proportion of time it will spend at state 7? How about the state 5?

1. The classes are C_1 = {1, 2, 3}, C_2 = {6, 7}, T_1 = {4} and T_2 = {5}. Any stationary distribution pi will have pi_4 = pi_5 = 0 (since 4 and 5 are transient states). The next step is to find the (unique) stationary distributions of C_1 and C_2 (viewed as Markov chains in themselves). The transition matrices P_C1 and P_C2 of C_1 and C_2 are given by

P_C1 =
[ 0   0.5 0.5 ]
[ 1   0   0   ]
[ 0.5 0.5 0   ],

P_C2 =
[ 0.5 0.5 ]
[ 1   0   ].

The unique distributions pi_C1, pi_C2 which solve the equations pi_C1 = pi_C1 P_C1 and pi_C2 = pi_C2 P_C2, respectively, are

pi_C1 = (4/9, 1/3, 2/9),  pi_C2 = (2/3, 1/3).

Therefore, the stationary distributions for P are exactly those of the form

pi = (alpha_1 (4/9), alpha_1 (1/3), alpha_1 (2/9), 0, 0, alpha_2 (2/3), alpha_2 (1/3)),

where alpha_1, alpha_2 >= 0 satisfy alpha_1 + alpha_2 = 1.
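Class-by-class stationary distributions like these can be computed exactly; an illustrative Python sketch (the notes use Mathematica) that solves pi P = pi with rational arithmetic, using the class matrices P_C1 and P_C2 as given in the solution:

```python
from fractions import Fraction

F = Fraction

def stationary(P):
    """Solve pi P = pi with sum(pi) = 1 exactly: build the balance
    equations from the columns of P - I and replace the (redundant)
    last one by the normalization equation."""
    n = len(P)
    A = [[P[j][i] - (F(1) if i == j else F(0)) for j in range(n)]
         for i in range(n)]
    A[-1] = [F(1)] * n
    b = [F(0)] * (n - 1) + [F(1)]
    for col in range(n):
        piv = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[piv], b[col], b[piv] = A[piv], A[col], b[piv], b[col]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col] / A[col][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return [b[i] / A[i][i] for i in range(n)]

P_C1 = [[F(0), F(1, 2), F(1, 2)],
        [F(1), F(0),    F(0)],
        [F(1, 2), F(1, 2), F(0)]]
P_C2 = [[F(1, 2), F(1, 2)],
        [F(1),    F(0)]]

print(stationary(P_C1))   # (4/9, 1/3, 2/9)
print(stationary(P_C2))   # (2/3, 1/3)
```

Any convex combination of the two class distributions (padded with zeros on the transient states) is then stationary for the full chain, as in the solution.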

2. No limiting distribution exists, because of the dependence on the initial conditions. To make the argument a bit more rigorous, suppose, to the contrary, that a limiting distribution pi = (pi_1, ..., pi_7) exists, i.e., pi_j = lim_n p^(n)_ij for all i, j = 1, 2, ..., 7. When we start in C_1, the chain will stay there, so p^(n)_11 + p^(n)_12 + p^(n)_13 = 1 for all n, and, consequently,

pi_1 + pi_2 + pi_3 = lim_n (p^(n)_11 + p^(n)_12 + p^(n)_13) = 1.

Similarly, if we start in C_2, we stay in C_2, so pi_6 + pi_7 = 1, by the same argument. This is a contradiction, because pi_1 + pi_2 + ... + pi_7 must be equal to 1, but we have proven that

pi_1 + pi_2 + ... + pi_7 >= pi_1 + pi_2 + pi_3 + pi_6 + pi_7 = 2.

3. When the chain starts from the state 5, it will be in C_2 from the next step on, and will stay in C_2 forever. The fact that the chain spent one unit of time in the state 5 will be negligible as n -> infinity. Therefore, we need to compute the asymptotic proportion of time that the chain with transition matrix P_C2 will spend in 7. This is given by pi_C2,7 = 1/3. As for the state 5 itself, the chain will spend only one unit of time there, so the proportion of time spent there will converge to 0.

Problem 4.49. An airline reservation system has two computers, only one of which is in operation at any given time. A computer may break down on any given day with probability p in (0, 1). There is a single repair facility which takes two days to restore a computer to normal. The facilities are such that only one computer at a time can be dealt with. Form a Markov chain that models the situation (make sure to keep track of the number of machines in operation as well as the status of the machine - if there is one - at the repair facility).

1. Find the stationary distribution.
2. What percentage of time (on average) are no machines operable?
3. What percentage of time (on average) is exactly one machine operable?

1. Each of the computers can be in one of the following 4 conditions: in operation, in the repair facility (2nd day of repair), in the repair facility (1st day of repair), or waiting to enter the repair facility.
Since there are two computers, each state of the Markov chain will be a quadruple of numbers denoting the number of computers in each condition. For example, (1, 0, 1, 0) means that there is one computer in operation and one which is spending its first day in the repair facility. If there are no computers in operation, the chain moves deterministically, but if 1 or 2 computers are in operation, they break down with probability p each, independently of each other. For example, if there are two computers in operation (the state of the system being (2, 0, 0, 0)), there are 3 possible scenarios: both computers remain in operation (that happens with probability (1-p)^2), exactly one computer breaks down (the probability of that is 2p(1-p)), and both computers break down (with probability p^2). In the first case, the chain stays in

the state (2, 0, 0, 0). In the second case, the chain moves to the state (1, 0, 1, 0), and in the third one, one of the computers enters the repair facility while the other spends a day waiting, which corresponds to the state (0, 0, 1, 1). All in all, here is the transition graph of the chain:

[transition graph on the five states (0,0,1,1), (0,1,0,1), (1,0,1,0), (1,1,0,0), (2,0,0,0), with edge probabilities p^2, 2p(1-p), (1-p)^2, p, 1-p and two deterministic transitions]

One can use Mathematica (or work by hand) to obtain the unique stationary distribution

pi = ( p^2 (1-p)^2,  p^4 - 3p^3 + 3p^2,  2p - p^2,  (1-p)(2p - p^2),  (1-p)^2 ) / (2p^4 - 4p^3 + p^2 + 2p + 1),

where the states are ordered as follows:

S = {(0, 0, 1, 1), (0, 1, 0, 1), (1, 0, 1, 0), (1, 1, 0, 0), (2, 0, 0, 0)}.

2. No machines are operable if the system is in (0, 0, 1, 1) or in (0, 1, 0, 1). Therefore, the asymptotic percentage of time no machines are operable is

pi_1 + pi_2 = p^2 (2p^2 - 5p + 4) / (2p^4 - 4p^3 + p^2 + 2p + 1).

3. Exactly one machine is operable in the states (1, 0, 1, 0) and (1, 1, 0, 0), so the required asymptotic percentage is

pi_3 + pi_4 = p (2 - p)^2 / (2p^4 - 4p^3 + p^2 + 2p + 1).
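The stationary distribution above can be sanity-checked for a concrete breakdown probability; an illustrative Python sketch with the hypothetical value p = 1/2 (any p in (0, 1) works the same way):

```python
from fractions import Fraction

p = Fraction(1, 2)        # hypothetical value, for checking only
q = 1 - p

# States ordered as in the solution:
# (0,0,1,1), (0,1,0,1), (1,0,1,0), (1,1,0,0), (2,0,0,0)
P = [
    [0,     1, 0,       0, 0    ],   # (0,0,1,1): 1st-day repair moves to 2nd day
    [0,     0, 1,       0, 0    ],   # (0,1,0,1): repair done, waiting one enters
    [0,     p, 0,       q, 0    ],   # (1,0,1,0): working one breaks (p) or not (q)
    [0,     0, p,       0, q    ],   # (1,1,0,0): repair done; working one breaks or not
    [p * p, 0, 2 * p * q, 0, q * q], # (2,0,0,0): 0, 1 or 2 breakdowns
]

D = 2 * p**4 - 4 * p**3 + p**2 + 2 * p + 1
pi = [p**2 * q**2 / D, (p**4 - 3 * p**3 + 3 * p**2) / D,
      (2 * p - p**2) / D, q * (2 * p - p**2) / D, q**2 / D]

assert sum(pi) == 1
assert all(sum(pi[i] * P[i][j] for i in range(5)) == pi[j] for j in range(5))
```

The two assertions confirm that pi is a probability vector and that it satisfies the balance equations pi P = pi exactly.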

Here is the Mathematica code used: the transitions listed above are assembled with BuildChain, P = TransitionMatrix[...] is formed, the one-dimensional null space v = NullSpace[Transpose[P] - IdentityMatrix[5]] is computed, and pi = Simplify[v/Total[v]] yields the stationary distribution displayed above.

Problem 4.50. Let {Y_n}_{n in N_0} be a sequence of die-rolls, i.e., a sequence of independent random variables, each uniform on {1, 2, 3, 4, 5, 6}. Let {X_n}_{n in N_0} be a stochastic process defined by

X_n = max(Y_0, Y_1, ..., Y_n).

In words, X_n is the maximal value rolled so far.

1. Explain why X is a Markov chain, and find its transition matrix and the initial distribution.
2. Supposing that the first roll of the die was 3, i.e., X_0 = 3, what is the expected time until a 6 is reached?
3. Under the same assumption as above (X_0 = 3), what is the probability that a 5 will not be rolled before a 6 is rolled for the first time?
4. Starting with the first value X_0 = 3, each time the die is rolled, the current record (the value of X_n) is written down. When a 6 is rolled for the first time, all the numbers are added up and the result is called S (the final 6 is not counted). What is the expected value of S?

(Hints: 1. You can eliminate 1 and 2 from the state space for parts (2)-(4). 2. Use the fact that

A^{-1} =
[ 1/a  r/(ab)  (rb + r^2)/(abc) ]
[ 0    1/b     r/(bc)           ]
[ 0    0       1/c              ]

when

A =
[ a  -r  -r ]
[ 0   b  -r ]
[ 0   0   c ],

for a, b, c, r > 0.)
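The inverse formula in the hint can be verified mechanically; an illustrative Python check with exact fractions (the sample values a = 1/2, b = 1/3, c = 1/6, r = 1/6 are the entries of I - Q that come up in the solution):

```python
from fractions import Fraction as F

def claimed_inverse(a, b, c, r):
    """The inverse given in the hint for A = [[a, -r, -r], [0, b, -r], [0, 0, c]]."""
    return [[F(1) / a, r / (a * b), (r * b + r * r) / (a * b * c)],
            [F(0),     F(1) / b,    r / (b * c)],
            [F(0),     F(0),        F(1) / c]]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

a, b, c, r = F(1, 2), F(1, 3), F(1, 6), F(1, 6)
A = [[a, -r, -r], [F(0), b, -r], [F(0), F(0), c]]
I3 = [[F(1) if i == j else F(0) for j in range(3)] for i in range(3)]

Fi = claimed_inverse(a, b, c, r)
assert matmul(A, Fi) == I3
# With these values, Fi works out to [[2, 1, 3], [0, 3, 3], [0, 0, 6]].
```

The assertion checks A A^{-1} = I exactly, so the hint can be applied with confidence in part (2) below.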

1. The value of X_{n+1} is either going to be equal to X_n, if Y_{n+1} happens to be less than or equal to it, or it moves up to Y_{n+1} otherwise, i.e., X_{n+1} = max(X_n, Y_{n+1}). Therefore, the distribution of X_{n+1} depends on the previous values X_0, X_1, ..., X_n only through X_n, and so {X_n}_{n in N_0} is a Markov chain on the state space S = {1, 2, 3, 4, 5, 6}. The initial distribution is that of X_0 = Y_0, i.e., uniform on S, and the transition matrix is given by

P = (1/6) x
[ 1 1 1 1 1 1 ]
[ 0 2 1 1 1 1 ]
[ 0 0 3 1 1 1 ]
[ 0 0 0 4 1 1 ]
[ 0 0 0 0 5 1 ]
[ 0 0 0 0 0 6 ]

2. As stated in the hint, we can restrict our attention to the Markov chain on the state space S' = {3, 4, 5, 6} with the transition matrix

P' = (1/6) x
[ 3 1 1 1 ]
[ 0 4 1 1 ]
[ 0 0 5 1 ]
[ 0 0 0 6 ]

The state 6 is absorbing (and therefore recurrent), while all the others are transient, so the canonical form of the matrix (with the canonical decomposition S' = {6} u {3, 4, 5}) is given by

(1/6) x
[ 6 | 0 0 0 ]
[ 1 | 3 1 1 ]
[ 1 | 0 4 1 ]
[ 1 | 0 0 5 ]

The fundamental matrix F = (I - Q)^{-1} is now given by

F =
[ 1/2 -1/6 -1/6 ]^{-1}   [ 2 1 3 ]
[ 0    1/3 -1/6 ]      = [ 0 3 3 ]
[ 0    0    1/6 ]        [ 0 0 6 ]

The expected time to absorption can be computed by using the reward 1 for each state:

F (1, 1, 1)^T = (6, 6, 6)^T.

Therefore it will take on average 6 rolls until the first 6. Note: you don't need Markov chains to solve this part; the expected time to absorption is nothing but the waiting time until the first 6 is rolled - a shifted geometrically distributed random variable with parameter 1/6. Therefore, its expectation is 1 + (5/6)/(1/6) = 6.
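The fundamental-matrix computation in part (2) can be reproduced exactly; an illustrative Python sketch inverting I - Q over the transient states {3, 4, 5} with rational arithmetic:

```python
from fractions import Fraction

F6 = Fraction(1, 6)
# Transition probabilities among the transient states 3, 4, 5.
Q = [[3 * F6, F6,     F6],
     [0,      4 * F6, F6],
     [0,      0,      5 * F6]]

n = 3
# Gauss-Jordan inversion of I - Q, augmented with the identity.
A = [[(1 if i == j else 0) - Q[i][j] for j in range(n)] +
     [Fraction(1) if i == k else Fraction(0) for k in range(n)]
     for i in range(n)]
for col in range(n):
    piv = A[col][col]
    A[col] = [x / piv for x in A[col]]
    for r in range(n):
        if r != col and A[r][col] != 0:
            f = A[r][col]
            A[r] = [x - f * y for x, y in zip(A[r], A[col])]

Fmat = [row[n:] for row in A]              # F = (I - Q)^{-1}
times = [sum(row) for row in Fmat]         # F (1, 1, 1)^T
print(times[0])                            # prints 6
```

The first entry of F (1, 1, 1)^T is the expected number of rolls, starting from the record 3, until the first 6.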

3. We make the state 5 absorbing, and the canonical decomposition changes to S' = {5} u {6} u {3, 4}. The canonical form of the transition matrix is

(1/6) x
[ 6 0 | 0 0 ]
[ 0 6 | 0 0 ]
[ 1 1 | 3 1 ]
[ 1 1 | 0 4 ]

and so

Q = [ 1/2 1/6 ]     R = [ 1/6 1/6 ]
    [ 0   2/3 ],        [ 1/6 1/6 ].

Therefore, the absorption probabilities are the entries of the matrix

(I - Q)^{-1} R = [ 1/2 1/2 ]
                 [ 1/2 1/2 ],

and the answer is 1/2. Note: you don't need Markov chains to solve this part either; the probability that 6 will be seen before 5 is the same as the probability that 5 will appear before 6. The situation is entirely symmetric. Therefore, the answer must be 1/2.

4. This problem is about expected reward, where the reward g(i) of the state i in {3, 4, 5} is g(i) = i. The answer is v_1, where

v = F (3, 4, 5)^T = (25, 27, 30)^T,

i.e., the expected value of the sum is 25. (Note: can you do this without the use of the Markov-chain theory?)

Problem 4.51. A car-insurance company classifies drivers in three categories: bad, neutral and good. The reclassification is done in January of each year, and the probabilities of transitions between the categories are given by a 3x3 stochastic matrix P, whose first row/column corresponds to the bad category, the second to neutral and the third to good (a bad driver cannot become good in a single year, so p_13 = 0).

1. The company started in January 1990 with 1400 drivers in each category. Estimate the number of drivers in each category in 2090. Assume that the total number of drivers does not change in time.
2. A yearly premium charged to a driver depends on his/her category; it is $1000 for bad drivers, $500 for neutral drivers and $200 for good drivers. A typical driver has a total of 56 years of driving experience. Estimate the total amount he/she will pay in insurance premiums in those 56 years. You may assume that 56 years is long enough for the limiting theorems to provide an accurate approximation.

1. Equal numbers of drivers in each category corresponds to the uniform initial distribution a^(0) = (1/3, 1/3, 1/3). The distribution of drivers in 2090 is given by the distribution a^(100) of X_100, a^(100) = a^(0) P^100, which can be approximated by the limiting distribution of the underlying Markov chain. The limiting distribution is always stationary (and it exists in the present case thanks to irreducibility and aperiodicity). The stationary distribution solves the linear system pi = pi P; we easily get pi = (4/14, 5/14, 5/14). Therefore, the total of 4200 drivers will be distributed approximately as

(4/14 x 4200, 5/14 x 4200, 5/14 x 4200) = (1200, 1500, 1500).

2. Let the function f : S -> R be given by f(1) = 1000, f(2) = 500 and f(3) = 200. The total amount paid in insurance premiums is sum_{k=0}^{55} f(X_k). By the law of large numbers for Markov chains, we have

sum_{k=0}^{55} f(X_k) = 56 x (1/56) sum_{k=0}^{55} f(X_k) ~ 56 (pi_1 f(1) + pi_2 f(2) + pi_3 f(3)) = 56 x (7500/14) = $30,000.

4.9 Markov chains - various multiple-choice problems

In each of the following multiple-choice problems, choose the answer that necessarily follows from the information given. There will be one and only one correct choice.

Problem 4.52. Let C be a class in a Markov chain. Then
(a) C is closed,
(b) C^c is closed,
(c) at least one state in C is recurrent,
(d) for all states i, j in C, p_ij > 0,
(e) none of the above.

The answer is (e).
(a) False. Take C = {(0, 0)} in the Tennis example.
(b) False. Take C = {Amelie wins} in the Tennis example.
(c) False. This is only true in finite chains. Take the Deterministically Monotone Markov Chain as an example; all of its states are transient.

(d) False. This would be true if it read "for each pair of states i, j in C, there exists n in N such that p^(n)_ij > 0". Otherwise, we can use the Tennis chain and the states i = (40, Adv) and j = (Adv, 40). They belong to the same class, but p_ij = 0 (you need to pass through (40, 40) to go from one to the other).
(e) True.

Problem 4.53. In a Markov chain whose state space has n elements (n in N),
(a) all classes are closed,
(b) at least one state is transient,
(c) not more than half of all states are transient,
(d) there are at most n classes,
(e) none of the above.

The answer is (d).
(a) False. In the Tennis example, there are classes that are not closed.
(b) False. Just take the Regime Switching chain with 0 < p_01, p_10 < 1; both of its states are recurrent. Or simply take a Markov chain with only one state (n = 1).
(c) False. In the Tennis example, 18 states are transient, but n = 20.
(d) True. Classes form a partition of the state space, and each class has at least one element. Therefore, there are at most n classes.
(e) False.

Problem 4.54. Let C_1 and C_2 be two (different) classes. Then
(a) i -> j or j -> i, for all i in C_1 and j in C_2,
(b) C_1 u C_2 is a class,
(c) if i -> j for some i in C_1 and j in C_2, then k -/-> l for all k in C_2 and l in C_1,
(d) if i -> j for some i in C_1 and j in C_2, then k -> l for some k in C_2 and l in C_1,
(e) none of the above.

The answer is (c).
(a) False. Take C_1 = {Amelie wins} and C_2 = {Bjorn wins}.
(b) False. Take the same example as above.

(c) True. Suppose, to the contrary, that k -> l for some k in C_2 and l in C_1. Then, since j and k are in the same class, we must have j -> k. Similarly, l -> i (l and i are in the same class). Using the transitivity property of the communication relation, we get j -> k -> l -> i, and so j -> i. By the assumption i -> j, and so i <-> j. This is a contradiction, however, with the assumption that i and j are in different classes.
(d) False. In the Tennis example, take i = (0, 0) and j = (15, 0), C_1 = {i} and C_2 = {j}; then i -> j, but no state of C_2 leads back to C_1.
(e) False.

Problem 4.55. Let i be a recurrent state with period 5, and let j be another state. Then
(a) if j -> i, then j is recurrent,
(b) if j -> i, then j has period 5,
(c) if i -> j, then j has period 5,
(d) if j -> i, then j is transient,
(e) none of the above.

The answer is (c).

[picture: a chain consisting of a recurrent 5-cycle together with an extra state 0, which stays put with probability 0.5 and enters the cycle with probability 0.5]

(a) False. Take j = 0 and i = 1 in the chain in the picture.
(b) False. Take the same counterexample as above.
(c) True. We know that i is recurrent, and since all recurrent classes are closed, and i -> j, j must belong to the same class as i. Period is a class property, so the period of j is also 5.
(d) False. Take i and j both inside the cycle in the chain in the picture; then j -> i, but j is recurrent.
(e) False.

Problem 4.56. Let i and j be two states such that i is transient and i -> j. Then
(a) sum_{n=1}^infinity p^(n)_jj = sum_{n=1}^infinity p^(n)_ii,
(b) if i -> k, then k is transient,
(c) if k -> i, then k is transient,
(d) the period of i must be 1,

(e) none of the above.

The answer is (c).

[picture: states 1, 2, 3, with p_11 = p_12 = 1/2, p_21 = p_23 = 1/2, and the state 3 absorbing]

(a) False. In the chain above, take i = 2 and j = 1. Clearly, i -> j, and both of them are transient. A theorem from the notes states that sum_n p^(n)_jj < infinity and sum_n p^(n)_ii < infinity, but these two sums do not need to be equal. Indeed, writing f_i for the probability of return to i, we have sum_n p^(n)_ii = f_i/(1 - f_i). Here f_2 = 1/2 (from 2, the chain either gets absorbed in 3 or moves to 1, from where it hits 2 with certainty), so sum_n p^(n)_ii = 1, while f_1 = 1/2 + (1/2)(1/2) = 3/4, so sum_n p^(n)_jj = 3. The two sums are different.
(b) False. Take i = 1 and k = 3 in the chain above; i -> 3, but 3 is absorbing and therefore recurrent.
(c) True. Suppose that k -> i, but k is recurrent. Since recurrent classes are closed, i must be in the same class as k. That would mean, however, that i is also recurrent. This is a contradiction with the assumption that i is transient.
(d) False. Take i = 1 in the modification of the example above in which p_11 = 0 and p_12 = p_13 = 1/2. The state 1 is still transient, but its period is 2.
(e) False.

Problem 4.57. Suppose there exists n in N such that P^n = I, where I is the identity matrix and P is the transition matrix of a finite-state-space Markov chain. Then
(a) P = I,
(b) all states belong to the same class,

(c) all states are recurrent,
(d) the period of each state is n,
(e) none of the above.

The answer is (c).
(a) False. Take the Regime-switching chain with

P = [ 0 1 ]
    [ 1 0 ].

Then P^2 = I, but P != I.
(b) False. If P = I, all states are absorbing, and, therefore, each is in a class of its own.
(c) True. By the assumption, P^{kn} = (P^n)^k = I^k = I for all k in N. Therefore p^(kn)_ii = 1 for all k in N, and so lim_m p^(m)_ii cannot be 0 (maybe it doesn't even exist). In any case, the series sum_m p^(m)_ii cannot be convergent, and so i is recurrent, for all i in S. Alternatively, the condition P^n = I means that the chain will be coming back to where it started - with certainty - every n steps, and so all states must be recurrent.
(d) False. Any chain satisfying P^n = I, but with the property that the n above is not unique, is a counterexample. For example, if P = I, then P^n = I for every n in N.
(e) False.

Problem 4.58. Let {X_n}_{n in N_0} be a Markov chain with state space Z. Then
1. {-X_n}_{n in N_0} is a Markov chain,
2. {-X_n}_{n in N_0} is not a Markov chain,
3. {|X_n|}_{n in N_0} is a Markov chain,
4. {|X_n|}_{n in N_0} is not a Markov chain,
5. none of the above.

Problem 4.59. Let X be a gambler's ruin chain with the upper barrier a > 1 and p > 1/2. Then
1. the probability of reaching a before reaching 0 is greater than 1/2,
2. the probability of (eventual) bankruptcy does not depend on a,
3. the number of recurrent classes depends on a,
4. the limiting distribution exists,
5. none of the above.

Problem 4.60. Let i be a recurrent state with period d, and let j be another state. Then

1. if j -> i, then j is recurrent,
2. if j -> i, then j has period d,
3. if i -> j, then j has period d,
4. if j -> i, then j is transient,
5. none of the above.

Problem 4.61. Let C_1 and C_2 be two (different) communication classes of a Markov chain. Then
1. i -> j or j -> i, for all i in C_1 and j in C_2,
2. C_1 u C_2 is a class,
3. if i -> j for some i in C_1 and j in C_2, then k -/-> l for all k in C_2 and l in C_1,
4. if i -> j for some i in C_1 and j in C_2, then k -> l for some k in C_2 and l in C_1,
5. none of the above.

Problem 4.62. Two containers are filled with 50 balls each. A container is selected at random (with equal probabilities) and a ball is taken from it and transferred to the other container. If the selected container is empty, nothing happens. The same procedure is then repeated indefinitely. A natural state space on which we model the situation is S = {0, 1, 2, ..., 100}, where the state i in S corresponds to i balls in the first container and 100 - i in the second. Then
1. all states have period 2,
2. the uniform distribution on S is a stationary distribution,
3. all states are positive recurrent,
4. the number of transient states would change if we started from the state i = 49,
5. none of the above.

Problem 4.63. Let {X_n}_{n in N_0} be a finite-state Markov chain with one transient and two recurrent classes. Then
1. stationary distributions may not exist,
2. the limiting distribution does not exist,
3. the limiting distribution may or may not exist, depending on the particular chain,
4. the number of states is at least 4,
5. none of the above.