6 Scalar, Stochastic, Discrete Dynamic Systems




Consider modeling a population of sand-hill cranes in year n by the first-order, deterministic recurrence equation

    y(n + 1) = R y(n)    where    R = 1 + r = 1 + b − d.

In this expression, the growth rate r is the difference between the birth rate b and the death rate d. If the recurrence above describes something other than a population (say, an amount of money), then negative values of R are meaningful as well (debit versus credit), and we saw in class that different values of R yield a relatively rich variety of possible responses. However, the output y(n) in the sand-hill crane example has to be interpreted in any case as the average number of birds, rather than the actual number, if for no other reason than that y(n) is not necessarily an integer when R is a real number.

A deeper reason for thinking of y(n) as an average is that the exact number of cranes is hard or impossible to predict. Consider the experiment of observing a population of sand-hill cranes over, say, N = 10 years from an initial population y_0. Repeating the experiment would entail reproducing the same environmental conditions, restarting the population at y_0, and observing its evolution over another ten years. Different experiments are bound to produce different time sequences y(0), y(1), ..., although y(0) in particular would be the same in all experiments. Discrepancies are caused by uncontrolled variations in the environment, by the health and genetic makeup of the specific initial population, and by other unpredictable factors such as whether a particular alligator was or was not able to capture and kill a particular bird on day 37 of year 4.

In summary, a bird population is a stochastic quantity, that is, a quantity whose exact variations defy detailed modeling and are instead described in an aggregate sense. Another example of a stochastic quantity is the outcome of the roll of a die.
A classical, mechanistic view of physics would have posited that it is possible in principle to know enough about the circumstances in which a die is cast to predict the outcome. In practice this is unrealistic, and in more modern views of physics utterly impossible. An aggregate description is more feasible, and would state that the probability of any of the six possible outcomes is the same for a fair die. In one interpretation, this means that if a fair die is rolled K times, then each of the possible outcome values 1 through 6 is likely [13] to occur about K/6 times, and the approximation improves indefinitely as K increases.

A less detailed description would state that the average outcome of the roll of a die is 3.5. In one interpretation, this means that if a fair die is rolled K times with outcomes o_1, ..., o_K, then the quantity

    $\frac{1}{K} \sum_{k=1}^{K} o_k$

is likely to become closer and closer to 3.5 as K increases.

[13] The astute reader will have noticed that the expression "is likely" itself reeks of probability. This observation is correct, and can be made precise.
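Both interpretations of the fair die can be checked numerically. The following is a minimal sketch, not part of the original notes: it simulates rolls of a fair die with the Matlab function rand (discussed in more detail later in this section), prints the average of the first K outcomes for increasing K, and tallies how often each face came up. The variable names and the choice of K values are illustrative assumptions.

```matlab
% Simulate rolls of a fair die and watch the running average approach 3.5.
K = 100000;                            % total number of rolls (illustrative choice)
rolls = ceil(6 * rand(K, 1));          % each entry is an integer from 1 to 6
runningMean = cumsum(rolls) ./ (1:K)'; % average of the first k rolls, for k = 1..K
for k = [10 100 1000 10000 100000]
    fprintf('K = %6d   average = %.4f\n', k, runningMean(k));
end
counts = hist(rolls, 1:6);             % number of times each face came up
disp(counts / K)                       % each fraction should be close to 1/6
```

The printed averages vary from run to run, because no seed is set, but they should settle near 3.5 as K grows.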

Similarly, a growth coefficient of R could be interpreted by stating that if the experiment mentioned above were repeated K times, and the ratios, say,

    R_k = [y(1)/y(0)] in experiment k

were computed from empirical observations over all experiments, then one would obtain

    $\frac{1}{K} \sum_{k=1}^{K} R_k \approx R$

for a large enough K. [14]

The average outcome y(n) from a recurrence based on the average growth coefficient R is incomplete information, just as the statement that a roll of a die yields 3.5 on average is incomplete. Much more detailed information could be obtained if the recurrence under study were to also model the stochastic variations from experiment to experiment. This greater amount of information is both a curse and a blessing. It is a curse in that running such a recurrence once through a sequence of years would only provide information about that particular sequence, and would therefore be of limited predictive value. It is a blessing in that a recurrence can be run multiple times through a sequence of years, and by doing so one can compute aggregate information that includes but is not limited to the average behavior of the system.

A recurrence that includes stochastic behavior is called a stochastic dynamic system. Such a recurrence requires some mechanism for generating random outcomes. Even once such a mechanism is available, there is a wide choice of options for how to inject randomness into a recurrence. We start next with a conceptually straightforward method, for motivation and intuition. Some preliminaries on probability theory follow, and a more quantitative treatment is presented thereafter, together with some alternative options for injecting randomness.

Russian Roulette

The simplest and most direct way of thinking about randomness in the sand-hill crane example is to flip a coin every year n, once for each of the y(n) birds in turn: heads means that the bird survives, tails means that it dies.
This only makes sense for a population that does not grow (R ≤ 1). In other words, we assume a zero birth rate b for now, and model 1 − d only, rather than 1 + b − d. Births will be handled in the next Section. With a fair coin, each bird has a 50-50 chance of survival, so R = 0.5. For different values of R between zero and one, we can think of Russian roulette, in which the revolver's cylinder is spun anew before each bird is... visited. By varying the number of chambers and bullets, one can achieve any desired probability p that a bird survives. The resulting recurrence is as follows:

    y(n) = number of survivals when simulating Russian roulette y(n − 1) times.

Although this may not look much like a recurrence, it is, because y(n) is a (stochastic) function of y(n − 1).

[14] The fact that a similar result would be obtained for ratios y(n)/y(n − 1) for other values of n is a bit of a coincidence, and corresponds to the fact that the growth coefficient R is assumed to be the same regardless of population size or time (that is, it is independent of both y(n) and n).
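The link between the Russian-roulette recurrence and the deterministic one can be made explicit with a short calculation. This is a sketch under the assumption that each bird survives independently of the others with probability p; the notation E[·], meaning the average over many repetitions of the experiment, is not used in the notes up to this point. Given y(n − 1) birds, the expected number of survivors is p y(n − 1), so the averages obey the deterministic recurrence with R = p:

```latex
\begin{align*}
E[\,y(n) \mid y(n-1)\,] &= p\, y(n-1), \\
E[\,y(n)\,] &= p\, E[\,y(n-1)\,] = p^{n}\, y(0).
\end{align*}
```

For example, with p = 0.5 and y(0) = 20, the expected population is 10 after one year and 5 after two years, even though any single run of the stochastic recurrence produces only whole numbers of birds.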

In Matlab, we can generate what is called a pseudo-random number between zero and one with the instruction

    rand

Pseudo-random means that the sequence of numbers produced by repeated calls to rand is deterministic but hard to predict. If you restart Matlab, or, more conveniently, you call

    rand('seed', 0)

then you will restart the pseudo-random generator, and obtain exactly the same sequence of numbers you obtained before the restart. So randomness is only apparent. A good random generator (Matlab's is good) will produce sequences with good statistical properties. One of these properties for rand is that the numbers being generated are uniformly distributed between 0 and 1. This notion will be made more precise, but it roughly means that all 64-bit binary numbers between 0 and 1 are equally likely, a bit like a fair die with 2^64 faces, if you can visualize this.

You can then simulate Russian roulette with a probability p of survival (0 ≤ p ≤ 1) with the instruction

    survival = rand <= p;

The variable survival is set to 1 if the outcome of rand is at most p, and to 0 otherwise. Because the numbers generated by rand are uniformly distributed between 0 and 1, the comparison succeeds approximately a fraction p of the time, if the experiment is attempted sufficiently often. Thus, survival contains what is called a random binary outcome ("binary" because two alternatives are possible, either 0 or 1). If you need an a × b array of random binary outcomes (such as a column vector, in which case b is 1), you would type

    survival = rand(a, b) <= p;

Here, then, is the Matlab code for one iteration of Russian roulette:

    y = sum(rand(y, 1) <= p);

The sum counts the number of ones (i.e., survivals) in the binary vector of length y that results from the comparison rand(y, 1) <= p. Please make sure you understand this line of code, perhaps by typing out pieces of it in Matlab.
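One way to take that line apart is sketched below. This is not part of the original notes: the intermediate variable names (u, survival, ySurvivors) and the particular values of y and p are illustrative assumptions. The second half checks empirically that the comparison succeeds in about a fraction p of the attempts.

```matlab
% Dissect one iteration of Russian roulette, y = sum(rand(y, 1) <= p).
y = 8;                        % current number of birds (illustrative value)
p = 0.4;                      % survival probability (illustrative value)
u = rand(y, 1);               % one uniform number per bird, as a column vector
survival = u <= p;            % binary vector: 1 where the bird survives
ySurvivors = sum(survival);   % number of ones, that is, number of survivors

% Check that the comparison succeeds in roughly a fraction p of the attempts.
attempts = 100000;
fractionSurvived = sum(rand(attempts, 1) <= p) / attempts;
fprintf('%d of %d birds survive; success fraction over %d attempts: %.4f\n', ...
        ySurvivors, y, attempts, fractionSurvived);
```

Displaying u and survival side by side, for instance with disp([u survival]), shows which birds survived in a particular run.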
A full implementation of the recurrence might look like this:

    function y = roulette(y, p)
    % Appends successive years of Russian roulette to y until no bird is left
    if p < 0 || p >= 1
        error('p must be at least 0 and less than 1')
    end
    while y(end) > 0
        y(end + 1) = sum(rand(y(end), 1) <= p);
    end

(What would happen with the call roulette(1, 1) if the error check were not present?) Here are a few sample runs:

    >> rand('seed', 0)
    >> roulette(20, 0.4)
    ans =
        20     8     2     1     1     0
    >> roulette(20, 0.4)
    ans =
        20     8     2     1     1     0
    >> roulette(20, 0.4)
    ans =
        20     6     3     0
    >> roulette(20, 0.7)
    ans =
        20    15    11     9     7     6     5     4     3     3     2     1     0

Note both the (pseudo-)randomness and the variable length of the output, even for the same value of the survival probability p. If you have the same version of Matlab that I have, you ought to get exactly the same numbers from the instructions above.

An interesting question about the random binary outcomes is the following. Every time we execute the instruction

    y = sum(rand(m, 1) <= p);

with, say, m = 30 and p = 0.6, we obtain a different number y. All such numbers are between 0 and 30, but are all outcomes equally likely? One way to find out is to run the experiment, say, 10,000 times, and then tally the frequency of each of the 31 possible outcomes, that is, the number of times each number came up, divided by 10,000. The resulting plot is shown in Figure 10.

[Figure 10 (plot of Frequency against Outcome, titled p = 0.6, m = 30): Frequency of occurrence of each outcome between 0 and 30 in 10,000 trials of Russian roulette with p = 0.6 and m = 30. The values of this plot add up to 1.]

As expected, most outcomes are greater than 15, because the probability of survival is 0.6, so we can expect more than half of the 30 birds to survive. The peak of the plot would move to the left for smaller values of p. The plot in Figure 10 is called a frequency distribution of the outcomes. The bell-shaped distribution in this Figure is approximately what is called a binomial distribution with parameters m = 30 and p = 0.6. The distribution would almost certainly be exactly binomial if one could run infinitely many trials, rather than just 10,000.

Because the binomial distribution is of general usefulness, it is convenient to encapsulate the code that produces binomial values in a function:

    % Returns a row vector of n samples out of a binomial distribution with
    % parameters m (a natural number) and p (a real number between 0 and 1)
    function y = binomial(m, p, n)
    if p < 0 || p > 1
        error('p must be between 0 and 1')
    end
    if nargin < 3 || isempty(n)
        n = 1;
    end
    % Sum down each column explicitly, so that m = 1 still yields n samples
    y = sum(rand(m, n) <= p, 1);

Here is the Matlab code that produced Figure 10:

    m = 30;
    p = 0.6;
    trials = 10000;
    y = binomial(m, p, trials);
    h = hist(y, 0:m) / trials;
    values = 0:m;
    clf
    plot(values, h, '.', 'MarkerSize', 14)
    hold on
    for k = values + 1
        plot(values(k) * [1 1], [0 h(k)])
    end
    xlabel('Outcome')
    ylabel('Frequency')
    title(sprintf('%d trials with p = %g, m = %d', trials, p, m))

The plotting commands are a bit complicated, and the for loop merely draws the vertical lines in the plot. The core of this code is the two lines that compute y and h. Each of the columns in the binary matrix rand(m, trials) <= p represents one trial set of 30 live/die outcomes, and the sum adds up the ones in each column, thereby computing the number of survivals in each of the 10,000 trials. The resulting vector y has 10,000 entries (all between 0 and 30), and hist(y, 0:m) tallies into a 31-dimensional vector the number of times that y contains a value of 0, 1, ..., 30 respectively. The sum of all the tallies is 10,000, so division by trials turns counts into fractions, that is, into frequencies of occurrence.

Births

Of course, random numbers can also be used to simulate births. The literature on sand-hill cranes [15] reports that the average annual production for any adult is p = 0.35 young per year. So we need to come up with a random number generator that produces on average p y(n) positive outcomes if y(n) birds are currently alive. [16] We could just use the Russian roulette mechanism with p = 0.35, but with a different interpretation: survival is replaced by reproduction, and the random number is added to the population.

However, this choice is not very satisfactory from a conceptual standpoint: using a binomial distribution with parameters y(n) and p would imply that it is impossible for y(n) birds to give birth to more than y(n) young in any one year. In practice, because of the small value of p, the probability of this happening is very small, so the conceptual difficulty is merely theoretical. For more prolific species, however, this would pose difficulties. For instance, hares typically produce more than two litters per pair, so a bound of y(n) on the number of births in year n would be unacceptable.

To address this difficulty, we use a different distribution for births, called a Poisson distribution. This is closely related to the binomial distribution through a limiting process illustrated in Figure 11. A vector of N integers that are distributed according to a Poisson distribution can be generated in various ways. For our purposes, the limiting argument above suggests a simple approximation: given a value λ = p y(n), generate a binomial distribution with parameters m = Mλ and p = λ/m for some multiplier factor M (say, M = 3).
This will add enough of a right tail to the binomial to approximate a Poisson distribution for any practical purpose:

    % Returns n samples out of an approximately Poisson distribution with
    % parameter lambda
    function y = poisson(lambda, n)
    if nargin < 2 || isempty(n)
        n = 1;
    end
    M = 3;
    m = ceil(lambda * M);
    p = lambda / m;
    y = binomial(m, p, n);

[15] http://bna.birds.cornell.edu/bna/account/sandhill Crane/DEMOGRAPHY AND POPULATIONS.html

[16] For simplicity, we ignore the fact that birds up to three years of age do not reproduce.
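How close the approximation is depends on the multiplier M. A binomial distribution with parameters m and p has mean mp and variance mp(1 − p), while a Poisson distribution with parameter λ has mean and variance both equal to λ. The sketch below, which is not part of the original notes, draws samples in the same way as the function poisson above and compares the sample mean and variance for two multipliers. The value λ = 18, the number of samples, and the second multiplier are illustrative assumptions.

```matlab
% Compare the binomial-based approximation with the Poisson mean and variance.
lambda = 18;                 % Poisson parameter (illustrative)
n = 10000;                   % number of samples per multiplier (illustrative)
Ms = [3 30];                 % multipliers to compare; 30 is a hypothetical alternative
sampleMean = zeros(size(Ms));
sampleVar = zeros(size(Ms));
for i = 1:numel(Ms)
    m = ceil(lambda * Ms(i));        % number of binomial trials
    p = lambda / m;                  % success probability, so that m * p = lambda
    s = sum(rand(m, n) <= p, 1);     % n binomial samples, one per column
    sampleMean(i) = mean(s);
    sampleVar(i) = var(s);
    fprintf('M = %2d: mean %.2f, variance %.2f (binomial: %.2f, Poisson: %.2f)\n', ...
            Ms(i), sampleMean(i), sampleVar(i), lambda * (1 - p), lambda);
end
```

With M = 3 the success probability is 1/3, so the mean equals λ but the variance is about two thirds of λ. A larger multiplier brings the variance closer to λ, at the cost of generating more random numbers per sample.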

[Figure 11, four panels plotting Frequency against Outcome: (a) p = 0.6, m = 30; (b) p = 0.6, m = 60; (c) p = 0.3, m = 60; (d) λ = 18.]

Figure 11: First steps from a binomial distribution to a Poisson distribution. When doubling the number m of individuals from 30 (a) to 60 (b) while keeping the parameter p constant, the peak of the binomial distribution doubles from 30p = 18 to 60p = 36. To keep the peak in the same place (18), halve the parameter p from 0.6 (b) to 0.3 (c). If this is done indefinitely, the binomial distribution (a) tends to a Poisson distribution (d) with parameter λ = 30 · 0.6 = 60 · 0.3 = ... = 18. This distribution is well defined (and nonzero) for every nonnegative integer. Of course, it becomes very small for large values of the outcome, and only a finite number of values can be plotted.
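The limiting process of Figure 11 can also be written out. What follows is the standard derivation, given as a sketch in notation that does not appear in the notes: P_m(k) denotes the probability of outcome k under a binomial distribution with parameters m and p = λ/m.

```latex
\begin{align*}
P_m(k) &= \binom{m}{k} \left(\frac{\lambda}{m}\right)^{k} \left(1 - \frac{\lambda}{m}\right)^{m-k} \\
       &= \frac{\lambda^{k}}{k!} \cdot \frac{m(m-1)\cdots(m-k+1)}{m^{k}}
          \cdot \left(1 - \frac{\lambda}{m}\right)^{m} \left(1 - \frac{\lambda}{m}\right)^{-k} \\
       &\longrightarrow \frac{\lambda^{k}}{k!}\, e^{-\lambda}
          \qquad \text{as } m \to \infty \text{ with } k \text{ and } \lambda \text{ fixed.}
\end{align*}
```

In the second line, the middle fraction and the last factor both tend to 1, while the factor (1 − λ/m)^m tends to e^{−λ}. The limit is positive for every nonnegative integer k, in agreement with the caption of Figure 11.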

We are now ready to rewrite the recurrence for the sand-hill crane in order to account for both births and deaths:

    y(n) = number of survivals out of a binomial distribution with m = y(n − 1) and p = 1 − d
         + number of births out of a Poisson distribution with λ = b y(n − 1).        (26)

In Matlab:

    function y = crane(birth, death, y, N)
    if birth < 0 || birth > 1 || death < 0 || death > 1
        error('birth and death rates must be between 0 and 1')
    end
    for n = 2:N
        if y(end) == 0
            break;
        end
        y(end + 1) = binomial(y(end), 1 - death) + poisson(birth * y(end));
    end

The results of several trials with different rates are shown in Figure 12.

[Figure 12, two panels plotting Population against Year: birth rate 0.1, death rate 0.15 (left); birth rate 0.1, death rate 0.08 (right).]

Figure 12: Each plot shows twenty trials of a stochastic simulation of the crane population, one plot for each of two combinations of birth and death rate values.

The general question we can ask of the plots thus produced is, What is the distribution of the values of y(n) at each time n? A restricted question, from which all of this Section started, would be, What is the average value of y(n) at each time n? The next obvious question seeks more detail: What is the spread of the values of y(n) at each time n? We first peek at the answers in Figure 13. The next Section introduces some of the theory on which these questions and their answers are based.

[Figure 13, two panels plotting Population against Year: birth rate 0.1, death rate 0.15 (left); birth rate 0.1, death rate 0.08 (right).]

Figure 13: Means (curves) and standard deviations (bars) of the values in the two sets of plots in Figure 12.
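A sketch of how such means and standard deviations could be computed follows. It is not the code that produced Figure 13: it inlines the binomial and approximate Poisson sampling described earlier so that it runs on its own, and the initial population of 100, the 30 simulated years, and the number of trials are assumptions chosen for illustration. The last lines compare the sample mean with the deterministic recurrence y(n + 1) = (1 + b − d) y(n).

```matlab
% Mean and spread of the stochastic crane recurrence over many trials.
birth = 0.1;  death = 0.15;       % rates from the first panel of Figure 12
y0 = 100;  N = 30;  trials = 200; % illustrative assumptions
M = 3;                            % multiplier of the Poisson approximation
Y = zeros(trials, N);             % one row per trial, one column per year
Y(:, 1) = y0;
for t = 1:trials
    for n = 2:N
        prev = Y(t, n - 1);
        if prev == 0
            break;                % extinct: the remaining years stay at zero
        end
        survivors = sum(rand(prev, 1) <= 1 - death);   % binomial deaths
        lambda = birth * prev;
        m = ceil(lambda * M);
        births = sum(rand(m, 1) <= lambda / m);        % approximately Poisson
        Y(t, n) = survivors + births;
    end
end
meanY = mean(Y, 1);               % average population in each year
stdY = std(Y, 0, 1);              % spread of the population in each year
R = 1 + birth - death;
predicted = y0 * R .^ (0:N-1);    % deterministic recurrence, for comparison
fprintf('Last year: mean %.1f, std %.1f, deterministic %.1f\n', ...
        meanY(N), stdY(N), predicted(N));
```

A plot similar in spirit to Figure 13 could then be drawn with errorbar(0:N-1, meanY, stdY). The sample mean should track the deterministic curve closely, which is the sense in which y(n) in the deterministic recurrence is an average.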
