Generating random numbers

Generating random numbers Lecturer: Dmitri A. Moltchanov E-mail: moltchan@cs.tut.fi http://www.cs.tut.fi/kurssit/elt-53606/

OUTLINE:
- Why do we need random numbers;
- Basic steps in generation;
- Uniformly distributed random numbers;
- Statistical tests for uniform random numbers;
- Random numbers with arbitrary distributions;
- Statistical tests for random numbers with arbitrary distribution;
- Multidimensional distributions.

1. The need for random numbers

Examples of randomness in telecommunications:
- interarrival times between arrivals of packets, tasks, etc.;
- service time of packets, tasks, etc.;
- time between failures of various components;
- repair time of various components;
- ...

Importance for simulations:
- random events are characterized by distributions;
- simulations: we cannot use a distribution directly, we need numbers drawn from it.

For example, M/M/1 queuing system:
- arrival process: exponential distribution with mean $1/\lambda$;
- service times: exponential distribution with mean $1/\mu$.

Discrete-event simulation of M/M/1 queue

INITIALIZATION
    time := 0; queue := 0; sum := 0; throughput := 0;
    generate first interarrival time;

MAIN PROGRAM
    while time < runlength do
        case nextevent of
            arrival event:
                time := arrivaltime;
                add customer to a queue;
                start new service if the service is idle;
                generate next interarrival time;
            departure event:
                time := departuretime;
                throughput := throughput + 1;
                remove customer from a queue;
                if (queue not empty)
                    sum := sum + waiting time;
                    start new service;

OUTPUT
    mean waiting time = sum / throughput
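The pseudocode above can be sketched in Python as follows. This is a minimal sketch, not the lecturer's implementation: the function name `simulate_mm1`, its parameters, and the use of `random.expovariate` for the exponential interarrival and service times are assumptions made for illustration.

```python
import random
from collections import deque

def simulate_mm1(lam, mu, runlength, seed=1):
    """Event-driven M/M/1 sketch; returns the mean waiting time in the queue."""
    rng = random.Random(seed)
    time = 0.0
    waiting = deque()            # arrival times of customers waiting for service
    server_busy = False
    total_wait = 0.0             # "sum" in the pseudocode
    served = 0                   # "throughput" in the pseudocode
    next_arrival = rng.expovariate(lam)
    next_departure = float("inf")

    while time < runlength:
        if next_arrival <= next_departure:       # arrival event
            time = next_arrival
            if server_busy:
                waiting.append(time)             # customer joins the queue
            else:
                server_busy = True               # served at once, zero wait
                next_departure = time + rng.expovariate(mu)
            next_arrival = time + rng.expovariate(lam)
        else:                                    # departure event
            time = next_departure
            served += 1
            if waiting:
                total_wait += time - waiting.popleft()
                next_departure = time + rng.expovariate(mu)
            else:
                server_busy = False
                next_departure = float("inf")
    return total_wait / served if served else 0.0
```

For $\lambda = 0.5$ and $\mu = 1$ the estimate can be compared with the known M/M/1 mean waiting time $\lambda / (\mu(\mu - \lambda)) = 1$.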

2. General notes

General approach nowadays:
- transforming one random variable to another one;
- a uniform distribution is often used as the reference distribution.

Note the following:
- most simulators contain a generator of uniformly distributed numbers in the interval (0, 1);
- they may not contain a generator for the arbitrarily distributed random numbers you want.

The procedure is to:
- generate a random number with uniform distribution between $a$ and $b$, $b \gg a$;
- transform it to a random number with uniform distribution on (0, 1);
- transform it to a random number with the desired distribution.

2.1. Pseudo random numbers

All computer-generated numbers are pseudo-random:
- we know the method by which they are generated;
- we can predict any random sequence in advance.

The goal is then to imitate random sequences as well as possible.

Requirements for generators:
- must be fast;
- must have low complexity;
- must have sufficiently long cycles;
- must allow generating repeatable sequences;
- generated numbers must be independent;
- generated numbers must closely follow a given distribution.

2.2. Step 1: uniform random numbers in (a, b)

Basic approach:
- generate random number with uniform distribution on (a, b);
- transform these random numbers to (0, 1);
- transform it to a random number with the desired distribution.

Uniform generators:
- old methods: mostly based on radioactivity;
- Von Neumann's algorithm;
- congruential methods.

Basic approach: the next number is some function of the previous one,

$$\gamma_{i+1} = F(\gamma_i), \quad i = 0, 1, \dots, \tag{1}$$

- this is a recurrence relation of the first order;
- $\gamma_0$ is known and directly computed from the seed.

2.3. Step 2: transforming to random numbers in (0, 1)

Basic approach:
- generate random number with uniform distribution on (a, b);
- transform these random numbers to (0, 1);
- transform it to a random number with the desired distribution.

The uniform U(0, 1) distribution has the following pdf:

$$f(x) = \begin{cases} 1, & 0 \le x \le 1, \\ 0, & \text{otherwise}. \end{cases} \tag{2}$$

Mean and variance are given by:

$$E[X] = \int_0^1 x\,dx = \left.\frac{x^2}{2}\right|_0^1 = \frac{1}{2}, \qquad \sigma^2[X] = \frac{1}{12}. \tag{3}$$

How to get U(0, 1): by rescaling from U(0, m) as follows:

$$y_i = \gamma_i / m, \tag{4}$$

where $m$ is the modulus in the linear congruential algorithm.

What we get:
- something like: 0.12, 0.67, 0.94, 0.04, 0.65, 0.20, ...;
- a sequence that appears to be random.

2.4. Step 3: non-uniform random numbers

Basic approach:
- generate random number with uniform distribution on (a, b);
- transform these random numbers to (0, 1);
- transform it to a random number with the desired distribution.

If we have a U(0, 1) generator, the following techniques are available:
- discretization: Bernoulli, binomial, Poisson, geometric;
- rescaling: uniform;
- inverse transform: exponential;
- specific transforms: normal;
- rejection method: universal method;
- reduction method: Erlang, binomial;
- composition method: for complex distributions.

3. Uniformly distributed random numbers

The generator is fully characterized by $(S, s_0, f, U, g)$:
- $S$ is a finite set of states;
- $s_0 \in S$ is the initial state;
- $f: S \to S$ is the transition function;
- $U$ is a finite set of output values;
- $g: S \to U$ is the output function.

The algorithm is then:
- let $u_0 = g(s_0)$;
- for $i = 1, 2, \dots$ do the following recursion: $s_i = f(s_{i-1})$; $u_i = g(s_i)$.

Note: the functions $f(\cdot)$ and $g(\cdot)$ heavily influence the quality of the generator.

Figure 1: Example of the operations of random number generator: the user chooses the seed $s_0$; the states follow $s_i = f(s_{i-1})$ and the outputs are $u_i = g(s_i)$, shown for $i = 0, \dots, 4$.

Here $s_0$ is a random seed:
- allows repeating the whole sequence;
- allows manually ensuring that you get a different sequence.

3.1. Von Neumann's generator

The basic procedure:
- start with some number $u_0$ of a certain length $x$ (say, $x = 4$ digits; this is the seed);
- square the number;
- take the middle 4 digits to get $u_1$;
- repeat...
- example: with seed 1234 we get 1234, 5227, 3215, 3362, 3030, etc.

Shortcomings:
- sensitive to the random seed: seed 2345 gives 2345, 4990, 9001, 180, 324, 1049, 1004, 80, 64, 40, ... (all further values stay below 100);
- may have a very short period: seed 2100 gives 2100, 4100, 8100, 6100, 2100, 4100, 8100, ... (period = 4 numbers).

To generate U(0, 1): divide each obtained number by $10^x$ ($x$ is the length of $u_0$).

Note: this generator is also known as the midsquare generator.
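The midsquare procedure can be sketched as below. The function name `midsquare` and its `digits` parameter are illustrative assumptions; the square is zero-padded to twice the seed length so that "the middle digits" are well defined.

```python
def midsquare(seed, count, digits=4):
    """Return [seed, u_1, ..., u_count] produced by the midsquare method."""
    values = [seed]
    u = seed
    for _ in range(count):
        squared = str(u * u).zfill(2 * digits)   # pad the square to 2*digits digits
        start = digits // 2
        u = int(squared[start:start + digits])   # keep the middle `digits` digits
        values.append(u)
    return values

print(midsquare(1234, 4))   # [1234, 5227, 3215, 3362, 3030]
print(midsquare(2100, 4))   # [2100, 4100, 8100, 6100, 2100]
```

Dividing each value by `10 ** digits` gives numbers in (0, 1).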

3.2. Congruential methods

There are a number of versions:
- additive congruential method;
- multiplicative congruential method;
- linear congruential method;
- Tausworthe binary generator.

General congruential generator:

$$u_{i+1} = f(u_i, u_{i-1}, \dots) \bmod m, \tag{5}$$

where $u_i, u_{i-1}, \dots$ are past numbers.

For example, the quadratic congruential generator:

$$u_{i+1} = (a_1 u_i^2 + a_2 u_{i-1} + c) \bmod m. \tag{6}$$

Note: if here $a_1 = a_2 = 1$, $c = 0$, $m = 2$ we have the same as the midsquare method.

3.3. Additive congruential method

The additive congruential generator is given by:

$$u_{i+1} = (a_1 u_i + a_2 u_{i-1} + \dots + a_k u_{i-k+1}) \bmod m. \tag{7}$$

A common special case is sometimes used:

$$u_{i+1} = (a_1 u_i + a_2 u_{i-1}) \bmod m. \tag{8}$$

Characteristics:
- divide by $m$ to get U(0, 1);
- maximum period is $m^k$;
- note: rarely used.

Shortcomings, for $k = 2$:
- consider three consecutive numbers $u_{i-2}, u_{i-1}, u_i$;
- we will never get $u_{i-2} < u_i < u_{i-1}$ or $u_{i-1} < u_i < u_{i-2}$ (each of these orderings should appear in 1/6 of all sequences).

3.4. Multiplicative congruential method

The multiplicative congruential generator is given by:

$$u_{i+1} = (a u_i) \bmod m. \tag{9}$$

Characteristics:
- divide by $m$ to get U(0, 1);
- theoretical maximum period is $m$;
- note: rarely used.

Shortcomings:
- can never produce 0.

The choice of $a$, $m$ is very important:
- recommended: $m = 2^p - 1$ with $p = 2, 3, 5, 7, 13, 17, 19, 31, 61$ (Mersenne primes);
- $m = 2^q$, $q \ge 4$, simplifies the calculation of the modulo;
- the practical maximum period is at best no longer than $m/4$.

3.5. Linear congruential method

The linear congruential generator is given by:

$$u_{i+1} = (a u_i + c) \bmod m, \tag{10}$$

where $a$, $c$, $m$ are all positive.

Characteristics:
- divide by $m$ to get U(0, 1);
- maximum period is $m$;
- frequently used.

The choice of $a$, $c$, $m$ is very important. To get the full period $m$ choose:
- $m$ and $c$ with no common divisor other than 1, i.e. $c$ and $m$ relatively prime;
- if $q$ is a prime divisor of $m$ then $a \equiv 1 \pmod q$;
- if 4 is a divisor of $m$ then $a \equiv 1 \pmod 4$.

The step-by-step procedure is as follows:
- set the seed $x_0$;
- multiply $x_0$ by $a$ and add $c$;
- divide the result by $m$;
- the remainder is $x_1$;
- repeat to get $x_2, x_3, \dots$.

Examples:
- $x_0 = 7$, $a = 7$, $c = 7$, $m = 10$ we get: 7, 6, 9, 0, 7, 6, 9, 0, ... (period = 4);
- $x_0 = 1$, $a = 1$, $c = 5$, $m = 13$ we get: 1, 6, 11, 3, 8, 0, 5, 10, 2, 7, 12, 4, 9, 1, ... (period = 13);
- $x_0 = 8$, $a = 2$, $c = 5$, $m = 13$ we get: 8, 8, 8, 8, 8, 8, 8, 8, ... (period = 1!).

Recommended values: $a = 314159269$, $c = 453806245$, $m = 2^{31}$ for a 32-bit machine.
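The step-by-step procedure can be checked against the three examples with a short sketch. The function name `lcg` is an illustrative assumption, not part of the lecture.

```python
def lcg(seed, a, c, m, count):
    """Return [x_0, x_1, ..., x_count] with x_{i+1} = (a * x_i + c) mod m."""
    values = [seed]
    x = seed
    for _ in range(count):
        x = (a * x + c) % m      # multiply, add, keep the remainder
        values.append(x)
    return values

print(lcg(7, 7, 7, 10, 4))    # [7, 6, 9, 0, 7]  -> period 4
print(lcg(8, 2, 5, 13, 3))    # [8, 8, 8, 8]     -> period 1

# Rescaling to (0, 1): divide every state by the modulus m.
uniform = [x / 13 for x in lcg(1, 1, 5, 13, 12)]
```

Only the second example of the slide ($a = 1$, $c = 5$, $m = 13$) satisfies the full-period conditions, which is why it visits all 13 residues.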

Complexity of the algorithm:
- addition, multiplication and division;
- division is slow: to avoid it, set $m$ to the size of the computer word.

Overflow problem when $m$ equals the size of the word:
- the values $a$, $c$ and $m$ are such that the result $a x_i + c$ is greater than the word;
- this may lead to the loss of significant digits, but it does not hurt!

How to deal with it, by example:
- a register can accommodate 2 digits at maximum, so the largest number that can be stored is 99;
- if $m = 100$: for $a = 8$, $u_0 = 2$, $c = 10$ we get $(a u_0 + c) \bmod 100 = 26$;
- if $m = 100$: for $a = 8$, $u_0 = 20$, $c = 10$ we get $a u_0 + c = 170$, so $(a u_0 + c) \bmod 100 = 70$;
- $a u_0 = 8 \cdot 20 = 160$ causes an overflow;
- the first significant digit is lost and the register contains 60;
- the remainder in the register (the result) is $(60 + 10) \bmod 100 = 70$,
- the same as $170 \bmod 100 = 70$.

3.6. How to get a good congruential generator

Characteristics of a good generator:
- should provide maximum density:
  - no large gaps in [0, 1] are produced by the random numbers;
  - problem: each number is discrete;
  - solution: a very large integer for the modulus $m$;
- should provide maximum period:
  - achieve maximum density and avoid cycling;
  - achieved by a proper choice of $a$, $c$, $m$, and $x_0$;
- should be efficient on modern computers:
  - set the modulus to a power of 2.

3.7. Tausworthe generator

Tausworthe generator (a case of linear congruential generator of order $k$):

$$z_i = (a_1 z_{i-1} + a_2 z_{i-2} + \dots + a_k z_{i-k} + c) \bmod 2 = \Big(\sum_{j=1}^{k} a_j z_{i-j} + c\Big) \bmod 2, \tag{11}$$

- where $a_j \in \{0, 1\}$, $j = 1, \dots, k$;
- the output is binary: 0011011101011101000101...

Advantages:
- independent of the system (computer architecture);
- independent of the word size;
- very large periods;
- can be used in composite generators (we consider them in what follows).

Note: there are several bit selection techniques to get numbers.

A way to generate numbers:
- choose an integer $l \le k$;
- split the bit stream into blocks of length $l$ and interpret each block as a binary fraction:

$$u_n = \sum_{j=0}^{l-1} z_{nl+j}\, 2^{-(j+1)}. \tag{12}$$

In practice, only two $a_i$ are used and set to 1, at places $h$ and $k$. We get:

$$z_i = (z_{i-h} + z_{i-k}) \bmod 2. \tag{13}$$

Example:
- $h = 3$, $k = 4$, initial values 1, 1, 1, 1;
- we get: 110101111000100110101111...;
- the period is $2^k - 1 = 15$;
- if $l = 4$: 13/16, 7/16, 8/16, 9/16, 10/16, 15/16, 1/16, 3/16, ...
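A minimal sketch of the two-tap recurrence and the block-to-number step follows. The function names are illustrative assumptions. The sketch writes the seed bits first, so its bit stream is a cyclic shift of the one listed in the example; the period is the same.

```python
def tausworthe_bits(seed_bits, h, k, count):
    """Bits z_0..z_{count-1} with z_i = (z_{i-h} + z_{i-k}) mod 2."""
    bits = list(seed_bits)                   # z_0 .. z_{k-1} are the seed
    while len(bits) < count:
        i = len(bits)
        bits.append((bits[i - h] + bits[i - k]) % 2)
    return bits[:count]

def bits_to_uniform(bits, l):
    """Interpret consecutive blocks of l bits as binary fractions in [0, 1)."""
    numbers = []
    for n in range(len(bits) // l):
        block = bits[n * l:(n + 1) * l]
        numbers.append(sum(z * 2.0 ** -(j + 1) for j, z in enumerate(block)))
    return numbers

stream = tausworthe_bits([1, 1, 1, 1], h=3, k=4, count=30)
print("".join(map(str, stream[:15])))        # 111100010011010
print(bits_to_uniform([1, 1, 0, 1, 0, 1, 1, 1], 4))   # [0.8125, 0.4375]
```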

3.8. Composite generator

Idea: use two generators of low period to obtain another one with a longer period.

The basic principle:
- use the first generator to fill the shuffling table (address, entry (random number));
- use random numbers of the second generator as addresses in the next step;
- each number read at an address is replaced by a new random number of the first generator.

The following algorithm uses one generator to shuffle with itself:
1. create a shuffling table of 100 entries $(i, t_i = \gamma_i)$, $i = 1, 2, \dots, 100$;
2. draw a random number $\gamma_k$ and normalize it to an address $i$ in the range (1, 100);
3. entry $i$ of the table gives the random number $t_i$;
4. draw the next random number $\gamma_{k+1}$ and update $t_i = \gamma_{k+1}$;
5. repeat from step 2.

Note: a table with 100 entries gives fairly good results.

4. Tests for random number generators

What do we want to check:
- independence;
- uniformity.

Important notes:
- the numbers can be treated as random only if the tests are passed;
- recall: the numbers are actually deterministic!

Commonly used tests for independence:
- runs test;
- correlation test.

Commonly used tests for uniformity:
- Kolmogorov's test;
- $\chi^2$ test.

4.1. Independence: runs test

Basic idea:
- compute patterns of numbers (always increase, always decrease, etc.);
- compare to theoretical probabilities.

Figure 2: Illustration of the basic idea.

Do the following:
- consider a sequence of pseudo random numbers $\{u_i, i = 0, 1, \dots, n\}$;
- consider unbroken subsequences in which the numbers are monotonically increasing;
  - such a subsequence is called a run-up;
  - example: 0.78, 0.81, 0.89, 0.81 contains a run-up of length 3;
- count all run-ups of length $i$: $r_i$, $i = 1, 2, 3, 4, 5$;
  - all run-ups of length $i \ge 6$ are grouped into $r_6$;
- calculate:

$$R = \frac{1}{n} \sum_{1 \le i, j \le 6} (r_i - n b_i)(r_j - n b_j)\, a_{ij}, \tag{14}$$

where

$$(b_1, b_2, \dots, b_6) = \left(\frac{1}{6}, \frac{5}{24}, \frac{11}{120}, \frac{19}{720}, \frac{29}{5040}, \frac{1}{840}\right). \tag{15}$$

The coefficients $a_{ij}$ must be chosen as the elements of a fixed $6 \times 6$ matrix (not reproduced here).

The statistic $R$ has a $\chi^2$ distribution:
- number of degrees of freedom: 6;
- requires $n > 4000$.

If $R$ is consistent with this distribution, the observations are i.i.d.

4.2. Independence: correlation test

Basic idea:
- compute the autocorrelation coefficient for lag 1;
- if it is not zero and this result is statistically significant, the numbers are not independent.

Compute the statistic (lag-1 autocorrelation coefficient) as:

$$R = \frac{\sum_{j=1}^{N-1} (u_j - E[u])(u_{j+1} - E[u])}{\sum_{j=1}^{N} (u_j - E[u])^2}. \tag{16}$$

Practice: if $R$ is relatively big there is serial correlation.

Important notes:
- the exact distribution of $R$ is unknown;
- for large $N$, if the $u_j$ are uncorrelated we have $\Pr\{-2/\sqrt{N} \le R \le 2/\sqrt{N}\} \approx 0.95$;
- therefore: reject the hypothesis of no correlation at the 5% level if $R$ is not in $[-2/\sqrt{N}, 2/\sqrt{N}]$.

Notes: other tests for correlation include the Ljung and Box test, the Portmanteau test, etc.
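The lag-1 statistic and the $\pm 2/\sqrt{N}$ rule can be sketched as follows. The function names are illustrative assumptions, and Python's built-in generator is used only as a convenient source of test numbers.

```python
import math
import random

def lag1_autocorrelation(u):
    """Lag-1 autocorrelation coefficient of the sequence u."""
    n = len(u)
    mean = sum(u) / n
    numerator = sum((u[j] - mean) * (u[j + 1] - mean) for j in range(n - 1))
    denominator = sum((x - mean) ** 2 for x in u)
    return numerator / denominator

def looks_uncorrelated(u):
    """True if R lies inside the approximate 5% band [-2/sqrt(N), 2/sqrt(N)]."""
    return abs(lag1_autocorrelation(u)) <= 2.0 / math.sqrt(len(u))

rng = random.Random(42)
sample = [rng.random() for _ in range(10000)]
alternating = [0.0, 1.0] * 50        # strongly (negatively) correlated sequence
```

The alternating sequence gives $R = -0.99$, far outside the band, so the hypothesis of no correlation would be rejected for it.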

4.3. Uniformity: $\chi^2$ test

The algorithm:
- divide [0, 1] into $k$, $k > 100$, non-overlapping intervals;
- compute the frequencies $f_i$ of observations falling in each interval:
  - ensure that there are enough numbers to get $f_i > 5$, $i = 1, 2, \dots, k$;
  - the values $f_i$, $i = 1, 2, \dots, k$, are called observed values;
- if the observations are truly uniformly distributed then:
  - these values should be equal to $r_i = n/k$, $i = 1, 2, \dots, k$;
  - these values are called theoretical values;
- compute the $\chi^2$ statistic for the uniform distribution:

$$\chi^2 = \frac{k}{n} \sum_{i=1}^{k} \left(f_i - \frac{n}{k}\right)^2, \tag{17}$$

which must have a $\chi^2$ distribution with $k - 1$ degrees of freedom.

Hypotheses:
- $H_0$: observations are uniformly distributed;
- $H_1$: observations are not uniformly distributed.

$H_0$ is rejected if:
- the computed value of $\chi^2$ is greater than the one obtained from the tables;
- you should check the entry with $k - 1$ degrees of freedom and $1 - \alpha$ level of significance.
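The statistic of the $\chi^2$ uniformity test can be computed with a few lines. This is a sketch under the assumption that the samples lie in [0, 1]; the function name is illustrative, and the critical value still has to be taken from a $\chi^2$ table with $k - 1$ degrees of freedom.

```python
def chi_square_uniform(samples, k):
    """Chi-square statistic for uniformity on [0, 1] with k equal intervals."""
    n = len(samples)
    counts = [0] * k
    for u in samples:
        index = min(int(u * k), k - 1)   # u == 1.0 goes to the last interval
        counts[index] += 1
    expected = n / k                     # theoretical value r_i = n / k
    return sum((f - expected) ** 2 for f in counts) / expected

evenly_spread = [(i + 0.5) / 1000 for i in range(1000)]
print(chi_square_uniform(evenly_spread, 10))    # 0.0
print(chi_square_uniform([0.05] * 1000, 10))    # 9000.0
```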

4.4. Kolmogorov test

Facts about this test:
- compares the empirical distribution with the theoretical one;
- empirical: $F_N(x)$ is the number of observations smaller than or equal to $x$, divided by $N$;
- theoretical: uniform distribution in (0, 1): $F(x) = x$, $0 < x < 1$.

Hypotheses:
- $H_0$: $F_N(x)$ follows $F(x)$;
- $H_1$: $F_N(x)$ does not follow $F(x)$.

Statistic: maximum absolute difference over the range:

$$R = \max_x |F(x) - F_N(x)|. \tag{18}$$

- if $R > R_\alpha$: $H_0$ is rejected;
- if $R \le R_\alpha$: $H_0$ is accepted.

Note: use tables for $N$ and $\alpha$ (significance level) to find $R_\alpha$.

Example: we got 0.44, 0.81, 0.14, 0.05, 0.93:
- $H_0$: the random numbers follow the uniform distribution;
- we have to compute (a dash marks a negative difference):

| $R_{(j)}$ | 0.05 | 0.14 | 0.44 | 0.81 | 0.93 |
| $j/n$ | 0.20 | 0.40 | 0.60 | 0.80 | 1.00 |
| $j/n - R_{(j)}$ | 0.15 | 0.26 | 0.16 | - | 0.07 |
| $R_{(j)} - (j-1)/n$ | 0.05 | - | 0.04 | 0.21 | 0.13 |

- compute the statistic as $R = \max |F(x) - F_N(x)| = 0.26$;
- from tables: for $\alpha = 0.05$, $R_\alpha = 0.565 > R$;
- $H_0$ is accepted: the random numbers are distributed uniformly in (0, 1).
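The statistic of the worked example can be reproduced with a short sketch. The function name is an illustrative assumption; the critical value $R_\alpha$ is still read from a table.

```python
def kolmogorov_statistic(samples):
    """Maximum distance between the empirical cdf and F(x) = x on (0, 1)."""
    ordered = sorted(samples)
    n = len(ordered)
    # j/n - R_(j), with j = 1..n (enumerate starts at 0, hence j + 1)
    d_plus = max((j + 1) / n - x for j, x in enumerate(ordered))
    # R_(j) - (j-1)/n
    d_minus = max(x - j / n for j, x in enumerate(ordered))
    return max(d_plus, d_minus)

r = kolmogorov_statistic([0.44, 0.81, 0.14, 0.05, 0.93])
print(round(r, 2))    # 0.26, below the tabulated 0.565, so H0 is accepted
```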

4.5. Other tests

The serial test:
- consider pairs $(u_1, u_2), (u_3, u_4), \dots, (u_{2N-1}, u_{2N})$;
- count how many observations fall into $N^2$ different subsquares of the unit square;
- apply the $\chi^2$ test to decide whether they follow the uniform distribution;
- one can formulate an $M$-dimensional version of this test.

The permutation test:
- look at $k$-tuples: $(u_1, \dots, u_k), (u_{k+1}, \dots, u_{2k}), \dots, (u_{(N-1)k+1}, \dots, u_{Nk})$;
- in a $k$-tuple there are $k!$ possible orderings;
- in a $k$-tuple all orderings are equally likely;
- determine the frequencies of orderings in the $k$-tuples;
- apply the $\chi^2$ test to decide whether they follow the uniform distribution.

The gap test:
- let $J$ be some fixed subinterval of (0, 1);
- if we have that $u_{n+j} \notin J$, $0 \le j \le k$, and both $u_{n-1} \in J$ and $u_{n+k+1} \in J$,
- we say that there is a gap of length $k$.

$H_0$: the numbers are independent and uniformly distributed in (0, 1):
- the gap length must be geometrically distributed with some parameter $p$;
- $p$ is the length of the interval $J$:

$$\Pr\{\text{gap of length } k\} = p(1 - p)^k. \tag{19}$$

Practice:
- we observe a large number of gaps, say $N$;
- choose an integer $h$ and count the number of gaps of length $0, 1, \dots, h - 1$ and $\ge h$;
- apply the $\chi^2$ test to decide whether the numbers are independent and follow the uniform distribution.

4.6. Important notes

Some important notes on the seed number:
- do not use seed 0;
- avoid even values;
- do not use the same sequence for different purposes in a single simulation run.

Note: these instructions may not be applicable to a particular generator.

General notes:
- some common generators have been found to be inadequate;
- even if a generator passed the tests, some underlying pattern might still be undetected;
- if the task is important, use a composite generator.

5. Random numbers with arbitrary distribution

Discrete distributions:
- discretization: for any discrete distribution;
- rescaling: for uniform random numbers in (a, b);
- methods for specific distributions.

Continuous distributions:
- inverse transform;
- rejection method;
- composition method;
- methods for specific distributions.

5.1. Discrete distributions: discretization

Consider an arbitrarily distributed discrete RV:

$$\Pr\{X = x_j\} = p_j, \quad j = 0, 1, \dots, \qquad \sum_{j=0}^{\infty} p_j = 1. \tag{20}$$

The following method can be applied:
- generate a uniformly distributed RV;
- use the following to generate the discrete RV:
- this method can be applied to any discrete RV;
- there are some specific methods for specific discrete RVs.

Figure 3: Illustration of the proposed approach.

The step-by-step procedure:
- compute the probabilities $p_i = \Pr\{X = x_i\}$, $i = 0, 1, \dots$;
- generate RV $u$ with U(0, 1);
- if $u < p_0$, set $X = x_0$ and stop;
- otherwise, if $u < p_0 + p_1$, set $X = x_1$ and stop;
- otherwise, if $u < p_0 + p_1 + p_2$, set $X = x_2$ and stop;
- ...

Note the following:
- this is the inverse transform method for discrete RVs:
  - we determine the value of $u$;
  - we determine the interval $[F(x_{i-1}), F(x_i)]$ in which it lies;
- the complexity depends on the number of intervals to be searched.

Example: $p_1 = 0.2$, $p_2 = 0.1$, $p_3 = 0.25$, $p_4 = 0.45$: determine a generator for $\Pr\{X = j\} = p_j$.

Algorithm 1:
- generate $u = U(0, 1)$;
- if $u < 0.2$, set $X = 1$, return;
- if $u < 0.3$, set $X = 2$, return;
- if $u < 0.55$, set $X = 3$, return;
- set $X = 4$.

Algorithm 2 (more effective):
- generate $u = U(0, 1)$;
- if $u < 0.45$, set $X = 4$, return;
- if $u < 0.7$, set $X = 3$, return;
- if $u < 0.9$, set $X = 1$, return;
- set $X = 2$.
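Both algorithms of the example are the same cumulative search applied to a different ordering of the values, which the following sketch makes explicit. The function name `discrete_sample` is an illustrative assumption.

```python
from itertools import accumulate

def discrete_sample(values, probs, u):
    """Return the first value whose cumulative probability exceeds u."""
    for x, cumulative in zip(values, accumulate(probs)):
        if u < cumulative:
            return x
    return values[-1]            # guards against rounding in the last sum

# Algorithm 1: natural order of the values.
order1 = ([1, 2, 3, 4], [0.2, 0.1, 0.25, 0.45])
# Algorithm 2: values sorted by decreasing probability.
order2 = ([4, 3, 1, 2], [0.45, 0.25, 0.2, 0.1])

print(discrete_sample(*order1, u=0.5))    # 3
print(discrete_sample(*order2, u=0.5))    # 3
```

Algorithm 2 is more effective because the likely values are tested first: the expected number of comparisons drops from 2.5 to 1.85 for these probabilities.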

5.2. Example of discretization: Poisson RV

A Poisson RV has the following distribution:

$$p_i = \Pr\{X = i\} = \frac{\lambda^i e^{-\lambda}}{i!}, \quad i = 0, 1, \dots. \tag{21}$$

We use the property:

$$p_{i+1} = \frac{\lambda}{i + 1}\, p_i, \quad i = 0, 1, \dots. \tag{22}$$

The algorithm:
1. generate $u = U(0, 1)$;
2. $i = 0$, $p = e^{-\lambda}$, $F = p$;
3. if $u < F$, set $X = i$ and stop;
4. $p = \lambda p/(i + 1)$, $F = F + p$, $i = i + 1$;
5. go to step 3.
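The five steps translate almost line by line into code. In this sketch the function name and the extra `p > 0` guard (which stops the loop if rounding keeps `F` just below `u`) are assumptions added for illustration.

```python
import math
import random

def poisson_sample(lam, u):
    """Poisson variate from one uniform number u, using p_{i+1} = lam/(i+1) * p_i."""
    i = 0
    p = math.exp(-lam)           # p_0
    F = p                        # running cdf value
    while u >= F and p > 0.0:    # p > 0 guards against floating-point stalls
        p = lam * p / (i + 1)
        F += p
        i += 1
    return i

rng = random.Random(7)
draws = [poisson_sample(2.0, rng.random()) for _ in range(20000)]
mean = sum(draws) / len(draws)   # should be close to lam = 2
```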

5.3. Example of discretization: binomial RV

A binomial RV has the following distribution:

$$p_i = \Pr\{X = i\} = \frac{n!}{i!(n - i)!}\, p^i (1 - p)^{n-i}, \quad i = 0, 1, \dots, n. \tag{23}$$

We are going to use the following property:

$$p_{i+1} = \frac{n - i}{i + 1} \cdot \frac{p}{1 - p}\, p_i, \quad i = 0, 1, \dots. \tag{24}$$

The algorithm:
1. generate $u = U(0, 1)$;
2. $c = p/(1 - p)$, $i = 0$, $d = (1 - p)^n$, $F = d$;
3. if $u < F$, set $X = i$ and stop;
4. $d = [c(n - i)/(i + 1)]d$, $F = F + d$, $i = i + 1$;
5. go to step 3.
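A sketch of the binomial algorithm, assuming $0 < p < 1$. The function name and the `i < n` bound (which ends the search at the largest possible value) are illustrative additions.

```python
import random

def binomial_sample(n, p, u):
    """Binomial(n, p) variate from one uniform number u; assumes 0 < p < 1."""
    c = p / (1.0 - p)
    i = 0
    d = (1.0 - p) ** n           # p_0
    F = d                        # running cdf value
    while u >= F and i < n:
        d = c * (n - i) / (i + 1) * d    # p_{i+1} from p_i
        F += d
        i += 1
    return i

rng = random.Random(11)
draws = [binomial_sample(10, 0.3, rng.random()) for _ in range(20000)]
mean = sum(draws) / len(draws)   # should be close to n * p = 3
```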

5.4. Continuous distributions: inverse transform method

Inverse transform method:
- applicable only when the cdf can be inverted analytically;
- works for a number of distributions: exponential, uniform, Weibull, etc.

Assume:
- we would like to generate numbers with pdf $f(x)$ and cdf $F(x)$;
- recall, $F(x)$ takes values in [0, 1].

The generic algorithm:
- generate $u = U(0, 1)$;
- set $F(x) = u$;
- find $x = F^{-1}(u)$, where $F^{-1}(\cdot)$ is the inverse transformation of $F(\cdot)$.

Example:
- we want to generate numbers from the following pdf: $f(x) = 2x$, $0 \le x \le 1$;
- calculate the cdf as follows:

$$F(x) = \int_0^x 2t\,dt = x^2, \quad 0 \le x \le 1. \tag{25}$$

- let $u$ be the random number; we have $u = x^2$ or $x = \sqrt{u}$;
- get the random number.

5.5. Inverse transform method: uniform continuous distribution

The uniform continuous distribution has the following pdf and cdf:

$$f(x) = \begin{cases} \dfrac{1}{b - a}, & a < x < b, \\ 0, & \text{otherwise}, \end{cases} \qquad F(x) = \begin{cases} 0, & x \le a, \\ \dfrac{x - a}{b - a}, & a < x < b, \\ 1, & x \ge b. \end{cases} \tag{26}$$

The algorithm:
- generate $u = U(0, 1)$;
- set $u = F(x) = (x - a)/(b - a)$;
- solve to get $x = a + (b - a)u$.

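5.6. Inverse transform method: exponential distribution

The exponential distribution has the following pdf and cdf:

$$f(x) = \lambda e^{-\lambda x}, \qquad F(x) = 1 - e^{-\lambda x}, \qquad \lambda > 0,\ x \ge 0. \tag{27}$$

The algorithm:
- generate $u = U(0, 1)$;
- setting $u = F(x) = 1 - e^{-\lambda x}$ gives $x = -(1/\lambda) \log(1 - u)$;
- since $1 - u$ is also U(0, 1), we can equivalently set $u = e^{-\lambda x}$;
- solve to get $x = -(1/\lambda) \log u$.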
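A minimal sketch of the exponential generator. The function name is an illustrative assumption; `1.0 - rng.random()` is used so that the argument of the logarithm lies in (0, 1] and is never zero.

```python
import math
import random

def exponential_sample(lam, u):
    """Exponential variate with rate lam from a uniform number u in (0, 1]."""
    return -math.log(u) / lam

rng = random.Random(5)
# random() returns values in [0, 1); 1 - random() lies in (0, 1], so log is safe.
draws = [exponential_sample(2.0, 1.0 - rng.random()) for _ in range(20000)]
mean = sum(draws) / len(draws)   # should be close to 1 / lam = 0.5
```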

5.7. Inverse transform method: Erlang distribution

Erlang distribution: convolution of $k$ exponential distributions.

The algorithm:
- generate $k$ numbers $u_i = U(0, 1)$, $i = 1, \dots, k$;
- the Erlang variable is the sum of exponential variables $x_1, \dots, x_k$, each with mean $1/\lambda$;
- solve to get:

$$x = \sum_{i=1}^{k} x_i = -\frac{1}{\lambda} \sum_{i=1}^{k} \log u_i = -\frac{1}{\lambda} \log \prod_{i=1}^{k} u_i. \tag{28}$$
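The product form means that only one logarithm is needed per Erlang variate, as in this sketch. The function name and the `rng` argument are illustrative assumptions.

```python
import math
import random

def erlang_sample(k, lam, rng):
    """Erlang(k, lam) variate as -(1/lam) * log of a product of k uniforms."""
    product = 1.0
    for _ in range(k):
        product *= 1.0 - rng.random()    # uniform in (0, 1], never zero
    return -math.log(product) / lam

rng = random.Random(9)
draws = [erlang_sample(3, 2.0, rng) for _ in range(20000)]
mean = sum(draws) / len(draws)   # should be close to k / lam = 1.5
```

For large $k$ the product can underflow; summing the logarithms instead avoids that at the cost of $k$ calls to `log`.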

5.8. Specific methods: normal distribution

The normal distribution has the following pdf:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\frac{(x - \mu)^2}{\sigma^2}}, \quad -\infty < x < \infty, \tag{29}$$

where $\sigma$ and $\mu$ are the standard deviation and the mean.

The standard normal distribution (RV $Z = (X - \mu)/\sigma$) has the following pdf:

$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}, \quad -\infty < z < \infty, \quad \text{where } \mu = 0,\ \sigma = 1. \tag{30}$$

Central limit theorem:
- if $x_1, x_2, \dots, x_n$ are independent with $E[x_i] = \mu$, $\sigma^2[x_i] = \sigma^2$, $i = 1, 2, \dots, n$;
- their sum approaches the normal distribution as $n \to \infty$, with $E[\sum x_i] = n\mu$, $\sigma^2[\sum x_i] = n\sigma^2$.

The approach:
- generate $k$ random numbers $u_i = U(0, 1)$, $i = 0, 1, \dots, k - 1$;
- each random number has $E[u_i] = (0 + 1)/2 = 1/2$, $\sigma^2[u_i] = (1 - 0)^2/12 = 1/12$;
- the sum of these numbers approximately follows the normal distribution:

$$\sum u_i \sim N\left(\frac{k}{2}, \frac{k}{12}\right), \quad \text{or} \quad \frac{\sum u_i - k/2}{\sqrt{k/12}} \sim N(0, 1). \tag{31}$$

- if the RV we want to generate is $x$ with mean $\mu$ and standard deviation $\sigma$:

$$\frac{x - \mu}{\sigma} \sim N(0, 1). \tag{32}$$

- finally (note that $k$ should be at least 10):

$$\frac{x - \mu}{\sigma} = \frac{\sum u_i - k/2}{\sqrt{k/12}}, \quad \text{or} \quad x = \sigma\sqrt{\frac{12}{k}}\left(\sum u_i - \frac{k}{2}\right) + \mu. \tag{33}$$
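A sketch of the central-limit approach. The function name is an illustrative assumption, and $k = 12$ is chosen here because it makes the factor $\sqrt{12/k}$ equal to 1; any $k \ge 10$ fits the note above.

```python
import math
import random

def normal_clt(mu, sigma, rng, k=12):
    """Approximately N(mu, sigma^2) variate from the sum of k uniforms."""
    total = sum(rng.random() for _ in range(k))
    return sigma * math.sqrt(12.0 / k) * (total - k / 2.0) + mu

rng = random.Random(13)
draws = [normal_clt(5.0, 2.0, rng) for _ in range(20000)]
mean = sum(draws) / len(draws)
std = math.sqrt(sum((x - mean) ** 2 for x in draws) / len(draws))
```

The result is only approximately normal: its values are confined to $\mu \pm \sigma\sqrt{3k}$, so the far tails are missing.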

5.9. Specific method: empirical continuous distributions

Assume we have a histogram:
- $x_i$ is the midpoint of the interval $i$;
- $f(x_i)$ is the height of the $i$th rectangle.

Note: the task is different from sampling from a discrete distribution.

Construct the cdf as follows:

$$F(x_i) = \sum_{k=1}^{i} f(x_k), \tag{34}$$

which is monotonically increasing within each interval $[F(x_{i-1}), F(x_i)]$.

The algorithm:
- generate $u = U(0, 1)$;
- assume that $u \in [F(x_{i-1}), F(x_i)]$;
- use the following linear interpolation to get:

$$x = x_{i-1} + (x_i - x_{i-1})\, \frac{u - F(x_{i-1})}{F(x_i) - F(x_{i-1})}. \tag{35}$$

Note: this approach can also be used for analytical continuous distributions:
- get $(x_i, f(x_i))$, $i = 1, 2, \dots, k$, and follow the procedure.
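The interpolation step can be sketched as follows. The function name and the input layout are assumptions made for illustration: `xs` holds the points $x_0, \dots, x_k$ and `cdf` the values $F(x_0) = 0, \dots, F(x_k) = 1$, assumed strictly increasing.

```python
def empirical_sample(xs, cdf, u):
    """Invert a piecewise-linear cdf given by points (xs[i], cdf[i])."""
    for i in range(1, len(xs)):
        if u <= cdf[i]:                      # u lies in [F(x_{i-1}), F(x_i)]
            fraction = (u - cdf[i - 1]) / (cdf[i] - cdf[i - 1])
            return xs[i - 1] + (xs[i] - xs[i - 1]) * fraction
    return xs[-1]                            # u above the last cdf value

# Two equally likely intervals [0, 1] and [1, 2]:
print(empirical_sample([0.0, 1.0, 2.0], [0.0, 0.5, 1.0], 0.25))   # 0.5
print(empirical_sample([0.0, 1.0, 2.0], [0.0, 0.5, 1.0], 0.75))   # 1.5
```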

5.10. Rejection method

Works when:
- the pdf $f(x)$ is bounded;
- $x$ has a finite range, say $a \le x \le b$.

The basic steps are:
- normalize the range of $f(x)$ by a scale factor $c$ such that $c f(x) \le 1$, $a \le x \le b$;
- define $x$ as a linear function of $u = U(0, 1)$, i.e. $x = a + (b - a)u$;
- generate pairs of random numbers $(u_1, u_2)$, $u_1, u_2 = U(0, 1)$;
- accept the pair and use $x = a + (b - a)u_1$ whenever:
  - the pair satisfies $u_2 \le c f(a + (b - a)u_1)$;
  - meaning that the pair $(x, u_2)$ falls under the curve of $c f(x)$.

The underlying idea:
- $\Pr\{u_2 \le c f(x)\} = c f(x)$;
- if $x$ is chosen at random from $(a, b)$:
  - we reject if $u_2 > c f(x)$;
  - we accept if $u_2 \le c f(x)$;
- the accepted values of $x$ then match $f(x)$.

Example: generate numbers from $f(x) = 2x$, $0 \le x \le 1$:
1. select $c$ such that $c f(x) \le 1$: for example, $c = 0.5$;
2. generate $u_1$ and set $x = u_1$;
3. generate $u_2$:
   - if $u_2 < c f(u_1) = (0.5) \cdot 2u_1 = u_1$ then accept $x$;
   - otherwise go back to step 2.
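The rejection steps can be sketched for a general bounded pdf on $[a, b]$ and then applied to the example. The function name and signature are illustrative assumptions.

```python
import random

def rejection_sample(pdf, a, b, c, rng):
    """Draw one value from pdf on [a, b]; c must satisfy c * pdf(x) <= 1."""
    while True:
        u1, u2 = rng.random(), rng.random()
        x = a + (b - a) * u1             # candidate, uniform on (a, b)
        if u2 <= c * pdf(x):             # point (x, u2) lies under c * f(x)
            return x

rng = random.Random(3)
draws = [rejection_sample(lambda x: 2.0 * x, 0.0, 1.0, 0.5, rng)
         for _ in range(20000)]
mean = sum(draws) / len(draws)   # E[X] = 2/3 for f(x) = 2x on [0, 1]
```

For this example half of the candidate pairs are rejected on average, since the area under $c f(x)$ is 0.5.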

5.11. Convolution method

The basis of the method is the following representation of the cdf $F(x)$:

$$F(x) = \sum_{j=1}^{\infty} p_j F_j(x), \tag{36}$$

with $p_j \ge 0$, $j = 1, 2, \dots$, and $\sum_{j=1}^{\infty} p_j = 1$.

Works when:
- it is easier to generate RVs with distribution $F_j(x)$ than with $F(x)$;
- hyperexponential RV;
- Erlang RV.

The algorithm:
1. generate a discrete RV $J$ with $\Pr\{J = j\} = p_j$;
2. given $J = j$, generate an RV with cdf $F_j(x)$;
3. the resulting RV has cdf $\sum_{j=1}^{\infty} p_j F_j(x)$.

Example: generate from the exponential distribution (with rate 1):
- divide $(0, \infty)$ into intervals $(i, i + 1)$, $i = 0, 1, \dots$;
- the probabilities of the intervals are given by

$$p_i = \Pr\{i \le X < i + 1\} = e^{-i} - e^{-(i+1)} = e^{-i}(1 - e^{-1}), \tag{37}$$

  which gives a geometric distribution;
- the conditional pdfs are given by

$$f_i(x) = e^{-(x - i)}/(1 - e^{-1}), \quad i \le x < i + 1; \tag{38}$$

- in the interval $i$, $(x - i)$ has the pdf $e^{-x}/(1 - e^{-1})$, $0 \le x < 1$.

The algorithm:
- get $I$ from the geometric distribution $p_i = e^{-i}(1 - e^{-1})$, $i = 0, 1, \dots$;
- get $Y$ from $e^{-x}/(1 - e^{-1})$, $0 \le x < 1$;
- $X = I + Y$.
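The three steps of the example can be sketched as follows. How $I$ and $Y$ are drawn is not specified above; this sketch assumes a cumulative search for $I$ and the inverse transform $y = -\log(1 - u(1 - e^{-1}))$ for $Y$. The function name is illustrative.

```python
import math
import random

def exponential_by_composition(rng):
    """Exponential(1) variate built as X = I + Y."""
    q = 1.0 - math.exp(-1.0)
    # Step 1: I with Pr{I = i} = e^{-i} * (1 - e^{-1}), by cumulative search.
    u = rng.random()
    i, p = 0, q
    F = p
    while u >= F and p > 0.0:            # p > 0 guards against rounding stalls
        p *= math.exp(-1.0)
        F += p
        i += 1
    # Step 2: Y on [0, 1) with pdf e^{-y} / (1 - e^{-1}), by inverse transform
    # (an assumption of this sketch).
    y = -math.log(1.0 - rng.random() * q)
    # Step 3: combine.
    return i + y

rng = random.Random(17)
draws = [exponential_by_composition(rng) for _ in range(20000)]
mean = sum(draws) / len(draws)   # should be close to 1
```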

6. Statistical tests for RNs with arbitrary distribution

What we have to test for:
- independence;
- particular distribution.

Tests for independence:
- correlation tests: Portmanteau test, modified Portmanteau test, $\pm 2/\sqrt{n}$, etc.;
- note: here we test only for linear dependence...

Tests for distribution:
- $\chi^2$ test;
- Kolmogorov's test.

7. Multi-dimensional distributions

Task: generate samples from the RV $(X_1, X_2, \dots, X_n)$.

Write the joint density function as:

$$f(x_1, x_2, \dots, x_n) = f_1(x_1)\, f_2(x_2 | x_1) \cdots f_n(x_n | x_1, \dots, x_{n-1}), \tag{39}$$

- $f_1(x_1)$ is the marginal distribution of $X_1$;
- $f_k(x_k | x_1, \dots, x_{k-1})$ is the conditional pdf of $X_k$ given $X_1 = x_1, \dots, X_{k-1} = x_{k-1}$.

The basic idea: generate one number at a time:
- get $x_1$ from $f_1(x_1)$;
- get $x_2$ from $f_2(x_2 | x_1)$, etc.

The algorithm:
- get $n$ random numbers $u_i = U(0, 1)$, $i = 1, \dots, n$;
- subsequently get the following RVs:

$$F_1(X_1) = u_1, \quad F_2(X_2 | X_1) = u_2, \quad \dots, \quad F_n(X_n | X_1, \dots, X_{n-1}) = u_n. \tag{40}$$

Example: generate from $f(x, y) = x + y$, $0 \le x, y \le 1$:
- the marginal pdf and cdf of $X$ are given by:

$$f(x) = \int_0^1 f(x, y)\,dy = x + \frac{1}{2}, \qquad F(x) = \int_0^x f(x')\,dx' = \frac{1}{2}(x^2 + x). \tag{41}$$

- the conditional pdf and cdf of $Y$ are given by:

$$f(y | x) = \frac{f(x, y)}{f(x)} = \frac{x + y}{x + \frac{1}{2}}, \qquad F(y | x) = \int_0^y f(y' | x)\,dy' = \frac{xy + \frac{1}{2}y^2}{x + \frac{1}{2}}. \tag{42}$$

- by inversion we get:

$$x = \frac{1}{2}\left(\sqrt{8u_1 + 1} - 1\right), \qquad y = \sqrt{x^2 + u_2(1 + 2x)} - x. \tag{43}$$
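The two inversion formulas of the example can be checked numerically with a short sketch. The function name is an illustrative assumption.

```python
import math
import random

def sample_xy(u1, u2):
    """Pair (x, y) with joint pdf f(x, y) = x + y on the unit square."""
    x = 0.5 * (math.sqrt(8.0 * u1 + 1.0) - 1.0)           # solves F(x) = u1
    y = math.sqrt(x * x + u2 * (1.0 + 2.0 * x)) - x       # solves F(y | x) = u2
    return x, y

rng = random.Random(21)
pairs = [sample_xy(rng.random(), rng.random()) for _ in range(20000)]
mean_x = sum(x for x, _ in pairs) / len(pairs)
```

The sample mean of $x$ can be compared with $E[X] = \int_0^1 x(x + \tfrac{1}{2})\,dx = 7/12$.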