Random Variables, Independence and Covariance

Jack Xin (Lecture) and J. Ernie Esser (Lab)

Department of Mathematics, UCI, Irvine, CA 92617.

Abstract: Class notes on random variables, (joint) distributions, density functions, and covariance.

1 Basic Notions and Examples

Consider throwing a die: there are 6 possible outcomes, denoted ω_i, i = 1, ..., 6. The set of all outcomes, Ω = {ω_1, ..., ω_6}, is called the sample space. A subset of Ω, e.g. A = {ω_2, ω_4, ω_6}, is called an event. Suppose we repeat the die experiment N times and event A happens N_A times; then the probability of event A is:

P(A) = lim_{N→∞} N_A / N.

For a fair die and the event A above, P(A) = 1/2.

Events E and F are independent if:

P(E and F both occur) = P(E) P(F).

The conditional probability P(E|F) (the probability that E occurs given that F has occurred) is given by:

P(E|F) = P(E and F both occur) / P(F).

A random variable (r.v.) X(ω) is a function Ω → R, described by its distribution function:

F_X(x) = P(X(ω) ≤ x),    (1.1)
which satisfies:

(1) lim_{x→−∞} F_X(x) = 0, lim_{x→+∞} F_X(x) = 1;
(2) F_X(x) is nondecreasing and right continuous in x;
(3) F_X(x−) = P(X < x);
(4) P(X = x) = F_X(x) − F_X(x−).

Conversely, if F satisfies (1)-(3), it is the distribution function of some r.v. When F_X is smooth enough, there is a density function p(x) such that:

F_X(x) = ∫_{−∞}^{x} p(y) dy.

Examples of probability density functions (PDFs):

(1) Uniform distribution on [a, b]:

p(x) = χ_{[a,b]}(x)/(b − a);

(2) Gaussian (normal) distribution (σ > 0; σ = 1 gives the unit or standard Gaussian):

p(x) = (2πσ^2)^{−1/2} e^{−x^2/(2σ^2)};

(3) Laplace distribution (a > 0):

p(x) = (1/(2a)) e^{−|x|/a}.

Examples (discrete r.v.):

(1) A two-point r.v., taking x_1 with probability p ∈ (0, 1) and x_2 > x_1 with probability 1 − p, has distribution function:

F_X(x) = 0 for x < x_1;  p for x ∈ [x_1, x_2);  1 for x ≥ x_2.
(2) Poisson distribution with parameter λ > 0:

p_n = P(X = n) = λ^n e^{−λ}/n!,  n = 0, 1, 2, ....

The mean of a r.v. is:

µ = E(X) = Σ_{j=1}^{N} x_j p_j

in the discrete case, and:

µ = E(X) = ∫_{R} x p(x) dx

in the continuous case. The variance is:

σ^2 = Var(X) = E((X − µ)^2);

σ is called the standard deviation.

2 Joint Distribution

For n r.v.'s X_1, X_2, ..., X_n, the joint distribution function is:

F_{X_1,...,X_n}(x_1, ..., x_n) = P({X_i(ω) ≤ x_i, i = 1, 2, ..., n}).

For n = 2: F_{X_1,X_2} → 0 as either x_i → −∞; F_{X_1,X_2} → 1 as x_1, x_2 → +∞; and F_{X_1,X_2} is nondecreasing and right continuous in x_1 and x_2. The marginal distribution F_{X_1} is:

F_{X_1}(x_1) = lim_{x_2→∞} F_{X_1,X_2}(x_1, x_2).
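The discrete mean and variance formulas above can be checked numerically for the Poisson distribution, whose mean and variance both equal λ. A minimal sketch in Python (the truncation point n_max and the choice λ = 3 are illustrative assumptions, not from the notes):

```python
from math import exp

def poisson_moments(lam, n_max=100):
    # p_n = lam^n e^{-lam}/n!, built recursively via p_{n+1} = p_n * lam/(n+1)
    pn = exp(-lam)   # p_0
    mu = 0.0         # accumulates sum n p_n
    ex2 = 0.0        # accumulates sum n^2 p_n
    for n in range(n_max + 1):
        mu += n * pn
        ex2 += n * n * pn
        pn *= lam / (n + 1)
    return mu, ex2 - mu * mu  # mean, and variance E(X^2) - mu^2

mu, var = poisson_moments(3.0)  # both approximately 3.0
```

Both outputs agree with λ up to a negligible truncation error, consistent with Var(X) = E((X − µ)^2) = E(X^2) − µ^2.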
For continuous r.v.'s:

F_{X_1,X_2}(x_1, x_2) = ∫_{−∞}^{x_1} ∫_{−∞}^{x_2} p(y_1, y_2) dy_2 dy_1,

where p ≥ 0 is the density function. A joint Gaussian with mean µ = (µ_1, µ_2) and covariance matrix C = (c_{ij}), c_{ij} = E[(X_i − µ_i)(X_j − µ_j)], has density function:

p(x_1, x_2) = (1/(2π √det(C))) exp{−(1/2) Σ_{i,j=1}^{2} c^{i,j} (x_i − µ_i)(x_j − µ_j)},   (2.2)

where the matrix (c^{i,j}) is the inverse of the covariance matrix C.

Independence:

F_{X_1,X_2}(x_1, x_2) = F_{X_1}(x_1) F_{X_2}(x_2),  p(x_1, x_2) = p_1(x_1) p_2(x_2).

3 Random Number Generators

On digital computers, pseudo-random numbers are used as approximations of random numbers. A common algorithm is the linear recursive scheme:

X_{n+1} = a X_n (mod c),   (3.3)

with a and c relatively prime positive integers and initial value (seed) X_0. The numbers:

U_n = X_n / c

are approximately uniformly distributed over [0, 1]. Usually c is a large power of 2, and a is a large integer relatively prime to c.
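Scheme (3.3) can be sketched in a few lines of Python. The constants a and c below are illustrative choices (c a power of 2, a odd and hence relatively prime to c), not values prescribed in the notes:

```python
def lcg_uniforms(seed, n, a=1103515245, c=2**31):
    # X_{n+1} = a * X_n (mod c); U_n = X_n / c approximates Uniform[0,1]
    us = []
    x = seed
    for _ in range(n):
        x = (a * x) % c
        us.append(x / c)
    return us

us = lcg_uniforms(seed=12345, n=1000)
```

The sequence is deterministic: rerunning with the same seed reproduces it exactly, which is the sense in which these numbers are pseudo-random rather than random.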
The Matlab command rand(m,n) generates an m × n matrix with pseudo-random entries uniformly distributed on (0, 1) (c = 2^1492), using the current state. S = rand('state') is a 35-element vector containing the current state of the uniform generator. rand('state',0) resets the generator to its initial state, and rand('state',J), for integer J, resets the generator to its J-th state. Similarly, randn(m,n) generates an m × n matrix with pseudo-random entries that are standard-normally distributed (unit Gaussian).

Example: one way to visualize the generated random numbers is:

t = (0:0.01:1)';
rand('state',0); y1 = rand(size(t));
randn('state',0); y2 = randn(size(t));
plot(t,y1,'b',t,y2,'g')

A two-point r.v. can be generated from a uniformly distributed r.v. U ∈ [0, 1] as:

X = x_1 if U ∈ [0, p],  X = x_2 if U ∈ (p, 1].

A continuous r.v. with distribution function F_X can be generated from U as X = F_X^{−1}(U) if F_X^{−1} exists, or more generally:

X = inf{x : U ≤ F_X(x)}.

This is called the inverse transform method. Applied to the exponential distribution, it gives:

X = −ln(1 − U)/λ,  U ∈ (0, 1).
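The inverse transform step for the exponential distribution, X = −ln(1 − U)/λ, can be sketched in Python as follows; we check the sample mean against the exact mean 1/λ (the sample size and λ = 0.3 are illustrative assumptions):

```python
import random
from math import log

def sample_exponential(lam, n, seed=0):
    # inverse transform: if U ~ Uniform[0,1), then -ln(1-U)/lam is exponential(lam)
    rng = random.Random(seed)
    return [-log(1.0 - rng.random()) / lam for _ in range(n)]

xs = sample_exponential(lam=0.3, n=100_000)
sample_mean = sum(xs) / len(xs)  # close to 1/0.3
```

Note that rng.random() returns values in [0, 1), so 1 − U ∈ (0, 1] and the logarithm is always defined.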
4 Matlab Exercises

Exercise 1: Generate 100 uniformly distributed random variables, X = rand(1,100), and plot the histogram with 10 bins: hist(X,10). This bins the elements of X into 10 equally spaced containers (non-overlapping intervals of length 1/10) and returns the number of elements in each container. Next, generate 10^5 uniformly distributed random variables by rand(1,10^5), plot the histogram hist(X,100), and comment on the difference. What density function is the histogram approximating?

Exercise 2: Repeat Exercise 1 for Gaussian r.v.'s generated by X = randn(1,10^5), with a plot of the histogram hist(X,100).

Exercise 3: Let X = rand(1,10^5) and Y = −ln(1 − X)/0.3. Find the density function of Y and compare it with hist(Y,100).

Exercise 4-I: Let Z = (N_1, N_2), where N_1 and N_2 are independent zero-mean unit Gaussian random variables. Let S be an invertible 2 × 2 matrix; show that X = S^T Z is jointly Gaussian with zero mean and covariance matrix S^T S.

Exercise 4-II: Write a program to generate a pair of Gaussian random numbers (X_1, X_2) with zero mean and covariances E(X_1^2) = 1, E(X_2^2) = 1/3, E(X_1 X_2) = 1/2. Generate 1000 pairs of such numbers and evaluate their sample averages and sample covariances.

5 Stochastic Processes

A sequence of r.v.'s X_1, X_2, ..., X_n, ... occurring at discrete times t_1 < t_2 < ... < t_n < ... is called a discrete stochastic process, with
joint distributions F_{X_{i_1}, X_{i_2}, ..., X_{i_j}}, i_j = 1, 2, ..., as its probability law.

Gaussian process: all joint distributions are Gaussian.

Continuous stochastic process: X(t) = X(t, ω), t ∈ [0, 1], is a function of two variables, X : [0, 1] × Ω → R, where X is a r.v. for each t; for each ω, we have a sample path (a realization, or trajectory) of the process.

Statistical quantities: µ(t) = E(X(t)), σ^2(t) = Var(X(t)); the covariance is:

C(s, t) = E((X(s) − µ(s))(X(t) − µ(t))), for s ≤ t.

A process has independent increments if X(t_{j+1}) − X(t_j), j = 0, 1, 2, ..., are all independent r.v.'s.

Standard Wiener process (Brownian motion, B.M.): a Gaussian process W(t), t ≥ 0, with independent increments, such that:

W(0) = 0, E(W(t)) = 0, Var(W(t) − W(s)) = t − s, for all s ∈ [0, t].

B.M. covariance: C(s, t) = min(s, t).

Stationary process: all joint distributions are translation invariant. Stationary Gaussian process: the covariance is translation invariant. The B.M. covariance C(s, t) = min(s, t) is not translation invariant, so B.M. is not stationary.
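The covariance C(s, t) = min(s, t) can be checked by Monte Carlo: simulate discrete Brownian paths as cumulative sums of independent N(0, δt) increments and average W(s)W(t) over many paths. A sketch in Python, assuming s ≤ t (the grid size and path count are illustrative assumptions):

```python
import random
from math import sqrt

def bm_cov_estimate(s, t, n_paths, n_steps=100, seed=0):
    # estimates E[W(s) W(t)] for s <= t; should be close to min(s,t) = s
    rng = random.Random(seed)
    dt = 1.0 / n_steps
    ks, kt = round(s / dt), round(t / dt)
    acc = 0.0
    for _ in range(n_paths):
        w = 0.0
        ws = 0.0
        for k in range(1, kt + 1):
            w += sqrt(dt) * rng.gauss(0.0, 1.0)  # independent N(0, dt) increment
            if k == ks:
                ws = w  # record W(s)
        acc += ws * w   # w equals W(t) once the loop finishes
    return acc / n_paths

est = bm_cov_estimate(0.3, 0.7, n_paths=20_000)  # close to min(0.3, 0.7) = 0.3
```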
6 Random Walk and B.M.

Divide the time interval [0, 1] into N subintervals [t_i, t_{i+1}], i = 0, 1, ..., N − 1, of equal length δt = 1/N. Consider a walker making steps ±√δt, with probability 1/2 each, starting from x = 0. After n steps, the walker's location is:

S_N(t_n) = √δt Σ_{i=1}^{n} X_i,   (6.4)

where the X_i are independent two-point r.v.'s taking ±1 with equal probability. Define a piecewise constant function:

S_N(t) = S_N(t_n), t ∈ [t_n, t_{n+1}), n ≤ N − 1.

S_N has independent increments X_1 √δt, X_2 √δt, etc. over the given subintervals, and in the limit N → ∞ it tends to a process with independent increments. Moreover:

E(S_N(t)) = 0, Var(S_N(t)) = δt ⌊t/δt⌋.

In the limit N → ∞, Var(S_N(t)) → t. Applying the Central Limit Theorem, the approximating process S_N(t) converges in law to a Gaussian process with independent increments, zero mean, and variance t, i.e. to B.M.

Using S_N(t) is also a way to construct B.M. numerically. The X_i's are generated from U ~ U(0, 1) as: X_i = −1 if U ∈ [0, 1/2], X_i = 1 if U ∈ (1/2, 1]. A shortcut is to replace the two-point X_i's by i.i.d. unit Gaussian r.v.'s. Try the two-line Matlab code below to generate a B.M. sample path:

randn('state',0); N=1e4; dt=1/N;
w=sqrt(dt)*cumsum([0;randn(N,1)]); plot([0:dt:1],w);
cumsum computes the cumulative sum of a vector input, so w above holds the partial sums of the scaled Gaussian increments. Change the state number from 0 to 10 (or a larger number) to see different sample paths (Figure 1).

Figure 1: Four Sample Paths of a Numerical Approximation of Brownian Motion on [0,1].
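The random-walk construction (6.4) can also be sketched in Python with ±1 steps; we check the sample variance at t = 1 against Var(S_N(1)) = N δt = 1. This is an illustrative equivalent of the Matlab code above; the step and path counts are assumptions:

```python
import random
from math import sqrt

def walk_endpoints(n_steps, n_paths, seed=0):
    # S_N(1) = sqrt(dt) * (sum of n_steps independent +/-1 steps), dt = 1/n_steps
    rng = random.Random(seed)
    dt = 1.0 / n_steps
    ends = []
    for _ in range(n_paths):
        s = sum(1 if rng.random() > 0.5 else -1 for _ in range(n_steps))
        ends.append(sqrt(dt) * s)
    return ends

ends = walk_endpoints(n_steps=100, n_paths=50_000)
sample_var = sum(e * e for e in ends) / len(ends)  # close to 1
```

By the Central Limit Theorem, the endpoints are approximately N(0, 1), matching the B.M. value W(1).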