
BROWNIAN MOTION

1. Expectations and Covariances of Random Vectors

A random vector, or more precisely, a random n-vector, is a column vector X in R^n whose coordinates are jointly defined random variables:

  X = (X_1, X_2, ..., X_n)^T.

Detailed probability work with random vectors requires knowledge of the joint distribution of the coordinates X_1, ..., X_n, often based on a joint probability density function. But here we will not worry about the general theory, and will only concern ourselves with the joint distribution in one special case, namely Gaussian random vectors, to be introduced later. For now, we will focus only on expected values and covariances for random vectors.

1.1. Mean vectors. We take the expected value of a random vector X in the following natural way:

  E(X) = (E(X_1), E(X_2), ..., E(X_n))^T = (µ_1, µ_2, ..., µ_n)^T = µ,

where µ_i = E(X_i) for each i. We call the vector µ the mean vector of X. Note that the mean vector is simply a vector of numbers. These numbers are the means (expected values) of the random variables X_i.

Exercise 1. Let X be a random 2-vector. Calculate its mean vector if X is a point randomly selected from within
(i) the unit square (that is, the square with vertices (0,0), (1,0), (0,1), (1,1));
(ii) the unit circle centered at the origin;
(iii) the triangle with vertices (0,0), (1,0), (0,1);
(iv) the region above the parabola y = x^2 and inside the unit square.
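If you want to check your answers to Exercise 1 on a computer, here is a minimal sketch (assuming NumPy is available; it is an illustration, not part of the exercise) that estimates the mean vectors for parts (i) and (iii) by simulation.

    import numpy as np

    rng = np.random.default_rng(0)
    N = 200_000
    P = rng.random((N, 2))                    # uniform points in the unit square

    # (i) the unit square: every sampled point is accepted
    print("square  :", P.mean(axis=0))        # should be close to (0.5, 0.5)

    # (iii) the triangle with vertices (0,0), (1,0), (0,1): keep points with x + y <= 1
    T = P[P[:, 0] + P[:, 1] <= 1]
    print("triangle:", T.mean(axis=0))        # should be close to (1/3, 1/3)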

Not surprisingly, linear algebra plays an important role with random vectors. For example, it will be common to multiply a random n-vector X by a non-random m x n matrix A, giving us the random m-vector AX. We can also multiply random vectors by scalars, and add random vectors to other vectors (random or non-random). Suppose X and Y are random vectors, and let A and B be non-random matrices. Then it is not hard to prove that

(1)    E(AX + BY) = A E(X) + B E(Y),

assuming that the sizes of the vectors X, Y and the matrices A, B are chosen so that the matrix products AX and BY both make sense, and also so that the vector addition AX + BY makes sense.

An important special case of (1) is the following: suppose X is a random n-vector, A is a non-random m x n matrix, and b is a non-random m-vector (column). Then

(2)    E(AX + b) = Aµ + b,

where µ = E(X).

Exercise 2. Explain why Equation (2) is a special case of Equation (1).

Exercise 3. Prove Equation (2) in the case that X is a random 2-vector, A is a non-random 2 x 2 matrix, and b is a non-random 2-vector.

Similarly, we can take transposes of our random vectors to get row vectors, and multiply by matrices on the right:

(3)    E(X^T A + Y^T B) = E(X)^T A + E(Y)^T B,

where we again assume that the sizes of the vectors and matrices are appropriate. This means, for example, that if X is a random n-vector, then the matrix A must be n x m for some m, rather than m x n. Here is the special case corresponding to our previous Equation (2):

(4)    E(X^T A + b^T) = µ^T A + b^T,

where X is a random n-vector with mean vector µ, A is a non-random n x m matrix, and b is a non-random m-vector (column).
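Equation (2) is easy to test numerically. The sketch below (illustrative only; the distribution of X and the particular A and b are arbitrary choices) compares the sample mean of AX + b with Aµ + b.

    import numpy as np

    rng = np.random.default_rng(1)
    N = 100_000
    X = rng.exponential(scale=[1.0, 2.0], size=(N, 2))   # rows are samples of X, so mu = (1, 2)
    A = np.array([[1.0, 2.0],
                  [0.0, -1.0]])
    b = np.array([3.0, 4.0])

    lhs = (X @ A.T + b).mean(axis=0)          # sample mean of A X + b
    rhs = A @ np.array([1.0, 2.0]) + b        # A mu + b, using the true mean vector
    print(lhs, rhs)                           # the two vectors agree up to Monte Carlo error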

1.2. Covariance matrices. We are also interested in defining something that would correspond to the variance of the random vector X. This will be a matrix, consisting of all the possible covariances of pairs of the random variables X_1, X_2, ..., X_n. Because it contains covariances, we call it the covariance matrix of X:

  Cov(X) = ( Cov(X_1,X_1)  Cov(X_1,X_2)  ...  Cov(X_1,X_n) )
           ( Cov(X_2,X_1)  Cov(X_2,X_2)  ...  Cov(X_2,X_n) )
           (     ...            ...      ...       ...      )
           ( Cov(X_n,X_1)  Cov(X_n,X_2)  ...  Cov(X_n,X_n) )

Remember that for any random variable Y, Cov(Y, Y) = Var(Y), so the diagonal elements in the covariance matrix are actually variances, and we could write the covariance matrix this way:

  Cov(X) = ( Var(X_1)      Cov(X_1,X_2)  ...  Cov(X_1,X_n) )
           ( Cov(X_2,X_1)  Var(X_2)      ...  Cov(X_2,X_n) )
           (     ...            ...      ...       ...      )
           ( Cov(X_n,X_1)  Cov(X_n,X_2)  ...  Var(X_n)      )

Of course, in all of these definitions and formulas involving mean vectors and covariance matrices, we need to assume that the means and covariances of the underlying random variables X_1, X_2, ..., X_n exist. Since we will be primarily dealing with Gaussian random variables, this assumption will typically be satisfied.

Exercise 4. Calculate the covariance matrix of the random vector X for each of the parts of Exercise 1.

In connection with the covariance matrix, it will be useful for us to take the expected value of a random matrix. As you might expect, this gives us the matrix whose elements are the expected values of the random variables that are the corresponding elements of the random matrix. Thus, if M is a random m x n matrix whose elements are random variables X_ij, then

  E(M) = ( E(X_11)  E(X_12)  ...  E(X_1n) )
         ( E(X_21)  E(X_22)  ...  E(X_2n) )
         (    ...      ...   ...     ...   )
         ( E(X_m1)  E(X_m2)  ...  E(X_mn) )

It should be easy for you to see that if M and N are random m x n matrices and if a and b are real numbers, then

(5)    E(aM + bN) = a E(M) + b E(N).
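As an aside, the covariance matrix can be estimated from simulated samples entry by entry, directly from the definition, and compared with a library routine. This is only a sketch (assuming NumPy; the distribution of X is an arbitrary choice).

    import numpy as np

    rng = np.random.default_rng(2)
    N = 100_000
    L = np.array([[1.0, 0.0, 0.0],
                  [0.5, 1.0, 0.0],
                  [0.2, 0.3, 1.0]])
    X = rng.standard_normal((N, 3)) @ L.T     # a random 3-vector with correlated coordinates

    mu = X.mean(axis=0)
    C = np.empty((3, 3))
    for i in range(3):
        for j in range(3):
            # Cov(X_i, X_j) estimated as the average of (X_i - mu_i)(X_j - mu_j)
            C[i, j] = np.mean((X[:, i] - mu[i]) * (X[:, j] - mu[j]))

    print(np.round(C, 3))
    print(np.round(np.cov(X, rowvar=False), 3))   # library estimate, nearly identical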

Now we have a fancy and useful way to write the covariance matrix:

(6)    Cov(X) = E((X - µ)(X - µ)^T) = E(X X^T) - µ µ^T.

Can you see how this expression makes sense? In order to do so, you must recall that if v is a column vector in R^n, then the product v v^T is an n x n matrix. Thus, (X - µ)(X - µ)^T is a random n x n matrix:

  (X - µ)(X - µ)^T = ( (X_1 - µ_1)^2            (X_1 - µ_1)(X_2 - µ_2)  ...  (X_1 - µ_1)(X_n - µ_n) )
                     ( (X_2 - µ_2)(X_1 - µ_1)   (X_2 - µ_2)^2           ...  (X_2 - µ_2)(X_n - µ_n) )
                     (          ...                      ...            ...           ...            )
                     ( (X_n - µ_n)(X_1 - µ_1)   (X_n - µ_n)(X_2 - µ_2)  ...  (X_n - µ_n)^2           )

Now you should be able to see that if you take the expected value of this random matrix, you get the covariance matrix of X.

Exercise 5. Prove the second equality in (6). (Use (5).)

Now we are equipped to get a formula for Cov(AX + b), where X is a random n-vector, A is a non-random m x n matrix, and b is a non-random m-vector. Note that since the vector b won't affect any of the covariances, we have Cov(AX + b) = Cov(AX), so we can ignore b. But we obviously can't ignore A! Let's use our fancy formula (6) for the covariance matrix, keeping in mind that E(AX) = Aµ, where µ is the mean vector of X:

  Cov(AX) = E((AX - Aµ)(AX - Aµ)^T).

Now we use rules of matrix multiplication from linear algebra:

  (AX - Aµ)(AX - Aµ)^T = A(X - µ)(X - µ)^T A^T.

Take the expected value, and then use (2) and (4) to pull A and A^T outside of the expected value, giving us

(7)    Cov(AX + b) = A Cov(X) A^T.

This is an important formula that will serve us well when we work with Gaussian random vectors.
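Formula (7) can also be checked by simulation. A small sketch (assuming NumPy; the covariance matrix, A, and b below are arbitrary example values):

    import numpy as np

    rng = np.random.default_rng(3)
    N = 200_000
    X = rng.multivariate_normal(mean=[0.0, 1.0, 2.0],
                                cov=[[2.0, 1.0, 0.0],
                                     [1.0, 2.0, 1.0],
                                     [0.0, 1.0, 2.0]], size=N)
    A = np.array([[1.0, -1.0, 0.0],
                  [2.0,  0.0, 1.0]])
    b = np.array([5.0, -3.0])

    Y = X @ A.T + b
    lhs = np.cov(Y, rowvar=False)                  # sample covariance matrix of A X + b
    rhs = A @ np.cov(X, rowvar=False) @ A.T        # A Cov(X) A^T
    print(np.round(lhs, 2))
    print(np.round(rhs, 2))                        # the two matrices agree up to sampling error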

1.3. Change of coordinates. Note that Cov(X) is a symmetric matrix. In fact, this matrix is nonnegative definite. How do we know this? From the Spectral Theorem, we know there exists an orthogonal matrix O such that O^T Cov(X) O is a diagonal matrix D, with the eigenvalues of Cov(X) being the diagonal elements, and the columns of O being the corresponding eigenvectors, normalized to have length 1. But by (7), this diagonal matrix D is the covariance matrix of the random vector O^T X, and the diagonal entries of any covariance matrix are variances, hence nonnegative. This tells us that Cov(X) has nonnegative eigenvalues, making it nonnegative definite.

Let √D stand for the diagonal matrix whose diagonal elements are the nonnegative square roots of the diagonal elements of D. Then it is easy to see that

  Cov(X) = O √D √D^T O^T = Σ Σ^T,  where Σ = O √D.

Even though the matrix Σ is not uniquely determined, it seems OK to call it the standard deviation matrix of X. So we have D = O^T Cov(X) O and Σ = O √D, where

  D = ( λ_1  0    ...  0   )        O = ( v_1  v_2  ...  v_n )
      ( 0    λ_2  ...  0   )
      ( ...  ...  ...  ... )
      ( 0    0    ...  λ_n )

and λ_1, λ_2, ..., λ_n are the eigenvalues of Cov(X), with corresponding eigenvectors v_1, v_2, ..., v_n. We will always assume that we have put the eigenvalues into decreasing order, so that λ_1 ≥ λ_2 ≥ ... ≥ λ_n ≥ 0.

Now suppose that X is a random vector with covariance matrix ΣΣ^T, where Σ = O √D and O and D are as in the previous paragraph. Then clearly Y = O^T X is a random vector with covariance matrix D. This means that by simply performing a rotation in R^n, we have transformed X into a random vector whose coordinates are uncorrelated. If the coordinates of X are uncorrelated, then we say that X itself is uncorrelated. Informally, we have the following statement: Every random vector is uncorrelated if you look at it from the right direction! Mathematically, this statement can be written as

  Cov(O^T X) = O^T Σ Σ^T O = D.
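In code, the decomposition above is just a symmetric eigenvalue problem. A sketch (assuming NumPy; note that numpy.linalg.eigh returns the eigenvalues in increasing order, so they are reversed here to match the convention λ_1 ≥ λ_2 ≥ ... ≥ λ_n):

    import numpy as np

    Gamma = np.array([[2.0, 1.0, 0.0],
                      [1.0, 2.0, 1.0],
                      [0.0, 1.0, 2.0]])        # a symmetric, nonnegative definite example

    lam, O = np.linalg.eigh(Gamma)             # Gamma = O diag(lam) O^T
    lam, O = lam[::-1], O[:, ::-1]             # put the eigenvalues in decreasing order

    D = np.diag(lam)
    Sigma = O @ np.sqrt(D)                     # one possible standard deviation matrix

    print(np.round(O.T @ Gamma @ O, 6))        # the diagonal matrix D
    print(np.round(Sigma @ Sigma.T - Gamma, 6))    # the zero matrix: Gamma = Sigma Sigma^T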

Exercise 6. For each part of Exercise 1, determine the change of coordinates (rotation) that changes X into an uncorrelated random vector.

1.4. Standardization. We can go a little bit further. As stated earlier, we select the matrix O so that the diagonal elements of D, which are the eigenvalues λ_1, λ_2, ..., λ_n, are in decreasing order. This means that the nonzero eigenvalues come before the eigenvalues that equal 0. Let m ≤ n be the number of nonzero eigenvalues. This is the same as saying that m is the rank of the covariance matrix. Then for i > m, the coordinates Y_i of the vector Y = O^T(X - µ) (note that we centered X by subtracting its mean vector) have expected value 0 and variance 0, implying that Y_i = 0 for i > m. Let

  Z = ( Y_1/√λ_1, Y_2/√λ_2, ..., Y_m/√λ_m )^T.

This is the same as letting Z = BY, where B is the m x n matrix

  B = ( 1/√λ_1  0       ...  0        0  ...  0 )
      ( 0       1/√λ_2  ...  0        0  ...  0 )
      ( ...     ...     ...  ...      ...     ... )
      ( 0       0       ...  1/√λ_m   0  ...  0 )

Then Z is a random m-vector with mean vector 0 and covariance matrix equal to the m x m identity matrix I_m. We say that X has been centered and scaled, or in other words, standardized. Informally speaking, we center X by subtracting its mean vector, and we scale it by dividing by its standard deviation matrix (after eliminating the extra dimensions that correspond to 0 eigenvalues). To summarize, the standardized version of X is Z, where

(8)    Z = B O^T (X - µ).

In many cases, all of the diagonal elements of D are nonzero, and B = √D^{-1}. This last condition is equivalent to saying that Σ is invertible, and then we have Σ^{-1} = √D^{-1} O^T.
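Standardization is equally mechanical. The following sketch (assuming NumPy; here every eigenvalue is nonzero, so m = n and B is square) checks that Z = B O^T (X - µ) has mean vector close to 0 and covariance matrix close to the identity.

    import numpy as np

    rng = np.random.default_rng(4)
    N = 200_000
    mu = np.array([1.0, -2.0, 0.5])
    Gamma = np.array([[2.0, 1.0, 0.0],
                      [1.0, 2.0, 1.0],
                      [0.0, 1.0, 2.0]])
    X = rng.multivariate_normal(mu, Gamma, size=N)

    lam, O = np.linalg.eigh(Gamma)
    lam, O = lam[::-1], O[:, ::-1]                 # decreasing eigenvalues, as above
    B = np.diag(1.0 / np.sqrt(lam))                # the matrix B from the text (full rank case)
    Z = (X - mu) @ O @ B.T                         # rows are samples of B O^T (X - mu)

    print(np.round(Z.mean(axis=0), 3))             # approximately the zero vector
    print(np.round(np.cov(Z, rowvar=False), 3))    # approximately the identity matrix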

Theorem 1. Let X be a random n-vector with mean vector µ and standard deviation matrix Σ. If Σ has rank n (or equivalently, is invertible), then the standardized random vector

(9)    Z = Σ^{-1}(X - µ)

is a random n-vector with mean vector 0 and covariance matrix I_n, where I_n is the n x n identity matrix. If Σ has rank m < n, then there is an m-dimensional subspace V of R^n such that P((X - µ) ∈ V) = 1, and the random m-vector Z defined in (8) has mean vector 0 and covariance matrix I_m.

Note that if the covariance matrix of a random vector is the identity matrix, then its standard deviation matrix is also the identity matrix.

Exercise 7. Standardize X for each part of Exercise 1.

Exercise 8. Let V_1 and V_2 be the outcomes (0 or 1) of independent tosses of a fair coin. Let

  X = ( V_1 + V_2, V_1, V_1 - V_2 )^T.

Calculate the mean vector and covariance matrix of X, and then standardize X.

2. Gaussian Random Vectors

Let X be a random n-vector. We say that X is a Gaussian random n-vector if for all non-random n-vectors a, the random variable a · X is normally distributed. Note that

  a · X = a_1 X_1 + a_2 X_2 + ... + a_n X_n,

so this condition simply says that X is a Gaussian random vector if any linear combination of its coordinates is normally distributed. In particular, the individual coordinates X_1, X_2, ..., X_n must all be normally distributed, but that alone is not enough for X to be Gaussian.
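To see that normal coordinates alone are not enough, here is a standard counterexample in simulation form (a sketch, not taken from these notes): X_1 is standard normal and X_2 = S X_1, where S is an independent fair coin taking the values 1 and -1. Each coordinate is standard normal, but the linear combination X_1 + X_2 has a point mass at 0, so it is not normally distributed and the vector (X_1, X_2)^T is not Gaussian.

    import numpy as np

    rng = np.random.default_rng(5)
    N = 100_000
    X1 = rng.standard_normal(N)
    S = rng.choice([-1.0, 1.0], size=N)       # independent random signs
    X2 = S * X1                               # X2 is also standard normal

    combo = X1 + X2                           # a linear combination of the coordinates
    print(np.mean(combo == 0.0))              # about 0.5: a point mass at 0, so not normal
    print(round(np.std(X2), 3))               # about 1.0: the coordinate X2 alone looks normal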

Here is how you get Gaussian random vectors. Start with the random n-vector Z whose coordinates are independent random variables, all of which have the standard normal distribution. Then it is easy to see that Z has mean vector 0 and covariance matrix I_n. We know from the text that any sum of independent normal random variables is itself normal, so Z satisfies the definition of a Gaussian random vector. This is the standard normal (or Gaussian) random n-vector.

Let Σ be a non-random m x n matrix and let µ be a non-random m-vector. Then the random m-vector X = ΣZ + µ has mean vector µ and covariance matrix ΣΣ^T. Let a be a non-random m-vector. Then

  a · X = (aΣ) · Z + a · µ.

Since aΣ is a non-random n-vector, we know that (aΣ) · Z must be normally distributed. And since a · µ is just a constant, it follows that a · X is normally distributed. Therefore, by definition, X is a Gaussian random m-vector. This procedure shows us how we can get a Gaussian random vector with any desired mean vector and covariance matrix.

The preceding argument also works to prove the following fact: if X is a Gaussian random n-vector with mean vector µ and covariance matrix Γ, and if A is a non-random m x n matrix and b is a non-random m-vector, then the random m-vector Y = AX + b is a Gaussian random vector with mean vector Aµ + b and covariance matrix AΓA^T.

To summarize:

(i) If Z is a random n-vector with independent coordinates that have the standard normal distribution, then Z is a Gaussian random n-vector, called the standard Gaussian random n-vector.

(ii) If Z is a standard Gaussian random n-vector, Σ a non-random m x n matrix and µ a non-random m-vector, then X = ΣZ + µ is a Gaussian random m-vector with mean vector µ and covariance matrix ΣΣ^T.

(iii) If X is any Gaussian random n-vector, then for any non-random m x n matrix A and non-random m-vector b, the random m-vector AX + b is a Gaussian random m-vector.

(iv) Theorem 1 shows how to use matrix multiplication and vector addition to turn any Gaussian random vector into a standard Gaussian random vector.

It turns out that this is all you need to know about Gaussian random vectors, because the distribution of a Gaussian random vector is uniquely determined by its mean vector and covariance matrix. This fact requires advanced techniques for its proof, so we won't give the proof here. But we state it as a theorem:

Theorem 2. The distribution of a Gaussian random n-vector X is uniquely determined by its mean vector µ and covariance matrix Γ. If Γ is invertible, the joint probability density function of X is

  f(x_1, x_2, ..., x_n) = f(x) = (1 / √((2π)^n det(Γ))) exp( -(1/2) (x - µ)^T Γ^{-1} (x - µ) ).

If Γ is a diagonal matrix, then the coordinates of X are independent, and for each i, the i-th coordinate X_i has the distribution N(µ_i, Γ_ii).

If we combine Theorems 1 and 2, we see that: Every Gaussian random vector has independent coordinates if you look at it from the right direction!

Exercise 9. If Γ is a diagonal matrix with strictly positive diagonal elements, show that the joint density function given in Theorem 2 is indeed the joint density function of n independent random variables X_1, X_2, ..., X_n such that for each i, X_i has the distribution N(µ_i, Γ_ii).

Exercise 10. (i) Which of the following two matrices is a covariance matrix?

  1 0 1 0 1 1 0 1 0 1

(ii) Using three independent standard normal random variables Z_1, Z_2, Z_3, create a random vector X whose mean vector is (-1, 1, 0)^T and whose covariance matrix is the one from (i).
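Returning to item (ii) of the summary, the recipe "multiply a standard Gaussian vector by Σ and add µ" is easy to carry out numerically. A sketch (assuming NumPy; the mean vector and covariance matrix below are arbitrary example values, not the ones in Exercise 10):

    import numpy as np

    rng = np.random.default_rng(6)
    N = 200_000
    mu = np.array([2.0, -1.0, 0.5])
    Gamma = np.array([[4.0, 2.0, 0.0],
                      [2.0, 3.0, 1.0],
                      [0.0, 1.0, 2.0]])        # the target covariance matrix

    lam, O = np.linalg.eigh(Gamma)
    Sigma = O @ np.diag(np.sqrt(lam)) @ O.T    # a standard deviation matrix: Sigma Sigma^T = Gamma

    Z = rng.standard_normal((N, 3))            # rows are standard Gaussian 3-vectors
    X = Z @ Sigma.T + mu                       # rows are samples of Sigma Z + mu

    print(np.round(X.mean(axis=0), 3))             # approximately mu
    print(np.round(np.cov(X, rowvar=False), 3))    # approximately Gamma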

Example 1. Let

  Γ = (1/3) ( 11  -8   8 )
            ( -8  11  -8 )
            (  8  -8  11 )

It turns out that the eigenvalues of Γ are λ_1 = 9, λ_2 = λ_3 = 1, with corresponding orthogonal eigenvectors

  v_1 = (1, -1, 1)^T/√3,   v_2 = (0, 1, 1)^T/√2,   v_3 = (-2, -1, 1)^T/√6.

Let O be the orthogonal matrix whose columns are these eigenvectors, and let D be the diagonal matrix whose diagonal elements are the eigenvalues 9, 1, 1. Then √D is the diagonal matrix whose diagonal elements are 3, 1, 1. You should check that all of these statements are correct, or at least check some of them and know how to check the rest. Since Γ is positive definite, we can think of Γ as a covariance matrix, with corresponding standard deviation matrix

  Σ = O √D = (  √3   0     -2/√6 )
             ( -√3   1/√2  -1/√6 )
             (  √3   1/√2   1/√6 )

You should check that Γ = ΣΣ^T. Now, if X is a random 3-vector with covariance matrix Γ and mean vector, say,

  µ = (1, 2, 0)^T,

then Y = O^T(X - µ) is uncorrelated, with covariance matrix D and mean vector 0:

  Y = O^T(X - µ) = O^T (X_1 - 1, X_2 - 2, X_3)^T.

Doing the matrix multiplication gives

  Y = ( (X_1 - X_2 + X_3 + 1)/√3,  (X_2 + X_3 - 2)/√2,  (-2X_1 - X_2 + X_3 + 4)/√6 )^T.

You should be able to check directly that this is correct, and also that this random vector is uncorrelated and has mean vector 0. In order to fully standardize the random vector X, we need to multiply Y by √D^{-1} to get

  Z = √D^{-1} Y = ( (X_1 - X_2 + X_3 + 1)/(3√3),  (X_2 + X_3 - 2)/√2,  (-2X_1 - X_2 + X_3 + 4)/√6 )^T.

This random vector Z has mean vector 0 (that's easy to check) and covariance matrix I_3. Let's do a little checking of the covariance matrix. According to the rules we learned about covariances of ordinary random variables,

  Var(Z_1) = (1/27) [ Var(X_1) + Var(X_2) + Var(X_3) - 2 Cov(X_1, X_2) + 2 Cov(X_1, X_3) - 2 Cov(X_2, X_3) ].

We can obtain these variances and covariances from the covariance matrix Γ, so

  Var(Z_1) = (1/27) [ 11/3 + 11/3 + 11/3 + 16/3 + 16/3 + 16/3 ] = 1.

That's what we expected. Let's try one of the covariances:

  Cov(Z_2, Z_3) = (1/(2√3)) [ -2 Cov(X_2, X_1) - Var(X_2) + Cov(X_2, X_3) - 2 Cov(X_3, X_1) - Cov(X_3, X_2) + Var(X_3) ].

Filling in the values of the variances and covariances from Γ, you can easily check that this equals 0, as expected.

Let's do one more thing. Suppose we started with a standard Gaussian random 3-vector Z. How can we use it to create a Gaussian random 3-vector X with covariance matrix Γ and mean vector µ? We simply multiply Z by the matrix Σ and then add µ:

  X = ΣZ + µ = (  √3   0     -2/√6 ) ( Z_1 )   ( 1 )
               ( -√3   1/√2  -1/√6 ) ( Z_2 ) + ( 2 )
               (  √3   1/√2   1/√6 ) ( Z_3 )   ( 0 )

You can carry out the indicated operations to find that, for example, X_1 = √3 Z_1 - 2Z_3/√6 + 1.

3. Random Walk

Let X_1, X_2, ... be independent random variables, all having the same distribution, each having mean µ and variance σ^2. For n = 1, 2, 3, ..., define

  W^(1)_n = X_1 + X_2 + ... + X_n.

Let W^(1) be the following random vector (with infinitely many coordinates):

  W^(1) = (W^(1)_1, W^(1)_2, W^(1)_3, ...).

This is the random walk with steps X_1, X_2, X_3, ... (The reason for the superscript (1) will become clear later.) We often think of the subscript n as time, so that W^(1)_n is the position of the random walk W^(1) at time n. That is, we think of W^(1) as a (random) dynamical system in discrete time, and at each time n, the system takes the step X_n to move from its previous position W^(1)_{n-1} to its current position W^(1)_n. In those cases where it is desirable to talk about time 0, we define W^(1)_0 = 0. But we typically don't include this value as one of the coordinates of W^(1).
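Here is a quick simulation of such a random walk (a sketch assuming NumPy; normally distributed steps are just one convenient choice). The printed estimates can be compared with the general formulas derived next.

    import numpy as np

    rng = np.random.default_rng(7)
    mu, sigma = 0.1, 1.0
    n_steps, n_paths = 20, 100_000

    steps = rng.normal(mu, sigma, size=(n_paths, n_steps))   # X_1, ..., X_20 for each path
    W = np.cumsum(steps, axis=1)                             # W_n = X_1 + ... + X_n

    print(round(W[:, 4].mean(), 3), round(W[:, 4].var(), 3))   # sample mean and variance of W_5
    print(round(np.cov(W[:, 1], W[:, 4])[0, 1], 3))            # sample Cov(W_2, W_5)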

It is easy to calculate the mean vector µ and covariance matrix Γ of W^(1):

  µ = ( µ, 2µ, 3µ, ... )^T    and    Γ = ( σ^2   σ^2   σ^2   ... )
                                         ( σ^2   2σ^2  2σ^2  ... )
                                         ( σ^2   2σ^2  3σ^2  ... )
                                         ( ...   ...   ...   ... )

Thus, µ_n = nµ and Γ_ij = min(i, j) σ^2.

Exercise 11. Verify these formulas for the mean vector and covariance matrix of W^(1).

Exercise 12. The simple symmetric random walk has steps X_i such that P(X_i = 1) = P(X_i = -1) = 1/2. Find the mean vector and covariance matrix of this random walk.

In the next section we will want to make the transition from random walks in discrete time to Brownian motion in continuous time. The key to this transition is to chop discrete time up into smaller and smaller pieces. Fix a value Δ > 0. For n = 1, 2, 3, ... and t = nΔ, define

  W^(Δ)_t = Y_1 + Y_2 + ... + Y_n,

where for k = 1, 2, 3, ...,

  Y_k = √Δ (X_k - µ) + Δµ.

Note that the random walk W^(1) defined earlier agrees with the case Δ = 1. We think of the random variables Y_k as the steps taken by a random walk at times t that are multiples of Δ. Since the random variables X_k are assumed to be independent, and have the same distribution, with mean µ and variance σ^2, the random variables Y_k are also independent, and they have the same distribution as each other, but their common mean is Δµ and their common variance is Δσ^2. Thus, if t = nΔ, the random variable W^(Δ)_t has mean tµ and variance tσ^2. And if s = mΔ, then Cov(W^(Δ)_s, W^(Δ)_t) = min(s, t) σ^2.

Thus, for any Δ > 0, we have created a random walk W^(Δ) that takes steps at times that are multiples of Δ. The mean vector µ of W^(Δ) has coordinates µ_t = tµ, and its covariance matrix Γ has elements Γ_{s,t} = min(s, t) σ^2, for all times s, t that are multiples of Δ. Note that these do not depend on Δ, so we might expect something good to happen as we try to make the transition to continuous time by letting Δ → 0.

In the next section, we will see exactly what it is that does happen.

Exercise 13. The following numbers are the results of 32 rolls of a standard 6-sided die:

  1, 5,, 4,,, 1, 1,, 1,, 4,, 5, 4, 5, 5, 1, 5,, 5,,, 1, 1, 1, 5,, 1,,, 1

Let X_1, X_2, ..., X_32 be these numbers. For Δ = 1, 1/2, 1/4, 1/8, draw graphs of W^(Δ) for appropriate values of t between 0 and 4.

4. The Central Limit Theorem for Random Walks

We start by considering a special collection of random walks, the Gaussian random walks. These are defined as in the previous section, with the additional assumption that the steps X_i are normally distributed, so that for each Δ, W^(Δ) is a Gaussian random vector (make sure you understand why this is true). If each step has mean µ and variance σ^2, then W^(Δ) is the Gaussian random walk with drift µ, diffusion coefficient σ and time step size Δ. If the drift is 0 and the diffusion coefficient is 1, then the steps of W^(Δ) all have mean 0 and variance Δ, and then W^(Δ) is called the standard Gaussian random walk with time step size Δ.

Note that for a fixed time step size Δ, if W^(Δ) is a Gaussian random walk with drift µ and diffusion coefficient σ, then the random walk whose position at time t is

  (1/σ)(W^(Δ)_t - µt)

is the standard Gaussian random walk with time step size Δ (for times t that are multiples of Δ). On the other hand, if W^(Δ) is a standard Gaussian random walk with time step size Δ, then we get a Gaussian random walk with drift µ and diffusion coefficient σ by going the other way. That is, the random walk whose position at time t is

  σ W^(Δ)_t + µt

is a Gaussian random walk with drift µ and diffusion coefficient σ. So just as we can easily convert back and forth between a standard normally distributed random variable and a normally distributed random variable with mean µ and variance σ^2, we can also easily convert back and forth between a standard Gaussian random walk and a Gaussian random walk with drift µ and diffusion coefficient σ.
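In code this conversion is a single line. A sketch (assuming NumPy; the step size, drift, and diffusion coefficient are arbitrary):

    import numpy as np

    rng = np.random.default_rng(8)
    delta = 0.01                                        # time step size
    T = 4.0
    n = int(T / delta)
    t = delta * np.arange(1, n + 1)                     # the times delta, 2*delta, ..., T

    steps = rng.normal(0.0, np.sqrt(delta), size=n)     # steps with mean 0 and variance delta
    W_std = np.cumsum(steps)                            # a standard Gaussian random walk
    mu, sigma = 0.2, 0.5
    W = sigma * W_std + mu * t                          # drift mu, diffusion coefficient sigma

    print(round(W[-1], 3))                              # the position at time T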

What do we know about Gaussian random walks? In the following, the times (such as s and t) are all multiples of Δ. If W^(Δ) is a Gaussian random walk with drift µ, diffusion coefficient σ and time step size Δ, then:

(i) For each t > s ≥ 0, the increment W^(Δ)_t - W^(Δ)_s has the distribution N(µ(t - s), σ^2(t - s)).

(ii) If (s_1, t_1), (s_2, t_2), ..., (s_n, t_n) are disjoint open intervals in (0, ∞), then the increments W^(Δ)_{t_1} - W^(Δ)_{s_1}, W^(Δ)_{t_2} - W^(Δ)_{s_2}, ..., W^(Δ)_{t_n} - W^(Δ)_{s_n} are independent.

Since the distribution of an increment of a Gaussian random walk depends only on the time difference t - s, we say that the increments are stationary. We summarize the two properties together by saying that a Gaussian random walk has stationary, independent increments. Note that the distributions of the increments do not depend on Δ, except for the assumption that the times involved must be multiples of Δ.

It seems reasonable to guess that if we let Δ → 0, we will get a dynamical system W = {W_t, t ≥ 0} that is defined for all nonnegative times t, and this stochastic process will have stationary, independent, normally distributed increments. Furthermore, it seems reasonable to guess that W_t, the position at time t, depends continuously on t. That is, there is a stochastic process W with the following properties:

(i) The position W_t depends continuously on t for t ≥ 0.

(ii) For each t > s ≥ 0, the increment W_t - W_s has the distribution N(µ(t - s), σ^2(t - s)).

(iii) If (s_1, t_1), (s_2, t_2), ..., (s_n, t_n) are disjoint open intervals in (0, ∞), then the increments W_{t_1} - W_{s_1}, W_{t_2} - W_{s_2}, ..., W_{t_n} - W_{s_n} are independent.

Einstein first talked about this in a meaningful mathematical way, and Norbert Wiener was the first to establish the existence of W with some rigor. The stochastic process W is called Brownian motion with drift µ and diffusion coefficient σ. When µ = 0 and σ = 1, it is called standard Brownian motion, or the Wiener process.

Here is a summary of what we have so far: if we fix the drift µ and the diffusion coefficient σ, then as the time step size Δ goes to 0, the corresponding sequence of Gaussian random walks converges (in a rigorous mathematical sense that will not be made explicit here) to Brownian motion with drift µ and diffusion coefficient σ.
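Properties (ii) and (iii) are easy to check on simulated paths built from Gaussian steps with a small time step. A sketch (assuming NumPy; the time grid and the intervals (1, 2) and (3, 5) are arbitrary choices, with µ = 0 and σ = 1):

    import numpy as np

    rng = np.random.default_rng(9)
    delta, T, n_paths = 0.01, 5.0, 20_000
    n = int(round(T / delta))

    steps = rng.normal(0.0, np.sqrt(delta), size=(n_paths, n))
    W = np.cumsum(steps, axis=1)                        # approximate standard Brownian paths
    idx = lambda s: int(round(s / delta)) - 1           # column index of time s on the grid

    inc1 = W[:, idx(2.0)] - W[:, idx(1.0)]              # the increment W_2 - W_1
    inc2 = W[:, idx(5.0)] - W[:, idx(3.0)]              # the increment W_5 - W_3 (disjoint interval)

    print(round(inc1.var(), 3), round(inc2.var(), 3))   # about 1 and 2, as N(0, t - s) predicts
    print(round(np.corrcoef(inc1, inc2)[0, 1], 3))      # about 0, reflecting independent increments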

Here is a wonderful fact: the statement in the previous paragraph remains true even if we drop the Gaussian assumption. It doesn't matter what the original distribution of the steps X_1, X_2, X_3, ... is; when Δ goes to 0, we get Brownian motion. This is the Central Limit Theorem for random walks.

Why is this result true? The first point is that for any random walk, no matter what the time step size Δ equals, the increments of the random walk W^(Δ) for non-overlapping time intervals are independent, and we know that the means, variances, and covariances of increments do not depend on Δ. The second point is that for any fixed t > s ≥ 0, the ordinary Central Limit Theorem tells us that the increment W^(Δ)_t - W^(Δ)_s is approximately normal when Δ is small, because this increment is the sum of a lot of independent random variables that all have the same distribution. This means that for small enough Δ > 0, it is hard to distinguish a non-Gaussian random walk from a Gaussian random walk. In the limit as Δ → 0, non-Gaussian and Gaussian random walks become indistinguishable.

One final comment. For any Δ > 0, the relationship between W^(1) and W^(Δ) is merely a change of coordinates. If µ = 0, then the change of coordinates is particularly simple: speed up time by the factor 1/Δ and scale space by √Δ:

  W^(Δ)_t = √Δ · W^(1)_{t/Δ}.

This means that if you start with a graph of the original random walk W^(1) and do the proper rescaling, you can get a graph of W^(Δ) for really small Δ, and this graph will look a lot like a graph of Brownian motion! This works whether your random walk is the simple symmetric random walk, with P(X_1 = 1) = P(X_1 = -1) = 1/2, or a standard Gaussian random walk.

5. Geometric Brownian Motion

Let's recall a few facts about the binomial tree model. We assume there is a risk-free interest rate r, and we will let Δ > 0 be the length of a time step. When the stock price goes up, it is multiplied by a factor u, and when it goes down, it is multiplied by a factor d. There is a probability p of going up at each time step that comes from the No Arbitrage Theorem, given by the formula:

  p = (e^{rΔ} - d) / (u - d).

We will assume that

  u = e^{σ√Δ}    and    d = 1/u = e^{-σ√Δ}.

The idea is that at each time step, the stock price is multiplied by a random quantity e^R, which equals u with probability p and d with probability 1 - p. Therefore, R = σ√Δ with probability p and R = -σ√Δ with probability 1 - p. This makes R like the step of a random walk. In this setting, σ is called the volatility.

Let's calculate the approximate mean and variance of R. In order to do this, we will repeatedly use the approximation e^x ≈ 1 + x when x is small. Then we get e^{rΔ} ≈ 1 + rΔ, u ≈ 1 + σ√Δ and d ≈ 1 - σ√Δ, so

  p ≈ (rΔ + σ√Δ) / (2σ√Δ) = 1/2 + (r√Δ) / (2σ).

We know that E(R) = σ√Δ (2p - 1), so we get E(R) ≈ rΔ. We also know that Var(R) = 4σ^2 Δ p(1 - p) ≈ σ^2 Δ.

Now we look at the geometric random walk, which we will call S^(Δ):

  S^(Δ)_t = S_0 exp( R_1 + R_2 + ... + R_{t/Δ} ),

where R_1, R_2, ... are the steps of the underlying random walk. When the time step size Δ is small, the expression in the exponent is approximately a Brownian motion W. Note that E(W_t) = t E(R)/Δ ≈ rt, and Var(W_t) = (t/Δ) Var(R) ≈ σ^2 t. This is exactly what we expect: using the probabilities given by the No Arbitrage Theorem, the approximate average rate of growth should equal the risk-free rate, and we have arranged things so that σ^2 is the approximate variance of the growth rate.

It turns out that we have been a little sloppy with our approximations. Brownian motion is wild enough that the approximation e^x ≈ 1 + x isn't good enough. Instead, we should use e^x ≈ 1 + x + x^2/2 in some places. When this is done, the diffusion coefficient of the underlying Brownian motion W_t remains equal to σ, but the drift is changed. The key change occurs in the approximation for p:

  p ≈ (rΔ + σ√Δ - σ^2 Δ/2) / (2σ√Δ) = 1/2 + (r√Δ)/(2σ) - (σ√Δ)/4.

Once this change is made, we get E(R) ≈ (r - σ^2/2) Δ.
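These approximations can be checked numerically. A minimal sketch (assuming NumPy; the values of r, σ, and Δ are arbitrary):

    import numpy as np

    r, sigma, delta = 0.05, 0.4, 1e-4
    u = np.exp(sigma * np.sqrt(delta))
    d = 1.0 / u
    p = (np.exp(r * delta) - d) / (u - d)               # the no-arbitrage up probability

    ER = p * sigma * np.sqrt(delta) - (1 - p) * sigma * np.sqrt(delta)    # exact E(R)
    VarR = 4 * sigma**2 * delta * p * (1 - p)                             # exact Var(R)

    print(ER, (r - sigma**2 / 2) * delta)               # both close to (r - sigma^2/2) * delta
    print(VarR, sigma**2 * delta)                       # both close to sigma^2 * delta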

6. Calculations with Brownian motion

Throughout this section, W_t will be the position at time t of a standard Brownian motion, X_t will be the position at time t of a Brownian motion with drift µ and diffusion coefficient σ, and S_t will be the position at time t of a geometric Brownian motion with risk-free interest rate r and volatility σ. We will assume that W_0 and X_0 are always equal to 0, and that the initial stock price S_0 is a positive quantity.

In order to do calculations with a general Brownian motion X_t, it is often best to turn it into a standard Brownian motion:

  W_t = (X_t - µt) / σ.

Similarly, it can be useful to turn a geometric Brownian motion S_t into a standard Brownian motion. We could do this in two steps. First take the logarithm of S_t and subtract the logarithm of the initial stock price to get a Brownian motion X_t with drift µ = r - σ^2/2 and diffusion coefficient σ:

  X_t = log(S_t / S_0) = log(S_t) - log(S_0).

Then we get the standard Brownian motion by subtracting the drift and dividing by the diffusion coefficient:

  W_t = (X_t - rt + σ^2 t/2) / σ.

The upshot is that for many purposes, if you can do probability calculations with standard Brownian motion, then you can do probability calculations for all Brownian motions, including geometric Brownian motion. The main exception to this is calculations that involve random times in some way. But we will avoid such calculations in this course. You will learn how to do those next year.

For the rest of this section, we will focus on doing some probability calculations with standard Brownian motion W_t. The simplest calculation has to do with the position of the Brownian motion at a specific time. For example, calculate P(W_4 ≥ 3). To do this, we use the fact that W_4 has the distribution N(0, 4). That means that Z = W_4/2 has the standard normal distribution, so

  P(W_4 ≥ 3) = P(Z ≥ 3/2) = 1 - Φ(1.5),

where Φ is the standard normal cumulative distribution function.

The next simplest calculation has to do with non-overlapping increments. For such calculations, we use the fact that Brownian motion has stationary, independent increments.

For example,

  P(-1 ≤ W_1 ≤ 1 and W_3 > W_2 and W_5 < W_3 + 1)
    = P(-1 ≤ W_1 ≤ 1) P(W_3 - W_2 > 0) P(W_5 - W_3 < 1)
    = P(-1 ≤ Z ≤ 1) P(Z > 0) P(Z < 1/√2)
    = (2Φ(1) - 1) (1/2) Φ(1/√2).

The hardest calculations are for probabilities involving the positions of Brownian motion at several different times. For these calculations, we use the joint distribution function. To keep things from getting too messy, we will restrict our attention to events involving only two different times. For example, suppose we want to calculate P(W_2 < 1 and W_5 > 3). To do this, we use the fact that the random 2-vector (W_2, W_5)^T is a Gaussian random 2-vector with mean vector 0 and covariance matrix

  Γ = ( 2  2 )
      ( 2  5 )

By Theorem 2, the joint density function of this random 2-vector is

  f(x, y) = (1 / (2π√6)) exp( -(1/2) (x, y) Γ^{-1} (x, y)^T ).

Since

  Γ^{-1} = (1/6) (  5  -2 )
                 ( -2   2 )

the quadratic expression in the exponential function is

  -(5/12) x^2 + (1/3) xy - (1/6) y^2.

Therefore

  P(W_2 < 1 and W_5 > 3) = (1 / (2π√6)) ∫_3^∞ ∫_{-∞}^1 exp( -(5/12) x^2 + (1/3) xy - (1/6) y^2 ) dx dy.

It is best to use computer software to calculate the final numerical answer. Another approach would be to make a rotation that would convert the Gaussian 2-vector into a vector with independent coordinates. This simplifies the joint density function but complicates the region of integration.
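Following the suggestion to use software, here is one way to get the numbers (a sketch using NumPy and the standard library's erf; the second probability is estimated by Monte Carlo, so the answer is only approximate).

    import numpy as np
    from math import erf, sqrt

    Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))    # standard normal cdf

    # P(W_4 >= 3) = 1 - Phi(1.5)
    print(1.0 - Phi(1.5))                               # about 0.0668

    # P(W_2 < 1 and W_5 > 3), using the Gaussian 2-vector (W_2, W_5)
    rng = np.random.default_rng(10)
    cov = np.array([[2.0, 2.0],
                    [2.0, 5.0]])
    samples = rng.multivariate_normal([0.0, 0.0], cov, size=2_000_000)
    print(np.mean((samples[:, 0] < 1.0) & (samples[:, 1] > 3.0)))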

7. The fundamental stochastic differential equation

Recall that the basic differential equation for the growth of a bank account with interest rate r (compounded continuously) is

  dS/dt = rS.

Let's write this equation in a different way:

  dS_t = r S_t dt.

This emphasizes the way in which S depends on t, and also makes us think about the approximation to this differential equation in terms of increments:

  ΔS_t = S_{t+Δt} - S_t ≈ r S_t Δt.

There is no randomness in this equation. It describes a risk-free situation. Its solution is S_t = S_0 exp(rt). Here is a way to introduce some randomness:

(10)    ΔS_t ≈ r S_t Δt + σ S_t (W_{t+Δt} - W_t) = r S_t Δt + σ S_t ΔW_t,

where W_t represents the position of a standard Brownian motion at time t. The idea here is that the price of the stock S_t is affected in two ways: first, there is the growth that comes from the risk-free interest rate r, and then there is a random fluctuation which, like the growth due to interest, is expressed as a factor σ ΔW_t multiplied by the current price S_t. This factor can be positive or negative. Roughly speaking, one thinks of it as an accumulation of many tiny random factors, so that it should be (approximately) normally distributed. For reasons that go back to the No Arbitrage Theorem, these fluctuations should have mean 0. But the size of the fluctuations varies with the stock, and that is the reason for the volatility constant σ. You'll learn more about why this is a reasonable model next year.

As we let Δt go to 0, we get the stochastic differential equation

(11)    dS_t = r S_t dt + σ S_t dW_t.

This equation does not make sense in the ordinary way, because of the term dW_t. The paths of Brownian motion are so wild that a naive approach to understanding dW_t doesn't work. It took several geniuses (including Norbert Wiener and Kiyosi Itô) to find a way to make sense of this. We won't go into detail here, but we will see a little bit of what goes on by checking that geometric Brownian motion with growth rate r and volatility σ is a solution to this stochastic differential equation. Thus, let X_t be a Brownian motion with drift µ = r - σ^2/2 and diffusion coefficient σ,

and let S_t = S_0 exp(X_t), where S_0 is the initial price of the stock. To justify our claim that this choice of S_t gives a solution to (11), we check to see if it seems to work in the approximation (10):

(12)    ΔS_t = S_{t+Δt} - S_t = S_0 ( exp(X_t + ΔX_t) - exp(X_t) ) = S_t ( exp(ΔX_t) - 1 ) ≈ S_t ( ΔX_t + (1/2)(ΔX_t)^2 ).

In this expression, we used the 2nd degree Taylor expansion of the exponential function to get our approximation of ΔS_t on the right side. Remember that X_t = µt + σW_t, where W_t is a standard Brownian motion, so

  ΔX_t = (r - σ^2/2) Δt + σ ΔW_t = r Δt + σ ΔW_t - (σ^2/2) Δt.

We also have

  (ΔX_t)^2 = σ^2 (ΔW_t)^2 + (terms involving (Δt)^2 and Δt ΔW_t).

We saw last semester that the terms involving (Δt)^2 can be ignored in such approximations. It turns out that the terms involving Δt ΔW_t can also be ignored; they are obviously o(Δt). But Brownian motion is wild enough so that (ΔW_t)^2 cannot be ignored. Remember:

  E((ΔW_t)^2) = E((W_{t+Δt} - W_t)^2) = Var(ΔW_t) = Δt,

so (ΔW_t)^2 is comparable to Δt. Putting everything into (12) gives

  ΔS_t ≈ r S_t Δt + σ S_t ΔW_t + (σ^2/2) S_t ( (ΔW_t)^2 - Δt ).

It turns out that the difference (ΔW_t)^2 - Δt can also be ignored; it is a random quantity that is typically small compared to Δt. In other words, the random quantity (ΔW_t)^2 - Δt is typically o(Δt). After removing that term, we see that our choice of S_t satisfies (10), as desired. You can see that there is a lot of fancy mathematics hiding behind all of this. Welcome to the world of stochastic differential equations!
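To close, here is a small numerical illustration of the approximation (10) (a sketch assuming NumPy; it discretizes the stochastic differential equation along one simulated Brownian path and compares the endpoint with the geometric Brownian motion formula from Section 5):

    import numpy as np

    rng = np.random.default_rng(11)
    S0, r, sigma = 100.0, 0.05, 0.2
    T, n = 1.0, 10_000
    dt = T / n

    dW = rng.normal(0.0, np.sqrt(dt), size=n)            # Brownian increments
    W = np.cumsum(dW)

    # discretized version of dS_t = r S_t dt + sigma S_t dW_t, as in (10)
    S_disc = S0
    for k in range(n):
        S_disc += r * S_disc * dt + sigma * S_disc * dW[k]

    # geometric Brownian motion built from the same path
    S_gbm = S0 * np.exp((r - sigma**2 / 2) * T + sigma * W[-1])

    print(round(S_disc, 3), round(S_gbm, 3))             # close for small dt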