Part III Lecture 3: Probability and Stochastic Processes Stephen Kinsella (UL) EC4024 February 8, 2011 54 / 149
Today

- Basics of probability
- Empirical distributions
- Properties of probability distributions
- The law of large numbers
- Stochastic processes
Basics of Probability

Probability began with betting. Henry IV, Part II:

We all that are engaged to this loss
Knew that we ventured on such dangerous seas
That if we wrought out life 'twas ten to one;
And yet we ventured, for the gain proposed
Choked the respect of likely peril feared.
Basics

Take an event with possible outcomes $A_1, A_2, \ldots, A_N$. The probability of outcome $A_k$ is $p_k = n_k / n$, where $n$ is the number of repeated identical experiments or observations and $n_k$ is the number of times that $A_k$ is observed to occur.

Example: rolling a die. For $N$ equally probable outcomes, $p = 1/N$; for a fair six-sided die, $p = 1/6$.

Take two events, $A$ and $B$. For mutually exclusive events, the probabilities of $A$ and $B$ add: $P(A + B) = P(A) + P(B)$. A complete set of mutually exclusive alternatives is exhaustive, meaning $\sum_k P(A_k) = 1$.
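The frequency definition $p_k = n_k / n$ can be checked directly by simulation. A minimal sketch (the function name `empirical_probability` and the seed are illustrative choices, not from the lecture):

```python
import random

def empirical_probability(outcome, n_trials, seed=0):
    """Estimate p_k = n_k / n by rolling a fair six-sided die n_trials times."""
    rng = random.Random(seed)
    n_k = sum(1 for _ in range(n_trials) if rng.randint(1, 6) == outcome)
    return n_k / n_trials

# The relative frequency approaches the theoretical p = 1/6 as n grows.
estimate = empirical_probability(6, 100_000)
```

With 100,000 rolls the estimate should sit within about a percentage point of 1/6.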
Example

Coin: the probability that a coin lands on heads plus the probability that it doesn't adds up to 1. The outcomes $A_H$ and $A_T$ exhaust the set of possible alternatives, so their probabilities sum to 1.

Die: for a fair die, $p_k = 1/6$. What is $k$?

For statistically independent events, probabilities multiply: $P(A \text{ and } B) = P(A)P(B)$. Example: the probability of two successive heads in coin tosses ($p = 1/2$) is $p^2 = (1/2)^2 = 1/4$.

Note: statistical independence is not the same thing as randomness.
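The multiplication rule for independent events can also be verified empirically. A rough sketch, assuming two fair coins simulated with Python's standard library (variable names are mine):

```python
import random

rng = random.Random(42)
n = 200_000

# Toss two coins n times; record heads on each and record joint heads.
tosses = [(rng.random() < 0.5, rng.random() < 0.5) for _ in range(n)]
p_a = sum(a for a, _ in tosses) / n          # P(A): first coin heads
p_b = sum(b for _, b in tosses) / n          # P(B): second coin heads
p_ab = sum(a and b for a, b in tosses) / n   # P(A and B): both heads

# For independent events the joint probability factorises:
# P(A and B) = P(A) P(B) = (1/2)^2 = 1/4.
```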
Thinking around the problem

We need the probability that something does happen, because we're betting on the outcome. Say I want the die to roll a 6 (or whatever): I want to find $p$. To do this, we first need the probability of the event not occurring, $q = 1 - p$. The probability of getting at least one event we want in $n$ trials is then $1 - q^n$.

The probability of getting a 6 in $n$ tosses of a fair die is $1 - (5/6)^n$. Clearly you need to do this a lot to make money. Where is the break-even point? $n \approx 4$, since $(5/6)^n = 1/2$ gives $n \approx 3.8$. What does this tell you? You can make money by getting lots of people to bet that a 6 won't occur in four or more throws of a die. Note that you might still suffer Gambler's Ruin; there is no cream for this.
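The break-even calculation above is a one-liner: solve $(5/6)^n = 1/2$ by taking logs. A small sketch (function name is illustrative):

```python
import math

def prob_at_least_one_six(n):
    """P(at least one six in n rolls of a fair die) = 1 - (5/6)^n."""
    return 1 - (5/6) ** n

# Break-even: solve (5/6)^n = 1/2 for n.
n_break_even = math.log(0.5) / math.log(5/6)   # about 3.8 rolls

# A six in four or more rolls is therefore a better-than-even bet:
# three rolls gives roughly 0.42, four rolls roughly 0.52.
```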
Empirical distributions

1 You'll have seen histograms before.
2 Consider a collection of $n$ data points along a straight line; call them $x_1, x_2, \ldots, x_n$.
3 Let $P(x)$ be the probability that any point lies at or to the left of some point $x$.
4 The empirical probability distribution is given by

P(x) = \frac{1}{n} \sum_{i=1}^{n} \theta(x - x_i) \qquad (9)

5 Here $x_k$ is the nearest data point at or to the left of $x$, so $x_k \le x$, and $\theta(x) = 1$ if $x \ge 0$ and $0$ otherwise.
6 $P(x)$ is non-decreasing. It defines a sort of staircase of steps: constant between two data points and discontinuous at each data point.
7 Does this sound familiar?
8 It's the demand/supply-curve description we gave in Lecture 1.
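Equation (9) can be implemented directly: counting the points at or to the left of $x$ reproduces the staircase. A minimal sketch (the function name `empirical_cdf` and the sample data are mine):

```python
def empirical_cdf(data):
    """Return P(x) = (1/n) * sum_i theta(x - x_i) as a Python function."""
    pts = sorted(data)
    n = len(pts)
    def P(x):
        # theta(x - x_i) = 1 exactly when x_i <= x, so count such points.
        return sum(1 for xi in pts if xi <= x) / n
    return P

P = empirical_cdf([1.0, 2.0, 2.0, 5.0])
# P is non-decreasing, constant between data points, and jumps at each one:
# P(0) = 0.0, P(1) = 0.25, P(2) = 0.75, P(5) = 1.0
```

Note the double jump at 2.0, where two data points coincide.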
Measure theory

1 $P(x)$ is called a probability measure.
2 $P(x)$ has a probability density function (pdf) $f(x)$, where $dP(x) = f(x)\,dx$ and

f(x) = \frac{1}{n} \sum_{i=1}^{n} \delta(x - x_i) \qquad (10)

3 You've seen averages before. We can use the pdf to compute the average of the empirical distribution:

\langle x \rangle = \int x \, dP(x) = \frac{1}{n} \sum_{i=1}^{n} x_i \qquad (11)

4 When are data well described by the mean? The characteristic function of any random variable completely defines its probability distribution.
Measure theory, 2

Similarly,

\langle x^2 \rangle = \int x^2 \, dP(x) = \frac{1}{n} \sum_{i=1}^{n} x_i^2

The mean square fluctuation is defined by

\langle \Delta x^2 \rangle = \langle (x - \langle x \rangle)^2 \rangle = \langle x^2 \rangle - \langle x \rangle^2 \qquad (12)

The root mean square fluctuation indicates how useful the average is for characterising the data. The data are accurately characterised by the mean if

\langle \Delta x^2 \rangle^{1/2} \ll \langle x \rangle \qquad (13)
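Equations (12) and (13) in code: compute $\langle x \rangle$ and $\langle x^2 \rangle - \langle x \rangle^2$ from a sample and compare the rms fluctuation to the mean. A sketch with made-up data (function name is mine):

```python
import math

def mean_and_rms_fluctuation(data):
    """Return <x> and the rms fluctuation (<x^2> - <x>^2)^(1/2)."""
    n = len(data)
    mean = sum(data) / n
    mean_sq = sum(x * x for x in data) / n
    # <Delta x^2> = <x^2> - <x>^2, equation (12)
    return mean, math.sqrt(mean_sq - mean * mean)

# Tightly clustered data: rms << mean, so the mean is a good summary.
m1, r1 = mean_and_rms_fluctuation([99, 100, 101])
# Widely scattered data with the same mean: the mean is misleading.
m2, r2 = mean_and_rms_fluctuation([0, 100, 200])
```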
Distributions

The Gaussian distribution is defined by the density

f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x - \langle x \rangle)^2 / 2\sigma^2} \qquad (14)

with its mean square fluctuation given by

\langle \Delta x^2 \rangle = \sigma^2 \qquad (15)

We like the Gaussian a lot, because it is a limit distribution arising from the law of large numbers, and because when we take $x = \ln p$, then $g(p)\,dp = f(x)\,dx$ defines the density $g(p)$, which is lognormal in the variable $p$. This is a cool mathematical trick, which most of the time we need to assume to get the models working.
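The change of variables $x = \ln p$ is easy to see numerically: exponentiating Gaussian draws gives lognormal draws. A rough sketch (the parameter values $\mu = 0$, $\sigma = 0.25$ are arbitrary choices for illustration):

```python
import math, random

rng = random.Random(1)
mu, sigma = 0.0, 0.25

# Draw x ~ Gaussian(mu, sigma) and set p = e^x, i.e. x = ln p.
# The change of variables g(p) dp = f(x) dx makes p lognormal.
xs = [rng.gauss(mu, sigma) for _ in range(100_000)]
ps = [math.exp(x) for x in xs]

# A known lognormal fact: the mean of p is exp(mu + sigma^2 / 2),
# strictly larger than exp(mu) = exp(mean of x).
sample_mean = sum(ps) / len(ps)
theory_mean = math.exp(mu + sigma ** 2 / 2)
```

The gap between `exp(mu)` and the lognormal mean is one reason the "trick" matters in pricing models.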
The Law of Large Numbers

Consider the empirical distribution where $x_k$ occurs a fraction $p_k$ of the time, with $k = 1, \ldots, m$. Then

\langle x \rangle = \int x \, dP(x) = \frac{1}{n} \sum_{j=1}^{n} x_j = \sum_{k=1}^{m} p_k x_k \qquad (16)

The sample mean $\bar{x} = \frac{1}{n} \sum_{k=1}^{n} x_k$ is itself a random variable. Via Tschebychev's inequality, which we'll prove in class, it will be shown that if the $n$ random variables are identically distributed with mean square fluctuation $\sigma^2$, then

\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} \qquad (17)

This suggests that expected uncertainty can be reduced by studying the average $\bar{x}$ of $n$ independent variables instead of the individual variables $x_k$. This is the weak law of large numbers, which gives rise to the central limit theorem (McCauley 2003, 2004).
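The $\sigma^2/n$ scaling in equation (17) can be demonstrated by Monte Carlo: repeatedly form sample means of $n$ iid draws and measure their variance. A sketch, assuming standard-normal draws (the function name and replication count are mine):

```python
import random, statistics

def variance_of_sample_mean(n, n_means=2000, sigma=1.0, seed=0):
    """Empirical variance of the mean of n iid N(0, sigma) draws.

    Theory (equation 17) says this should be close to sigma^2 / n.
    """
    rng = random.Random(seed)
    means = [sum(rng.gauss(0, sigma) for _ in range(n)) / n
             for _ in range(n_means)]
    return statistics.pvariance(means)

# Averaging n variables shrinks the fluctuation by a factor of n.
v10 = variance_of_sample_mean(10)    # close to 1/10
v100 = variance_of_sample_mean(100)  # close to 1/100
```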
Stochastic Processes

Call a variable $B$ random if it is described by a probability distribution $P(B)$. We care whether this variable evolves deterministically or randomly in time.

If financial markets exhibited smooth changes at small time scales, then we'd be sorted. They don't. We see jumps, bumps, and drops even at the smallest time scales. So we need stochastic differential equations to describe these phenomena.

Yeh? Wha?
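To see what a stochastic differential equation looks like in practice, here is a rough Euler-Maruyama sketch of the standard geometric Brownian motion model $dS = \mu S\,dt + \sigma S\,dW$ often used for asset prices (the parameter values and function name are illustrative, not from the lecture):

```python
import math, random

def simulate_gbm(s0=100.0, mu=0.05, sigma=0.2, dt=1/252, n_steps=252, seed=0):
    """Euler-Maruyama discretisation of dS = mu*S*dt + sigma*S*dW.

    Each step adds a Gaussian shock of size sqrt(dt), so the path
    wiggles at every time scale rather than evolving smoothly.
    """
    rng = random.Random(seed)
    path = [s0]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))           # Brownian increment
        path.append(path[-1] * (1 + mu * dt + sigma * dW))
    return path

path = simulate_gbm()  # one year of simulated daily prices
```

This is the kind of process Prof. Gleeson's SDE material builds on; note that real market data show fatter tails than the Gaussian increments assumed here.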
SDEs

[Prof. James Gleeson]
Summary

Finance is about betting. Describing betting properly requires (and in fact gave birth to) probability theory.

One way to describe financial data is via stochastic processes and stochastic differential equations. These require restrictive assumptions to work well, like the Gaussian distribution. When the underlying data are not well described as Gaussian, these methods rapidly fall apart.