Monte-Carlo Methods for Risk Management


IEOR E4602: Quantitative Risk Management, Spring 2010
© 2010 by Martin Haugh

In these lecture notes we discuss Monte-Carlo (MC) techniques that are particularly useful in risk-management applications. We focus on importance sampling and stratified sampling, both of which are variance reduction techniques that can be very useful in estimating risk measures associated with rare events. Because applications to risk management tend to be quite involved, we will only discuss one such application in these lecture notes. But see Chapter 9 of Glasserman's Monte-Carlo Methods in Financial Engineering (2004) for further applications of importance and stratified sampling to credit risk and the estimation of risk measures in both light- and heavy-tailed settings. We also discuss the simulation of stochastic differential equations (SDEs) and low-discrepancy sequences (LDS), both of which can also be used in risk-management applications. The simulation of SDEs is often necessary in pricing applications and in evaluating model risk. As we discussed in the Model Risk lecture notes, we should generally price exotic derivatives using a range of different but plausible models. In many instances, pricing can only be accomplished by simulating the underlying SDEs. Low-discrepancy sequences have also become quite popular in MC applications, particularly when regular MC methods are too demanding from a computational viewpoint. As risk-management applications are often very computationally demanding, LDS methods can also be of use here. We assume that readers are already familiar with Monte-Carlo simulation and know, in particular, how to generate random variables and analyze simulation output. We also assume that readers have had some exposure to variance reduction methods.

1 Importance Sampling

Suppose we wish to estimate θ := P(X ≥ 20) = E[I_{X ≥ 20}] where X ~ N(0, 1). The raw estimator given by the usual Monte-Carlo approach is obtained as follows:

1. Generate X_1, ..., X_n IID N(0, 1)
2. Set I_j = I_{X_j ≥ 20} for j = 1, ..., n
3. Set θ̂_n = Σ_{j=1}^n I_j / n
4. Compute approximate CIs

For this problem, however, the usual approach would be completely inadequate, since approximating θ to any reasonable degree of accuracy would require n to be inordinately large. For example, on average we would have to set n ≈ 1/θ in order to obtain just one non-zero value of I_j (see Example 1). Clearly this is impractical, and a much smaller value of n would have to be used. Using a much smaller value of n, however, would almost inevitably result in an estimate θ̂_n = 0 and an approximate confidence interval [L, U] = [0, 0]! So the naive Monte-Carlo approach does not work. Before proceeding any further, it is not unreasonable to ask why such a problem would be important. After all, if you want to estimate θ = P(X ≥ 20), isn't it enough to know that θ is very close to 0? Put another way, who cares whether θ = 10^{-89} or θ = 10^{-90}? For many problems this is a valid objection, and indeed for such

problems the answer is that we don't care. However, for many other problems it is very important to know θ to a much greater level of accuracy. For example, suppose we are designing a nuclear power plant and we want to estimate the probability, θ, of a meltdown occurring sometime in the next 100 years. We would expect θ to be very small, even for a poorly designed power plant. However, this is not enough: should a meltdown occur, the consequences could be catastrophic, and so we need to know θ to a high degree of accuracy. For another example, suppose we want to price a deep out-of-the-money option using simulation. Then the price of the option will be very small, perhaps lying between 0.1 cents and 10 cents. Clearly a bank is not going to suffer if it misprices an option and sells it for 0.1 cents when the correct value is 10 cents. But what if the bank sells 1 million of these options? And what if the bank makes similar trades several times a week? Then it becomes very important to price the option correctly. A particularly rich source of examples can be found in risk management, where we seek to estimate risk measures such as the VaR_α or ES_α of a given portfolio. Note that these examples require estimating the probability of a rare event. Even though the events are rare, they are very important because when they do occur their impact can be very significant. We will study importance sampling, a variance reduction technique that can be invaluable when estimating rare-event probabilities and expectations.[2]

The Importance Sampling Estimator

Suppose we wish to estimate θ = E_f[h(X)] where X has PDF[3] f. Let g be another PDF with the property that g(x) ≠ 0 whenever f(x) ≠ 0. That is, g has the same support as f. Then

θ = E_f[h(X)] = ∫ h(x)f(x) dx = ∫ [h(x)f(x)/g(x)] g(x) dx = E_g[h(X)f(X)/g(X)]

where E_g[·] denotes an expectation with respect to the density g. This has very important implications for estimating θ.
The original simulation method is to generate n samples of X from the density f and set θ̂_n = Σ_{j=1}^n h(X_j)/n. An alternative method, however, is to generate n values of X from the density g and set

θ̂_{n,is} = Σ_{j=1}^n h(X_j) f(X_j) / (n g(X_j)).

θ̂_{n,is} is then[4] an importance sampling estimator of θ. We often define

h*(X) := h(X) f(X) / g(X)

so that θ = E_g[h*(X)]. We refer to f and g as the original and importance sampling densities,[5] respectively. We also refer to f/g as the likelihood ratio.

Example 1  Consider again the problem where we want to estimate θ = P(X ≥ 20) = E[I_{X ≥ 20}] when X ~ N(0, 1). We may then write

θ = E[I_{X ≥ 20}]
  = ∫ I_{x ≥ 20} (1/√(2π)) e^{-x^2/2} dx
  = ∫ I_{x ≥ 20} [e^{-x^2/2} / e^{-(x-µ)^2/2}] (1/√(2π)) e^{-(x-µ)^2/2} dx
  = ∫ I_{x ≥ 20} e^{-µx + µ^2/2} (1/√(2π)) e^{-(x-µ)^2/2} dx
  = E_µ[ I_{X ≥ 20} e^{-µX + µ^2/2} ]

[2] It can of course also be used for estimating non-rare-event probabilities and expectations, but it is not as useful in such circumstances. An exception to this statement is when importance sampling is easier than the regular simulation method.
[3] Or PMF, if X is a discrete random variable.
[4] We will see later where the name importance sampling comes from.
[5] Obviously if X were discrete we would use mass functions in place of density functions.

where now X ~ N(µ, 1) and e^{-µX + µ^2/2} is the likelihood ratio. If we set µ = 20, for example, and use n = 1 million samples, then we find an approximate 95% confidence interval for θ is given by [2.75 × 10^{-89}, 2.778 × 10^{-89}]. We can of course also estimate expectations using importance sampling.

Example 2  Suppose we wish to estimate θ = E[X^4 e^{X^2/4} I_{X ≥ 2}] where X ~ N(0, 1). Then the same argument as before implies that we may also write θ = E_µ[X^4 e^{X^2/4} e^{-µX + µ^2/2} I_{X ≥ 2}] where now X ~ N(µ, 1).

The General Formulation

Let X = (X_1, ..., X_n) be a random vector with joint PDF f(x_1, ..., x_n) and suppose we wish to estimate θ = E_f[h(X)]. Let g(x_1, ..., x_n) be another PDF such that g(x) ≠ 0 whenever f(x) ≠ 0. Then

θ = E_f[h(X)] = ∫_{x_1} ... ∫_{x_n} h(x_1, ..., x_n) f(x_1, ..., x_n) dx_1 ... dx_n
  = ∫_{x_1} ... ∫_{x_n} h(x_1, ..., x_n) [f(x_1, ..., x_n)/g(x_1, ..., x_n)] g(x_1, ..., x_n) dx_1 ... dx_n
  = E_g[h*(X)]

where h*(X) := h(X)f(X)/g(X). Again we have two methods for estimating θ: the original method, where we simulate with respect to the density f, and the importance sampling method, where we simulate with respect to the density g.

Example 3  Suppose we wish to estimate θ = P(Σ_{i=1}^n X_i^2 ≥ 50) where the X_i's are IID N(0, 1). Then θ = E[h(X)] where h(x) := I_{Σ x_i^2 ≥ 50} and X := (X_1, ..., X_n). We could estimate θ using importance sampling as follows:

θ = E[h(X)]
  = ∫_{x_1} ... ∫_{x_n} [e^{-x_1^2/2}/√(2π)] ... [e^{-x_n^2/2}/√(2π)] I_{Σ x_i^2 ≥ 50} dx_1 ... dx_n
  = ∫_{x_1} ... ∫_{x_n} σ^n e^{-(x_1^2/2)(1 - 1/σ^2)} ... e^{-(x_n^2/2)(1 - 1/σ^2)} I_{Σ x_i^2 ≥ 50} [e^{-x_1^2/2σ^2}/(√(2π)σ)] ... [e^{-x_n^2/2σ^2}/(√(2π)σ)] dx_1 ... dx_n
  = E_g[ σ^n e^{-(1/2)(1 - 1/σ^2) Σ_{i=1}^n X_i^2} I_{Σ X_i^2 ≥ 50} ] = E_g[h*(X)]

where E_g[·] denotes expectation under the multivariate normal distribution with X ~ MVN(0, σ^2 I_n).
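In code, the mean-shift estimator of Example 1 might be sketched as follows. The shift µ = 20 centres the sampling density at the barrier; the sample size is an illustrative choice.

```python
import math, random

# Sketch of the mean-shifted estimator: estimate theta = P(X >= 20) for
# X ~ N(0,1) by sampling X ~ N(mu,1) and weighting each sample that clears
# the barrier by the likelihood ratio e^{-mu X + mu^2/2}.

def tail_prob_is(mu, n, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(mu, 1.0)                         # X ~ N(mu, 1)
        if x >= 20.0:
            total += math.exp(-mu * x + 0.5 * mu * mu)  # likelihood ratio
    return total / n

theta_hat = tail_prob_is(mu=20.0, n=200000)
# theta_hat should be close to the true value, roughly 2.75e-89; the naive
# estimator would return 0 for any feasible sample size.
```

Note that the weight e^{-µx + µ^2/2} is astronomically small on the event {X ≥ 20}; the estimator works precisely because almost every sample now lands in the region that matters.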
Thus far we have not addressed the issue of how to choose a good sampling density, g, so that we obtain a variance reduction when we sample from g instead of f. We will address this question in the next two sections and also explain the term importance sampling.

Obtaining a Variance Reduction

As before, suppose we wish to estimate θ = E_f[h(X)] where X is a random vector with joint PDF f. We will assume that h(x) ≥ 0. Now let g be another density with support equal to that of f. Then we know

θ = E_f[h(X)] = E_g[h*(X)]

and this gives rise to two estimators:

1. h(X) where X ~ f, and
2. h*(X) where X ~ g.

The variance of the importance sampling estimator is given by

Var_g(h*(X)) = ∫ h*(x)^2 g(x) dx - θ^2 = ∫ h(x)^2 [f(x)/g(x)] f(x) dx - θ^2

while the variance of the original estimator is given by

Var_f(h(X)) = ∫ h(x)^2 f(x) dx - θ^2.

So the reduction in variance[6] is then given by

Var_f(h(X)) - Var_g(h*(X)) = ∫ h(x)^2 (1 - f(x)/g(x)) f(x) dx.    (1)

In order to achieve a variance reduction, the integral in (1) should be positive. For this to happen, we would like

1. f(x)/g(x) > 1 when h(x)^2 f(x) is small, and
2. f(x)/g(x) < 1 when h(x)^2 f(x) is large.

Now the important part of the density f could plausibly be defined to be that region, A say, in the support of f where h(x)f(x) is large. But, by the above observation, we would like to choose g so that f(x)/g(x) is small whenever x is in A. That is, we would like a density g that puts more weight on A: hence the name importance sampling. Note that when h involves a rare event, so that h(x) = 0 over most of the state space, it can be particularly valuable to choose g so that we sample often from that part of the state space where h(x) ≠ 0. This is why importance sampling is most useful for simulating rare events. Further guidance on how to choose g is obtained from the following observation. As we are free to choose g, let's suppose we choose[7] g(x) = h(x)f(x)/θ. Then it is easy to see that

Var_g(h*(X)) = θ^2 - θ^2 = 0

so that we have a zero-variance estimator! This means that if we sample with respect to this particular choice of g, then we would only need one sample, and this sample would equal θ with probability one.[8] Of course this is not feasible in practice.
After all, since it is θ that we are trying to estimate, it does not seem likely that we could simulate a random variable whose density is given by g(x) = h(x)f(x)/θ. However, all is not lost, and this observation can often guide us towards excellent choices of g that lead to extremely large variance reductions.

[6] A negative reduction means a variance increase.
[7] Note that this choice of g is valid since ∫ g(x) dx = 1 and we have assumed h is non-negative.
[8] Recall that with this choice of g, h*(x) = h(x)f(x)/g(x) = θ.
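In simple cases the zero-variance density can actually be simulated. For h(x) = I_{x ≥ a} and f exponential, g(x) = h(x)f(x)/θ is the exponential density conditioned on {X ≥ a}, which by memorylessness is just a + Exp(λ). The sketch below (with illustrative values of λ and a) confirms that every weighted sample then equals θ = e^{-λa} exactly:

```python
import math, random

# Zero-variance check: estimate theta = P(X >= a) for X ~ Exp(lam) using
# g(x) = h(x)f(x)/theta, i.e. X conditioned on {X >= a}. By memorylessness
# we can sample from g as a + Exp(lam). Every weighted sample
# h(x) f(x)/g(x) then equals theta = e^{-lam a}.

lam, a = 1.0, 10.0
theta = math.exp(-lam * a)
rng = random.Random(5)

values = []
for _ in range(1000):
    x = a + rng.expovariate(lam)      # X ~ g, i.e. Exp(lam) given X >= a
    f_x = lam * math.exp(-lam * x)    # original density f
    g_x = f_x / theta                 # g(x) = h(x) f(x) / theta on x >= a
    values.append(1.0 * f_x / g_x)    # h(x) f(x) / g(x), with h(x) = 1 here
# every entry of `values` equals theta, so the estimator has zero variance
```

Of course this only works because θ = e^{-λa} is known in closed form here; the point is to see the zero-variance phenomenon concretely.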

How to Choose a Good Sampling Distribution

We saw above that if we could choose g(x) = h(x)f(x)/θ, then we would obtain the best possible estimator of θ, that is, one that has zero variance. In general we cannot do this, but it does suggest that if we can choose g(·) so that it is similar to h(·)f(·), then we might reasonably expect to obtain a large variance reduction. What does the phrase "similar" mean? One obvious approach is to choose g(·) so that it has a similar shape to h(·)f(·). In particular, we could try to choose g so that g(x) and h(x)f(x) both take on their maximum values at the same point, x* say. When we choose g this way, we say that we are using the maximum principle. Of course this only partially defines g, since there are infinitely many density functions that take their maximum value at x*. Nevertheless, this is often enough to obtain a significant variance reduction, and in practice we often take g to be from the same family of distributions as f. For example, if f is multivariate normal, then we might also take g to be multivariate normal but with a different mean vector and/or variance-covariance matrix.[9]

Example 4  Returning to Example 2, recall that we wished to estimate θ = E[h(X)] = E[X^4 e^{X^2/4} I_{X ≥ 2}] where X ~ N(0, 1). If we sample from a PDF, g, that is also normal with variance 1 but mean µ, then we know that g takes its maximum value at x = µ. Therefore, a good choice of µ might be

µ = arg max_x h(x)f(x) = arg max_{x ≥ 2} x^4 e^{-x^2/4} = √8.

Then θ = E_g[X^4 e^{X^2/4} e^{-√8 X + 4} I_{X ≥ 2}] where g(·) denotes the N(√8, 1) PDF.

Example 5 (Pricing an Asian call option)  For the purpose of option pricing, we assume as usual that S_t ~ GBM(r, σ^2), where S_t is the price of the stock at time t and r is the risk-free interest rate.
Suppose now that we want to price an Asian call option whose payoff at time T is given by

h(S) := max( 0, (1/m) Σ_{i=1}^m S_{iT/m} - K )    (2)

where S := {S_{iT/m} : i = 1, ..., m}, T is the expiration date and K is the strike price. The price of this option is then given by C_a = E[e^{-rT} h(S)]. Now we can write

S_{iT/m} = S_0 e^{(r - σ^2/2) iT/m + σ √(T/m) (X_1 + ... + X_i)}

where the X_i's are IID N(0, 1). This means that if f is the joint PDF of X = (X_1, ..., X_m), then (with a mild abuse of notation) we may write C_a = E_f[h(X_1, ..., X_m)]. Now if K is large relative to S_0, so that the option is deep out-of-the-money, then pricing the option using simulation amounts to performing a rare-event simulation, particularly so when K is very large relative to S_0. As a result, estimating C_a using importance sampling will often result in a very large variance reduction. In order to apply importance sampling, we need to choose the sampling density, g(·). For this, we could take g(·) to be multivariate normal with variance-covariance matrix equal to the identity, I_m, and mean vector µ. That is, we shift f(x) by µ. As before, a good value of µ might be µ = arg max_x h(x)f(x), which can be found using numerical methods.

[9] We note that it is not necessary that f and g come from the same family of distributions. In fact sometimes it is necessary to choose g from a different family of distributions. This might occur, for example, if it is difficult or inefficient to simulate from the family of distributions to which f belongs. In that case, our reason for using importance sampling in the first place is so that we can simulate from an easier distribution, g.
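A sketch of the mean-shifted estimator for the Asian call might look as follows. The parameter values and the constant shift vector µ are illustrative choices only (in practice µ would come from the optimization just described); the average likelihood ratio is reported as a sanity check, since E_g[f(X)/g(X)] = 1.

```python
import math, random

# Sketch of importance sampling for an out-of-the-money Asian call
# (Example 5). We shift the IID N(0,1) driving variables X_1,...,X_m by a
# mean vector mu and reweight by the likelihood ratio
#     f(X)/g(X) = exp(-mu'X + mu'mu/2).
# All parameter values and the simple constant shift are illustrative.

def asian_is_price(S0, K, r, sigma, T, m, mu, n, seed=0):
    rng = random.Random(seed)
    dt = T / m
    drift = (r - 0.5 * sigma * sigma) * dt
    vol = sigma * math.sqrt(dt)
    disc = math.exp(-r * T)
    total, lr_total = 0.0, 0.0
    for _ in range(n):
        x = [rng.gauss(mu_i, 1.0) for mu_i in mu]       # X ~ N(mu, I_m)
        # likelihood ratio f(x)/g(x)
        lr = math.exp(-sum(m_i * x_i for m_i, x_i in zip(mu, x))
                      + 0.5 * sum(m_i * m_i for m_i in mu))
        s, path_sum = S0, 0.0                           # build the path
        for x_i in x:
            s *= math.exp(drift + vol * x_i)
            path_sum += s
        payoff = max(path_sum / m - K, 0.0)
        total += disc * payoff * lr
        lr_total += lr
    return total / n, lr_total / n   # price estimate, average likelihood ratio

price, avg_lr = asian_is_price(S0=100.0, K=150.0, r=0.05, sigma=0.2,
                               T=1.0, m=4, mu=[0.5] * 4, n=100000)
# avg_lr should be close to 1 since E_g[f(X)/g(X)] = 1.
```

Under the shift, a sizable fraction of paths finish in-the-money, whereas under f almost none would; the likelihood ratio corrects for the oversampling.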

Potential Problems with the Maximum Principle

Sometimes applying the maximum principle to choose g(·) will be difficult. For example, it may be the case that there are multiple or even infinitely many solutions to µ = arg max_x h(x)f(x). Even when there is a unique solution, it may be very difficult to find. In such circumstances, an alternative method for choosing g is to scale f. We will demonstrate this by example.

Example 6 (Using Scaling to Select g(·))  Assume in Example 3 that n = 2. Then θ = P(X_1^2 + X_2^2 ≥ 50) = E[I_{X_1^2 + X_2^2 ≥ 50}] where X_1, X_2 are IID N(0, 1), and

h(x)f(x) = I_{x_1^2 + x_2^2 ≥ 50} e^{-(x_1^2 + x_2^2)/2} / (2π).

Therefore h(x)f(x) = 0 inside the circle x_1^2 + x_2^2 = 50, and it takes on its maximum value at every point on that circle. As a result, it is not possible to apply the maximum principle. Before choosing a sampling density, g, recall that we would like g to put more weight on those parts of the sample space where h(x)f(x) is large. One way to achieve this is by scaling the density of X = (X_1, X_2) so that X is more dispersed. For example, we could take g to be multivariate normal with mean vector 0 and variance-covariance matrix

Σ_g = [ σ^2  0 ]
      [ 0  σ^2 ]

where σ^2 > 1. Note that this simply means that under g, X_1 and X_2 are IID N(0, σ^2). Furthermore, when σ^2 > 1, more probability mass is given to the region X_1^2 + X_2^2 ≥ 50, as desired. We could choose the value of σ using heuristic methods. One method would be to choose σ so that E_g[X_1^2 + X_2^2] = 50, which in this case would imply that σ = 5. Why? Then[10]

θ = E[I_{X_1^2 + X_2^2 ≥ 50}] = E_g[ σ^2 exp( -(X_1^2/2)(1 - 1/σ^2) - (X_2^2/2)(1 - 1/σ^2) ) I_{X_1^2 + X_2^2 ≥ 50} ].

For the more general case where n > 2, we could proceed by again choosing σ so that E_g[Σ_{i=1}^n X_i^2] = 50.

Tilted Densities

Suppose f is light-tailed so that it has a moment generating function (MGF). Then a common way of generating the sampling density, g, from the original density, f, is to use the MGF of f.
We use M_X(t) to denote the MGF; it is defined by M_X(t) = E_f[e^{tX}]. Then a tilted density of f is given by

f_t(x) = e^{tx} f(x) / M_X(t)

for -∞ < t < ∞. The tilted densities are useful since a random variable with density f_t(·) tends to be larger than one with density f when t > 0, and smaller when t < 0. This means, for example, that if we want to sample more often from the region where X tends to be large, we might use a tilted density with t > 0 as our sampling density g. Similarly, if we want to sample more often from the region where X tends to be small, then we might use a tilted density with t < 0.

Example 7  Suppose X is an exponential random variable with mean 1/λ. Then f(x) = λe^{-λx} for x ≥ 0, and it is easy to see that f_t(x) = C e^{-(λ-t)x}, where C is the constant that makes the density integrate to 1. Indeed C = λ - t (for t < λ), so that f_t is itself an exponential density with mean 1/(λ - t).

[10] This is the same expression we had earlier in Example 3, except that n = 2.
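Example 7 is easily turned into a rare-event estimator. The sketch below estimates P(X ≥ a) for X ~ Exp(λ) by sampling from the tilted density Exp(λ - t). The values of λ and a, and the rule of matching the tilted mean to a, are illustrative choices; the exact answer e^{-λa} is available as a check.

```python
import math, random

# Tilted-density importance sampling for an exponential tail probability:
# theta = P(X >= a) for X ~ Exp(lam). The tilted density f_t is Exp(lam - t)
# and the weight is f(x)/f_t(x) = M_X(t) e^{-t x} with M_X(t) = lam/(lam - t).

def tail_prob_tilted(lam, a, n, seed=1):
    t = lam - 1.0 / a              # makes E_t[X] = 1/(lam - t) = a
    rate_t = lam - t               # tilted density is Exp(lam - t)
    mgf = lam / (lam - t)          # M_X(t) for the exponential
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.expovariate(rate_t)
        if x >= a:
            total += mgf * math.exp(-t * x)   # likelihood ratio f(x)/f_t(x)
    return total / n

est = tail_prob_tilted(lam=1.0, a=20.0, n=100000)
exact = math.exp(-20.0)            # P(X >= 20) = e^{-20} for Exp(1)
```

With λ = 1 and a = 20, roughly a third of the tilted samples land beyond the barrier, compared with about one in 10^8 under f.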

Example 8  Suppose X_1, ..., X_n are independent random variables, where X_i has density f_i(·). Let S_n := Σ_{i=1}^n X_i and suppose we want to estimate θ := P(S_n ≥ a) for some constant a. If a is large, so that we are dealing with a rare event, we should use importance sampling to estimate θ. Since S_n is large when the X_i's are large, it makes sense to sample each X_i from its tilted density function, f_{i,t}(·), for some value of t > 0. Then we may write

θ = E[I_{S_n ≥ a}] = E_t[ I_{S_n ≥ a} Π_{i=1}^n f_i(X_i)/f_{i,t}(X_i) ] = E_t[ I_{S_n ≥ a} ( Π_{i=1}^n M_i(t) ) e^{-tS_n} ]

where E_t[·] denotes expectation with respect to the X_i's under the tilted densities, f_{i,t}(·), and M_i(t) is the moment generating function of X_i. If we write M(t) := Π_{i=1}^n M_i(t), then it is easy to see that the importance sampling estimator, θ̂_{n,is}, satisfies

θ̂_{n,is} ≤ M(t) e^{-ta}.    (3)

Therefore a good choice of t would be the value that minimizes the bound in (3). Why is this? We can minimize it by minimizing log(M(t)e^{-ta}) = log(M(t)) - ta. It is straightforward to check that the minimizing value of t satisfies µ_t = a, where µ_t := E_t[S_n].

If we define the stopping time τ_a := min{n ≥ 0 : S_n ≥ a}, then P(τ_a < ∞) is the probability that S_n ever exceeds a. If E[X_1] > 0 and the X_i's are IID with MGF M_X(t), then this probability equals one. The case of interest is then when E[X_1] ≤ 0. A similar argument to that of Example 8 implies

θ = P(τ_a < ∞) = E[I_{τ_a < ∞}] = E_t[ I_{τ_a < ∞} M_X(t)^{τ_a} e^{-tS_{τ_a}} ] = E_t[ I_{τ_a < ∞} e^{-tS_{τ_a} + τ_a ψ(t)} ]

where ψ(t) := log(M_X(t)) is the cumulant generating function. Note that if E_t[X_1] > 0 then τ_a < ∞ almost surely, and so we obtain θ = E_t[ e^{-tS_{τ_a} + τ_a ψ(t)} ]. In fact, this is an example where importance sampling can be used to ensure that the simulation stops almost surely. In Assignment 6 we discuss how to choose a good value of t based on the cumulant generating function.
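For IID Exp(1) summands the recipe of Example 8 can be tested against a closed form, since S_n ~ Gamma(n, 1). The sketch below uses illustrative values n = 5 and a = 30, chooses t from µ_t = a, and also respects the bound (3):

```python
import math, random

# Example 8 with IID Exp(1) summands: estimate theta = P(S_n >= a) by
# exponentially tilting each X_i. For Exp(1) the tilted density is
# Exp(1 - t), and mu_t = a gives t = 1 - n/a. Since S_n ~ Gamma(n, 1),
# P(S_n >= a) = e^{-a} sum_{k<n} a^k/k! is available exactly as a check.

def sum_tail_tilted(n, a, n_samples, seed=2):
    t = 1.0 - n / a                # solves E_t[S_n] = n/(1 - t) = a
    rate_t = 1.0 - t               # tilted density of each X_i is Exp(1 - t)
    log_mgf = -math.log(1.0 - t)   # psi(t) = log M_X(t) = -log(1 - t)
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        s = sum(rng.expovariate(rate_t) for _ in range(n))
        if s >= a:
            total += math.exp(n * log_mgf - t * s)   # M(t) e^{-t S_n}
    return total / n_samples

est = sum_tail_tilted(n=5, a=30.0, n_samples=100000)
exact = math.exp(-30.0) * sum(30.0 ** k / math.factorial(k) for k in range(5))
```

Every sample's weight is at most M(t)e^{-ta}, so the estimate automatically satisfies the bound (3) as well.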
Note that this example has direct applications to the estimation of ruin probabilities in the context of insurance risk. For example, suppose X_i := Y_i - cT_i where Y_i is the size of the i-th claim, T_i is the inter-arrival time between claims, c is the premium received per unit time and a is the initial reserve. Then θ is the probability that the insurance company ever goes bankrupt. Only in very simple models is it possible to calculate θ analytically; in general, Monte-Carlo approaches are required.

Estimating Conditional Expectations

Importance sampling can also be very useful for computing conditional expectations when the event being conditioned upon is a rare event. For example, suppose we wish to estimate θ = E[h(X) | X ∈ A] where A is a rare event and X is a random vector with PDF f. The density of X, given that X ∈ A, is

f(x | x ∈ A) = f(x) / P(X ∈ A)  for x ∈ A,

so

θ = E[h(X) I_{X ∈ A}] / E[I_{X ∈ A}].

Now since A is a rare event, we would be much better off if we could simulate using a sampling density, g, that makes A more likely to occur. Then, as usual, we would have

θ = E_g[h(X) I_{X ∈ A} f(X)/g(X)] / E_g[I_{X ∈ A} f(X)/g(X)].

So to estimate θ using importance sampling, we would generate X_1, ..., X_n with density g(·) and set

θ̂_{n,is} = [ Σ_{i=1}^n h(X_i) I_{X_i ∈ A} f(X_i)/g(X_i) ] / [ Σ_{i=1}^n I_{X_i ∈ A} f(X_i)/g(X_i) ].

In contrast to our usual estimators, θ̂_{n,is} is no longer an average of n IID random variables; instead, it is the ratio of two such averages. This has implications for computing approximate confidence intervals for θ. In particular, confidence intervals should now be estimated using bootstrapping techniques. An obvious application of this methodology in risk management is the estimation of ES or CVaR.

Difficulties with Importance Sampling

The most difficult aspect of importance sampling is choosing a good sampling density, g. In general, one needs to be very careful, for it is possible to choose g according to some good heuristic such as the maximum principle, but then find that g results in a variance increase. In fact, it is possible to choose a g that results in an importance sampling estimator with an infinite variance! This situation would typically occur when g puts too little weight relative to f on the tails of the distribution.

An Application to Portfolio Credit Risk

We consider[11] a portfolio loss of the form L = Σ_{i=1}^m e_i Y_i, where e_i is the deterministic and positive exposure to the i-th credit and Y_i is the default indicator with corresponding default probability p_i. We assume that Y follows a Bernoulli mixture model, which we now define.

Definition 1  Let p < m and let Ψ = (Ψ_1, ..., Ψ_p) be a p-dimensional random vector. Then we say the random vector Y = (Y_1, ..., Y_m) follows a Bernoulli mixture model with factor vector Ψ if there are functions p_i : R^p → [0, 1], 1 ≤ i ≤ m, such that, conditional on Ψ, the components of Y are independent Bernoulli random variables satisfying P(Y_i = 1 | Ψ = ψ) = p_i(ψ).

We are interested in the problem of estimating θ := P(L ≥ c), where c is substantially larger than E[L].
Note that a good importance sampling distribution for θ should also yield a good importance sampling distribution for computing risk measures associated with the α-tail of the loss distribution where q_α(L) ≈ c. We begin with the case where the default indicators are independent.

Independent Default Indicators

We define Ω to be the state space of Y, so that Ω = {0, 1}^m. Then

P({y}) = Π_{i=1}^m p_i^{y_i} (1 - p_i)^{1 - y_i},  y ∈ Ω.

The moment generating function of L satisfies

M_L(t) = E[e^{tL}] = Π_{i=1}^m E[e^{t e_i Y_i}] = Π_{i=1}^m ( p_i e^{t e_i} + 1 - p_i ).

Let Q_t be the corresponding tilted probability measure, so that

Q_t({y}) = [ e^{t Σ_i e_i y_i} / M_L(t) ] P({y}) = Π_{i=1}^m [ e^{t e_i y_i} / (p_i e^{t e_i} + 1 - p_i) ] p_i^{y_i} (1 - p_i)^{1 - y_i} = Π_{i=1}^m q_{t,i}^{y_i} (1 - q_{t,i})^{1 - y_i}

[11] This application is described in Section 8.5 of MFE, but is based on the original paper of Glasserman and Li (2005).

where q_{t,i} := p_i e^{t e_i} / (p_i e^{t e_i} + 1 - p_i) is the Q_t probability of the i-th credit defaulting. Note that the default indicators remain independent Bernoulli random variables under Q_t. Since q_{t,i} → 1 as t → ∞ and q_{t,i} → 0 as t → -∞, it is clear that we can shift the mean of L to any value in (0, Σ_{i=1}^m e_i). The same argument that was used at the end of Example 8 suggests that we take t equal to the value that solves

Σ_{i=1}^m q_{t,i} e_i = c.

This value can be found easily using numerical methods.

Dependent Default Indicators

Suppose now that there is a p-dimensional factor vector, Ψ, such that the default indicators are independent with default probabilities p_i(ψ) conditional on Ψ = ψ. Suppose also that Ψ ~ MVN_p(0, Σ). The Monte-Carlo scheme for estimating θ is to first simulate Ψ and to then simulate Y conditional on Ψ. We can apply importance sampling to the second step using our discussion of independent default indicators above. However, we can also apply importance sampling to the first step. A natural way to do this is to simulate Ψ from the MVN_p(µ, Σ) distribution for some µ ∈ R^p. The corresponding likelihood ratio, r_µ(Ψ) say (f/g in our earlier notation), is given by the ratio of the two multivariate normal densities and satisfies

r_µ(Ψ) = exp( -½ Ψ'Σ^{-1}Ψ ) / exp( -½ (Ψ - µ)'Σ^{-1}(Ψ - µ) ) = exp( -µ'Σ^{-1}Ψ + ½ µ'Σ^{-1}µ ).

How Do We Choose µ?

Recall that the quantity of interest is θ := P(L ≥ c) = E[P(L ≥ c | Ψ)]. We know from our earlier discussion of importance sampling that we would like to choose the importance sampling density, g*(Ψ) say, so that

g*(Ψ) ∝ P(L ≥ c | Ψ) exp( -½ Ψ'Σ^{-1}Ψ ).    (4)

Of course this is not possible, since we do not know P(L ≥ c | Ψ), the very quantity that we wish to estimate. The maximum principle applied to the MVN_p(µ, Σ) distribution would then suggest taking µ equal to the value of Ψ that maximizes the right-hand side of (4).
Again it is not possible to solve this problem exactly, as we do not know P(L ≥ c | Ψ), but numerical methods can be used to find good approximate solutions. See Glasserman and Li (2005) for further details.

The Importance Sampling Algorithm for Estimating θ = P(L ≥ c)

1. Generate Ψ_1, ..., Ψ_n independently from the MVN_p(µ, Σ) distribution.
2. For each Ψ_i, estimate P(L ≥ c | Ψ = Ψ_i) using the importance sampling distribution that we described in our discussion of independent default indicators. Let θ̂_{n_1}^{IS}(Ψ_i) be the corresponding estimator based on n_1 samples.
3. The full importance sampling estimator is then given by

θ̂_n^{IS} = (1/n) Σ_{i=1}^n r_µ(Ψ_i) θ̂_{n_1}^{IS}(Ψ_i).

3 Stratified Sampling

Consider a game show where contestants first pick a ball at random from an urn and then receive a payoff, Y. The payoff is random and depends on the color of the selected ball, so that if the color is c then Y is drawn from the PDF f_c. The urn contains red, green, blue and yellow balls, and each of the four colors is equally likely to be chosen. The producer of the game show would like to know how much a contestant will win on average when he plays the game. To answer this question, she decides to simulate the payoffs of n contestants and take their average payoff as her estimate. The payoff, Y, of each contestant is simulated as follows:

1. Simulate a random variable, I, where I is equally likely to take any of the four values r, g, b and y.
2. Simulate Y from the density f_I(y).

The average payoff, θ := E[Y], is then estimated by

θ̂_n := Σ_{j=1}^n Y_j / n.

Now suppose n = 1000, and that a red ball was chosen 246 times, a green ball 270 times, a blue ball 226 times and a yellow ball 258 times.

Question: Would this influence your confidence in θ̂_n? What if f_g tended to produce very high payoffs and f_b tended to produce very low payoffs? Is there anything that we could have done to avoid this type of problem occurring?

The answer is yes. We know that each ball color should be selected 1/4 of the time, so we could force this to be true by conducting four separate simulations, one each to estimate E[Y | I = c] for c = r, g, b, y. Note that

E[Y] = E[E[Y | I]] = (1/4) E[Y | I = r] + (1/4) E[Y | I = g] + (1/4) E[Y | I = b] + (1/4) E[Y | I = y]

so that an unbiased estimator of θ is obtained by setting

θ̂_{st,n} := (1/4) θ̂_{r,n_r} + (1/4) θ̂_{g,n_g} + (1/4) θ̂_{b,n_b} + (1/4) θ̂_{y,n_y}    (5)

where θ̂_{c,n_c} estimates θ_c := E[Y | I = c] for c = r, g, b, y.[12] How does the variance of θ̂_{st,n} compare with the variance of θ̂_n, the original raw simulation estimator? To answer this question, assume for now that n_c = n/4 for each c, and that Y_c is a sample from the density f_c. Then a fair[13] comparison of Var(θ̂_n) with Var(θ̂_{st,n}) should compare

Var(Y_1 + Y_2 + Y_3 + Y_4)  with  Var(Y_r + Y_g + Y_b + Y_y)    (6)

where Y_1, Y_2, Y_3 and Y_4 are IID samples from the original simulation algorithm (i.e., where we first select the ball randomly and then receive the payoff), and the Y_c's are independent with density f_c(·), for c = r, g, b, y. Now recall the conditional variance formula, which states

Var(Y) = E[Var(Y | I)] + Var(E[Y | I]).    (7)

Each term on the right-hand side of (7) is non-negative, so this implies
Var(Y) ≥ E[Var(Y | I)] = (1/4) Var(Y | I = r) + (1/4) Var(Y | I = g) + (1/4) Var(Y | I = b) + (1/4) Var(Y | I = y) = (1/4) Var(Y_r + Y_g + Y_b + Y_y),

which implies

Var(Y_1 + Y_2 + Y_3 + Y_4) = 4 Var(Y) ≥ Var(Y_r + Y_g + Y_b + Y_y).    (8)

As a result, we may conclude that using θ̂_{st,n} instead of θ̂_n leads to a variance reduction. This variance reduction will be substantial if I accounts for a large fraction of the variance of Y. Note also that the computational requirements for computing θ̂_{st,n} are similar[14] to those required for computing θ̂_n. We call θ̂_{st,n} a stratified sampling estimator of θ, and we say that I is the stratification variable.

[12] θ̂_{c,n_c} is an estimate of θ_c using n_c samples. θ̂_{st,n} is an estimate of θ using n samples, so it is implicitly assumed in (5) that n_r + n_g + n_b + n_y = n.
[13] Fair here means that each estimator is based on the same total number of samples.
[14] For this example, the stratified estimator will actually require less work, but it is also possible in general for it to require more work.
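The game-show comparison can be carried out numerically. In the sketch below the four payoff densities are illustrative normal distributions with common variance; the stratified estimator forces n/4 samples per color, and its estimated variance drops the Var(E[Y | I]) term, exactly as the analysis above predicts:

```python
import random, statistics

# Game-show example: four equally likely colors, payoff drawn from a
# color-specific density. Compare the raw estimator with the stratified
# estimator that forces n/4 samples per color. The normal payoff
# distributions are illustrative; the true mean is 2.5.

MEANS = {"r": 1.0, "g": 2.0, "b": 3.0, "y": 4.0}   # payoff means by color

def raw_estimate(n, rng):
    draws = []
    for _ in range(n):
        color = rng.choice("rgby")                  # pick a ball at random
        draws.append(rng.gauss(MEANS[color], 1.0))  # payoff Y ~ f_color
    return statistics.fmean(draws), statistics.variance(draws) / n

def stratified_estimate(n, rng):
    est, var = 0.0, 0.0
    for color in "rgby":                            # n/4 samples per stratum
        n_c = n // 4
        draws = [rng.gauss(MEANS[color], 1.0) for _ in range(n_c)]
        est += 0.25 * statistics.fmean(draws)
        var += 0.25 ** 2 * statistics.variance(draws) / n_c
    return est, var

rng = random.Random(3)
raw_mean, raw_var = raw_estimate(40000, rng)
st_mean, st_var = stratified_estimate(40000, rng)
# st_var should be clearly below raw_var: here Var(E[Y|I]) = 1.25 while
# E[Var(Y|I)] = 1, so stratification removes more than half the variance.
```

Both estimators are unbiased; only their variances differ.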

The Stratified Sampling Algorithm

We will now formally describe the stratified sampling algorithm. Suppose as usual that we wish to estimate θ := E[Y] where Y is a random variable. Let W be another random[15] variable that satisfies the following two conditions:

Condition 1: For any subset Δ ⊆ R, P(W ∈ Δ) can be easily computed.
Condition 2: It is easy to generate (Y | W ∈ Δ), i.e., Y given W ∈ Δ.

Now divide R into m non-overlapping subintervals, Δ_1, ..., Δ_m, such that p_j := P(W ∈ Δ_j) > 0 and Σ_{j=1}^m p_j = 1. Note that if W can take any value in R, then the first interval should be [-∞, a] and the final interval should be [b, ∞] for some finite a and b.

Notation

1. Let θ_j := E[Y | W ∈ Δ_j] and σ_j^2 := Var(Y | W ∈ Δ_j).
2. We define the random variable I by setting I := j if W ∈ Δ_j.
3. Let Y^(j) denote a random variable with the same distribution as (Y | W ∈ Δ_j) ≡ (Y | I = j).

Our notation then implies θ_j = E[Y | I = j] = E[Y^(j)] and σ_j^2 = Var(Y | I = j) = Var(Y^(j)). In particular we have

θ = E[Y] = E[E[Y | I]] = p_1 E[Y | I = 1] + ... + p_m E[Y | I = m] = p_1 θ_1 + ... + p_m θ_m.

Note that to estimate θ we only need to estimate the θ_j's since, by Condition 1 above, the p_j's are easily computed. Furthermore, we know how to estimate the θ_j's by Condition 2. If we use n_j samples to estimate θ_j, then an estimate of θ is given by

θ̂_{st,n} = p_1 θ̂_{1,n_1} + ... + p_m θ̂_{m,n_m}.

It is clear that θ̂_{st,n} will be unbiased if, for each j, θ̂_{j,n_j} is an unbiased estimate of θ_j.

Obtaining a Variance Reduction

How does the stratification estimator compare with the usual raw simulation estimator? As was the case with the game-show example, to answer this question we would like to compare Var(θ̂_n) with Var(θ̂_{st,n}). First we need to choose n_1, ..., n_m such that n_1 + ... + n_m = n. That is, we need to determine the number of samples, n_j, that will be used to estimate each θ_j, but in such a way that the total number of samples is equal to n.
Clearly, the optimal approach would be to choose the n_j's so as to minimize Var(θ̂_{st,n}). Consider for now, however, the sub-optimal allocation where we set n_j := n p_j for j = 1, ..., m. Then

Var(θ̂_{st,n}) = Var(p_1 θ̂_{1,n_1} + ... + p_m θ̂_{m,n_m}) = p_1^2 σ_1^2/n_1 + ... + p_m^2 σ_m^2/n_m = Σ_{j=1}^m p_j σ_j^2 / n.

On the other hand, the usual simulation estimator has variance σ^2/n where σ^2 := Var(Y). Therefore, we need only show that Σ_{j=1}^m p_j σ_j^2 ≤ σ^2 to prove that the non-optimized[16] stratification estimator has a lower variance

[15] Y and W must be dependent to achieve a variance reduction.
[16] The optimized stratification estimator refers to the estimator where we choose the n_j's to minimize the variance of θ̂_{st,n}.

than the usual raw estimator.[17] The proof that Σ_{j=1}^m p_j σ_j^2 ≤ σ^2 is precisely the same as that used for the game-show example. In particular, equation (7) implies

σ^2 = Var(Y) ≥ E[Var(Y | I)] = Σ_{j=1}^m p_j σ_j^2

and the proof is complete!

Optimizing the Stratified Estimator

We know

θ̂_{st,n} = p_1 (1/n_1) Σ_{i=1}^{n_1} Y_i^(1) + ... + p_m (1/n_m) Σ_{i=1}^{n_m} Y_i^(m)

where, for a fixed j, the Y_i^(j)'s are IID copies of Y^(j). This then implies

Var(θ̂_{st,n}) = p_1^2 σ_1^2/n_1 + ... + p_m^2 σ_m^2/n_m = Σ_{j=1}^m p_j^2 σ_j^2 / n_j.    (9)

Therefore, to minimize Var(θ̂_{st,n}) we must solve the following constrained optimization problem:

min_{n_j} Σ_{j=1}^m p_j^2 σ_j^2 / n_j  subject to  n_1 + ... + n_m = n.    (10)

We can easily solve (10) using a Lagrange multiplier. The optimal solution is given by

n_j* = [ p_j σ_j / Σ_{k=1}^m p_k σ_k ] n    (11)

and the minimized variance is given by Var(θ̂_{st,n}) = ( Σ_{j=1}^m p_j σ_j )^2 / n. Note that the solution in (11) makes intuitive sense: if p_j is large, then, other things being equal, it makes sense to expend more effort simulating from stratum j, i.e., the region where W ∈ Δ_j. Similarly, if σ_j^2 is large then, other things again being equal, it makes sense to simulate more often from stratum j so as to get a more accurate estimate of θ_j.

Remark 1  It is interesting to note the following connection between stratified sampling and importance sampling. We know that when we importance sample, we like to sample more often from the important region. The choice of n_j in (11) also means that we simulate more often from the important region when we use optimized stratified sampling.

Remark 2  Note also the connection between stratified sampling and conditional Monte-Carlo. Both methods rely on the conditional variance formula to prove that they lead to a variance reduction. The difference between the two methods can best be explained as follows. Suppose we wish to estimate θ := E[Y] using simulation, and we do this by first generating a random variable, W, and then generating Y given W.
In the conditional expectation method, we simulate W first, but then compute E[Y W ] analytically. In the stratified sampling method, we effectively generate W analytically, and then simulate Y given W. Advantages and Disadvantages of Stratified Sampling The obvious advantage of stratified sampling is that it leads to a variance reduction which can be very substantial if the stratification variable, W, accounts for a large fraction of the variance of Y. The main disadvantage of stratified sampling is that typically we do not know the σj s so it is impossible to compute the optimal n j s exactly. Of course we can overcome this problem by first doing m pilot simulations to estimate each σ j. If we let N p denote the total number of pilot simulations, then a good heuristic is to use N p /m runs for each individual pilot simulation. In order to obtain a reasonably good estimate of σj, a useful rule-of-thumb 17 The optimized stratification estimator would then of course achieve an even greater variance reduction.

is that N_p/m should be greater than 30. If m is large, however, and each simulation run is computationally expensive, then it may be the case that a lot of effort is expended in trying to estimate the optimal n_j's. One method of overcoming this problem is to abandon the pilot simulations and simply use the sub-optimal allocation where n_j = n p_j. We saw earlier that this allocation still results in a variance reduction, which can sometimes be substantial. In practice, both methods are used, and the decision to conduct pilot simulations should depend on the problem at hand. For example, if you have reason to believe that the σ_j's will not vary too much, then the optimal allocation and the sub-optimal allocation will be very similar, and it is probably not worth doing the pilot simulations. On the other hand, if the σ_j's vary considerably, then conducting the pilot runs may be worthwhile. Of course, a combination of the two is also possible, where only a subset of the pilot simulations is conducted.

The stratified simulation algorithm is given below. We assume that the pilot simulations have already been completed, or that it has been decided not to conduct them at all; either way, the n_j's have been computed. We also show how the estimate, θ̂_{st,n}, and the estimated variance, σ̂²_{st,n}, can be computed without having to store all the generated samples. That is, we simply keep track of Σ_i Y_i^{(j)} and Σ_i (Y_i^{(j)})² for each j, since these quantities are all that is required[18] to compute θ̂_{st,n} and σ̂²_{st,n}.

Stratified Simulation Algorithm for Estimating θ

set θ̂_{st,n} = 0; σ̂²_{st,n} = 0
for j = 1 to m
    set sum_j = 0; sum_squares_j = 0
    for i = 1 to n_j
        generate Y_i^{(j)}
        set sum_j = sum_j + Y_i^{(j)}
        set sum_squares_j = sum_squares_j + (Y_i^{(j)})²
    end for
    set θ̂_j = sum_j / n_j
    set σ̂²_j = (sum_squares_j − sum_j²/n_j) / (n_j − 1)
    set θ̂_{st,n} = θ̂_{st,n} + p_j θ̂_j
    set σ̂²_{st,n} = σ̂²_{st,n} + σ̂²_j p_j² / n_j
end for
set approx. 100(1−α)% CI = θ̂_{st,n} ± z_{1−α/2} σ̂_{st,n}

Financial Applications

Most of our financial examples relate to option pricing in a geometric Brownian motion (GBM) setting. Since GBM is a very poor model of asset price dynamics, these examples are not directly applicable in practice. However, the specific techniques and results that we discuss certainly are applicable. For example, we know that a multivariate t random vector can be generated using a multivariate normal random vector together with an independent chi-squared random variable. Therefore, the techniques we discuss below for multivariate normal random vectors can also be used[19] for multivariate t random vectors. They are also applicable more generally to

Footnote 18: Recall that θ̂_{st,n} = Σ_{j=1}^m p_j (Σ_{i=1}^{n_j} Y_i^{(j)})/n_j and Var(θ̂_{st,n}) = Σ_{j=1}^m p_j² Var((Σ_{i=1}^{n_j} Y_i^{(j)})/n_j), and each Var((Σ_{i=1}^{n_j} Y_i^{(j)})/n_j) can be estimated knowing just Σ_i Y_i^{(j)} and Σ_i (Y_i^{(j)})². As stated previously, any simulation study that requires a large number of samples should only keep track of these quantities, thereby avoiding the need to store every sample.
Footnote 19: See Chapter 9 of Glasserman (2004) for a risk management application in a multivariate t setting.
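As a concrete illustration, the algorithm above might be sketched in Python as follows. This is a toy sketch rather than code from these notes: it estimates θ = E[h(X)] for X ~ N(0, 1), stratifying on X itself with m equiprobable strata, estimating the σ_j's with pilot runs, and then applying the optimal allocation (11). The function name and default parameters are our own.

```python
import numpy as np
from scipy.stats import norm

def stratified_estimate(h, m=10, n=100_000, n_pilot=50, rng=None):
    """Stratified estimate of E[h(X)], X ~ N(0,1), using m equiprobable
    strata on X and the optimal allocation n_j proportional to p_j * sigma_j."""
    rng = np.random.default_rng(rng)
    p = np.full(m, 1.0 / m)  # equiprobable strata: p_j = 1/m

    def sample_stratum(j, size):
        # Stratum j corresponds to U in [j/m, (j+1)/m); X = Phi^{-1}(U).
        u = rng.uniform(j / m, (j + 1) / m, size)
        return h(norm.ppf(u))

    # Pilot runs (n_pilot per stratum, > 30 per the rule-of-thumb) to estimate sigma_j.
    sigma = np.array([sample_stratum(j, n_pilot).std(ddof=1) for j in range(m)])
    # Optimal allocation (11): n_j = n * p_j sigma_j / sum_k p_k sigma_k
    # (rounded, so sum of n_j is only approximately n).
    n_j = np.maximum(2, np.round(n * p * sigma / (p * sigma).sum()).astype(int))

    theta, var = 0.0, 0.0
    for j in range(m):
        y = sample_stratum(j, n_j[j])
        theta += p[j] * y.mean()
        var += p[j] ** 2 * y.var(ddof=1) / n_j[j]
    return theta, np.sqrt(var)  # estimate and its standard error
```

For example, with h(x) = x² the true value is E[X²] = 1, and the reported standard error should be noticeably smaller than the raw-estimator value of √(Var(X²)/n) = √(2/n).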

normal-mixture distributions, Gaussian and t copulas, as well as the simulation of SDEs. As stated earlier, Chapter 9 of Glasserman (2004) should be sufficient to convince any reader of the general applicability of stratified sampling (as well as importance sampling) to risk management.

Example 9 (Pricing a European Call Option)
Suppose that we wish to price a European call option where we assume, as usual, that S_t ~ GBM(r, σ²). Then

C_0 = E[e^{−rT} max(0, S_T − K)] = E[Y]

where Y = h(X) = e^{−rT} max(0, S_0 e^{(r − σ²/2)T + σ√T X} − K) for X ~ N(0, 1). While we know how to compute C_0 analytically, it is worthwhile seeing how we could estimate it using stratified simulation. Let W = X be our stratification variable. To see that we can stratify using this choice of W, note the following:

(1) Computing P(W ∈ Δ): For Δ ⊆ R, P(W ∈ Δ) can easily be computed. Indeed, if Δ = [a, b], then P(W ∈ Δ) = Φ(b) − Φ(a), where Φ(·) is the CDF of a standard normal random variable.

(2) Generating (Y | W ∈ Δ): (h(X) | X ∈ Δ) can easily be generated. We do this by first generating X̃ := (X | X ∈ Δ) and then taking h(X̃). We generate X̃ as follows. First note that if X ~ N(0, 1), then we can generate an X using the inverse transform method by setting X = Φ^{−1}(U). The problem with such an X is that it may not lie in Δ = [a, b]. However, we can overcome this problem by simply generating Ũ ~ U(Φ(a), Φ(b)) and then setting X̃ = Φ^{−1}(Ũ). It is then straightforward to check that X̃ ~ (X | X ∈ [a, b]).

It is therefore clear that we can estimate C_0 using X as a stratification variable.

Example 10 (Pricing an Asian Call Option)
Recall that the discounted payoff of an Asian call option is given by

Y := e^{−rT} max(0, (Σ_{i=1}^m S_{iT/m})/m − K)  (12)

and that its price is given by C_a = E[Y], where we assume S_t ~ GBM(r, σ²). Now each S_{iT/m} may be expressed as

S_{iT/m} = S_0 exp((r − σ²/2) iT/m + σ √(T/m) (X_1 + ... + X_i))  (13)

where the X_i's are IID N(0, 1). This means that we may then write C_a = E[h(X_1,..., X_m)], where the function h(·) is given implicitly by equations (12) and (13). So to estimate C_a using our standard simulation algorithm, we would simply generate sample values of h(X_1,..., X_m) and take their average as our estimate. We can also, however, estimate C_a using stratified sampling.[20] To do so, we must first choose a stratification variable, W. One possible choice would be to set W = X_j for some j. However, this is unlikely to capture much of the variability of h(X_1,..., X_m). A much better choice would be to set W = Σ_{j=1}^m X_j. Of course, we need to show that such a choice is possible. That is, we need to show that P(W ∈ Δ) is easily computed and that (Y | W ∈ Δ) is easily generated.

(1) Computing P(W ∈ Δ): Since X_1,..., X_m are IID N(0, 1), we immediately have that W ~ N(0, m). If Δ = [a, b], then

P(W ∈ Δ) = P(N(0, m) ∈ Δ) = P(a ≤ N(0, m) ≤ b)

Footnote 20: The method we now describe is also useful for pricing other path-dependent options. See Glasserman (2004) for further details.

= P(a/√m ≤ N(0, 1) ≤ b/√m) = Φ(b/√m) − Φ(a/√m).

Similarly, if Δ = [b, ∞), then P(W ∈ Δ) = 1 − Φ(b/√m), and if Δ = (−∞, a], then P(W ∈ Δ) = Φ(a/√m).

(2) Generating (Y | W ∈ Δ): We need two results from the theory of multivariate normal random variables. The first result is well known to us:

Result 1: Suppose X = (X_1,..., X_m) ~ MVN(0, Σ). If we wish to generate a sample vector X, we first generate Z ~ MVN(0, I_m)[21] and then set

X = C^T Z  (14)

where C^T C = Σ. One possibility, of course, is to let C be the Cholesky decomposition of Σ, but in fact any matrix C that satisfies C^T C = Σ will do.

Result 2: Let a = (a_1, a_2, ..., a_m) satisfy ||a|| = 1, i.e. a_1² + ... + a_m² = 1, and let Z = (Z_1,..., Z_m) ~ MVN(0, I_m). Then

((Z_1,..., Z_m) | Σ_{i=1}^m a_i Z_i = w) ~ MVN(w a^T, I_m − a^T a).

Therefore, to generate ((Z_1,..., Z_m) | Σ_{i=1}^m a_i Z_i = w), we just need to generate a vector, V, where V ~ MVN(w a^T, I_m − a^T a) = w a^T + MVN(0, I_m − a^T a). Generating such a V is very easy since (I_m − a^T a)^T (I_m − a^T a) = I_m − a^T a. That is, Σ̃^T Σ̃ = Σ̃ where Σ̃ := I_m − a^T a, so we can take C = Σ̃ in (14).

Now we can return to the problem of generating (Y | W ∈ Δ). Since Y = h(X_1,..., X_m), we can clearly generate (Y | W ∈ Δ) if we can generate [(X_1,..., X_m) | Σ_{i=1}^m X_i ∈ Δ]. To do this, suppose again that Δ = [a, b]. Then

[(X_1,..., X_m) | Σ_{i=1}^m X_i ∈ [a, b]] ≡ [(X_1,..., X_m) | (1/√m) Σ_{i=1}^m X_i ∈ [a/√m, b/√m]].

We can now generate [(X_1,..., X_m) | Σ_{i=1}^m X_i ∈ Δ] in two steps:

Step 1: Generate [(1/√m) Σ_{i=1}^m X_i | (1/√m) Σ_{i=1}^m X_i ∈ [a/√m, b/√m]]. This is easy to do since (1/√m) Σ_{i=1}^m X_i ~ N(0, 1), so we just need to generate (N(0, 1) | N(0, 1) ∈ [a/√m, b/√m]), which we can do using the method described in Example 9. Let w be the generated value.

Step 2: Now generate [(X_1,..., X_m) | (1/√m) Σ_{i=1}^m X_i = w], which we can do by the second result above and the comments that follow it.

Footnote 21: That is, we generate m IID N(0, 1) random variables.
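The two-step procedure of Example 10 can be sketched as follows. This is our own helper, assuming Δ = [a, b]; the arguments `a_lo` and `b_hi` denote the interval endpoints, renamed to avoid a clash with the unit vector a = (1,..., 1)/√m.

```python
import numpy as np
from scipy.stats import norm

def sample_X_given_sum_in(a_lo, b_hi, m, rng):
    """Generate (X_1,...,X_m) IID N(0,1) conditional on sum(X) in [a_lo, b_hi]:
    Step 1 draws w = sum(X)/sqrt(m) from a truncated standard normal (as in
    Example 9); Step 2 applies Result 2 to draw the vector given that sum."""
    a = np.ones(m) / np.sqrt(m)                      # unit vector, ||a|| = 1
    # Step 1: w ~ N(0,1) conditioned on [a_lo/sqrt(m), b_hi/sqrt(m)],
    # via the inverse transform on a uniform restricted to (Phi(lo), Phi(hi)).
    lo, hi = norm.cdf(a_lo / np.sqrt(m)), norm.cdf(b_hi / np.sqrt(m))
    w = norm.ppf(rng.uniform(lo, hi))
    # Step 2: (Z | a.Z = w) ~ MVN(w a, I - a a^T). Since I - a a^T is
    # symmetric and idempotent, we can use it directly as C in (14).
    C = np.eye(m) - np.outer(a, a)
    z = rng.standard_normal(m)
    return w * a + C @ z

rng = np.random.default_rng(0)
x = sample_X_given_sum_in(1.0, 2.0, m=5, rng=rng)
# sum(x) lies in [1, 2] by construction
```

A quick sanity check: a · (w a + C z) = w, so the scaled sum (1/√m) Σ X_i always equals the truncated-normal draw w from Step 1.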

Example 11 (Pricing a Barrier Option)
Recall again the problem of pricing an option that has payoff

h(X) = max(0, S_T − K_1) if S_{T/2} ≤ L, and h(X) = max(0, S_T − K_2) otherwise,

where X = (S_{T/2}, S_T). We can write the price of the option as

C_0 = E[e^{−rT} (max(0, S_T − K_1) I_{S_{T/2} ≤ L} + max(0, S_T − K_2) I_{S_{T/2} > L})]

where, as usual, we assume that S_t ~ GBM(r, σ²). Using conditional Monte-Carlo, we can write (why?) C_0 = E[Y] where

Y := e^{−rT/2} (c(S_{T/2}, T/2, K_1, r, σ) I_{S_{T/2} ≤ L} + c(S_{T/2}, T/2, K_2, r, σ) I_{S_{T/2} > L})  (15)

and where c(x, t, k, r, σ) is the price of a European call option with strike k, interest rate r, volatility σ, time to maturity t, and initial stock price x.

Question 1: Having conditioned on S_{T/2}, could we now also use stratified sampling?
Question 2: Could we use importance sampling?
Question 3: What about using importance sampling before doing the conditioning?

4 Simulating Stochastic Differential Equations

We now discuss the simulation of stochastic differential equations (SDEs). This is an important computational tool that is useful for the pricing of exotic securities as well as for evaluating model risk. Consider, for example, the down-and-out barrier option that was discussed in the Model Risk lecture notes. This option was priced using seven different models, all of which had been calibrated to the same implied volatility surface. Option prices for the seven different models were plotted in Figure 4 as a function of the barrier level, and all of these prices were obtained by simulating the underlying SDEs. Brigo et al. (2007) also discuss the simulation and estimation of SDEs from a risk-management perspective and provide Matlab code for their various algorithms. Brigo et al. (2007) and Glasserman (2004) can be consulted for further details and references.
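Before moving on, the conditional Monte-Carlo estimator (15) from Example 11 might be rendered as follows, with the Black-Scholes formula playing the role of c(x, t, k, r, σ). The function names, parameter values, and sample size below are our own choices for illustration.

```python
import numpy as np
from scipy.stats import norm

def bs_call(x, t, k, r, sigma):
    """Black-Scholes price c(x, t, k, r, sigma) of a European call."""
    d1 = (np.log(x / k) + (r + 0.5 * sigma**2) * t) / (sigma * np.sqrt(t))
    d2 = d1 - sigma * np.sqrt(t)
    return x * norm.cdf(d1) - k * np.exp(-r * t) * norm.cdf(d2)

def barrier_price_cmc(S0, K1, K2, L, r, sigma, T, n=100_000, seed=0):
    """Conditional MC for Example 11: simulate S_{T/2} under GBM(r, sigma^2),
    then average the conditional expectation Y in (15)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    S_half = S0 * np.exp((r - 0.5 * sigma**2) * T / 2
                         + sigma * np.sqrt(T / 2) * z)
    # Strike K1 applies when S_{T/2} <= L, strike K2 otherwise.
    y = np.where(S_half <= L,
                 bs_call(S_half, T / 2, K1, r, sigma),
                 bs_call(S_half, T / 2, K2, r, sigma))
    y = np.exp(-r * T / 2) * y   # discount [0, T/2]; c() discounts [T/2, T]
    return y.mean(), y.std(ddof=1) / np.sqrt(n)
```

Note that only the first half of the path is simulated; the second half has been integrated out analytically, which is exactly where the variance reduction comes from.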
Simulating a 1-Dimensional SDE: The Euler Scheme

Let us assume that we are faced with an SDE of the form

dX_t = µ(t, X_t) dt + σ(t, X_t) dW_t  (16)

and that we wish to simulate values of X_T but do not know its distribution. (This could be because we cannot solve (16) to obtain an explicit solution for X_T, or because we simply cannot determine the distribution of X_T even though we do know how to solve (16).) When we simulate an SDE, what we mean is that we simulate a discretized version of the SDE. In particular, we simulate a discretized process, {X̃_h, X̃_2h, ..., X̃_mh}, where m is the number of time steps, h is a constant, and mh = T. The smaller the value of h, the closer our discretized path will be to the continuous-time path we wish to simulate. Of course, this will be at the expense of greater computational effort. While there are a number of discretization schemes available, we will focus on the simplest and perhaps most common scheme, the Euler scheme. The Euler scheme is intuitive, easy to implement, and satisfies

X̃_kh = X̃_{(k−1)h} + µ((k−1)h, X̃_{(k−1)h}) h + σ((k−1)h, X̃_{(k−1)h}) √h Z_k  (17)

where the Z_k's are IID N(0, 1). If we want to estimate θ := E[f(X_T)] using the Euler scheme, then for a fixed number of paths, n, and discretization interval, h, we have the following algorithm.

Using the Euler Scheme to Estimate θ = E[f(X_T)] When X_t Follows a 1-Dimensional SDE

for j = 1 to n
    set t = 0; X̃ = X_0
    for k = 1 to m = T/h
        generate Z ~ N(0, 1)
        set X̃ = X̃ + µ(t, X̃) h + σ(t, X̃) √h Z
        set t = t + h
    end for
    set f_j = f(X̃)
end for
set θ̂_n = (f_1 + ... + f_n)/n
set σ̂²_n = Σ_{j=1}^n (f_j − θ̂_n)²/(n − 1)
set approx. 100(1−α)% CI = θ̂_n ± z_{1−α/2} σ̂_n/√n

Remark 3 Observe that even though we only care about X_T, we still need to generate the intermediate values, X̃_ih, if we are to minimize the discretization error. Because of this discretization error, θ̂_n is no longer an unbiased estimator of θ.

Remark 4 If we wished to estimate θ = E[f(X_{t_1},..., X_{t_p})], then in general we would need to keep track of (X̃_{t_1},..., X̃_{t_p}) inside the inner for-loop of the algorithm.

Exercise 1 Can you think of a derivative where the payoff depends on (X_{t_1},..., X_{t_p}), but where it would not be necessary to keep track of (X̃_{t_1},..., X̃_{t_p}) on each sample path?

Simulating a Multidimensional SDE

In the multidimensional case, X_t, W_t and µ(t, X_t) in (16) are now vectors, and σ(t, X_t) is a matrix. This situation arises when we have a system of SDEs in our model, which can occur in a number of financial engineering contexts. Some examples include:

(1) Modeling the evolution of multiple stocks. This might be necessary if we are trying to price derivatives whose values depend on multiple stocks or state variables, or if we are studying the properties of some portfolio strategy with multiple assets.

(2) Modeling the evolution of a single stock where we assume that the volatility of the stock is itself stochastic. Such a model is termed a stochastic volatility model.

(3) Modeling the evolution of interest rates. For example, if we assume that the short rate, r_t, is driven by a number of factors which themselves are stochastic and satisfy SDEs, then simulating r_t amounts to simulating the SDEs that drive the factors. Examples include the multi-factor Gaussian and CIR models. Such models also occur in HJM and LIBOR market models.

In all of these cases, whether or not we will have to simulate the SDEs will depend on the model in question and on the particular quantity that we wish to compute. If we do need to discretize the SDEs and simulate their discretized versions, then it is very straightforward. If there are n correlated Brownian motions driving the SDEs, then at each time step, t_i, we must generate n IID N(0, 1) random variables. We would then use the

Cholesky decomposition to generate X̃_{t_{i+1}}. This is exactly analogous to our method of generating correlated geometric Brownian motions. In the context of simulating multidimensional SDEs, however, it is more common to use independent Brownian motions, as any correlations between components of the vector, X_t, can be induced through the matrix, σ(t, X_t).

Applications to Financial Engineering

Example 12 (Option Pricing Under Stochastic Volatility)
Suppose the evolution of the stock price, S_t, under the risk-neutral probability measure is given by

dS_t = r S_t dt + √V_t S_t dW_t^{(1)}  (18)
dV_t = α (b − V_t) dt + σ √V_t dW_t^{(2)}.  (19)

If we want to price a European call option on the stock with expiration T and strike K, then the price is given by C_0 = exp(−rT) E[max(S_T − K, 0)]. We could estimate C_0 by simulating n sample paths of {S_t, V_t} up to time T, and taking the average of exp(−rT) max(S_T − K, 0) over the n paths as our estimated call option price, Ĉ_0.

Exercise 2 Write out the details of the algorithm that you would use to estimate C_0 in Example 12.

Remark 5 It is certainly worth mentioning that the Euler scheme can perform poorly in practice when applied to the Heston stochastic volatility model. See the discussion of implementation risk below.

Example 13 (The CIR Model with Time-Dependent Parameters)
We assume the Q-dynamics of the short rate, r_t, are given by

dr_t = α[µ(t) − r_t] dt + σ √r_t dW_t  (20)

where µ(t) is a deterministic function of time. This generalized CIR model is used when we want to fit a CIR-type model to the initial term structure. Suppose now that we wish to price a derivative security maturing at time T with payoff C_T(r_T). Then its time-0 price, C_0, is given by

C_0 = E_0[e^{−∫_0^T r_s ds} C_T(r_T)].  (21)

The distribution of r_T is not available in an easy-to-use closed form, so perhaps the easiest way to estimate C_0 is by simulating the dynamics of r_t. Towards this end, we could either use (20) and simulate r_t directly, or alternatively we could simulate X_t := f(r_t), where f(·) is an invertible transformation. Note that because of the discount factor in (21), it is also necessary to simulate the process, Y_t, given by Y_t = exp(−∫_0^t r_s ds).

Exercise 3 Describe in detail how you would estimate C_0 in Example 13. Note that there are alternative ways to do this. Which way do you prefer?

Exercise 4 Suppose we wish to simulate the known dynamics of a zero-coupon bond. How would you ensure that the simulated process satisfies 0 < Z_t^T < 1?
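One direct approach to Exercise 3 is an Euler scheme for (20) that accumulates the integrated short rate path by path. The sketch below is one possible implementation, not the notes' prescription: the function name is ours, negative excursions of r are floored at zero before square roots are taken (an ad hoc fix, discussed later under implementation risk), and ∫_0^t r_s ds is approximated by a left-endpoint rule.

```python
import numpy as np

def cir_price(payoff, alpha, sigma, mu, r0, T, m=200, n=50_000, seed=0):
    """Estimate C_0 = E[exp(-int_0^T r_s ds) * payoff(r_T)] by an Euler
    discretization of dr = alpha*(mu(t) - r) dt + sigma*sqrt(r) dW."""
    rng = np.random.default_rng(seed)
    h = T / m
    r = np.full(n, r0)
    integral = np.zeros(n)            # running left-rule estimate of int r_s ds
    for k in range(m):
        t = k * h
        integral += r * h
        z = rng.standard_normal(n)
        r = (r + alpha * (mu(t) - r) * h
               + sigma * np.sqrt(np.maximum(r, 0.0)) * np.sqrt(h) * z)
        r = np.maximum(r, 0.0)        # floor negative excursions at zero
    y = np.exp(-integral) * payoff(r)
    return y.mean(), y.std(ddof=1) / np.sqrt(n)
```

As a sanity check, taking payoff ≡ 1 prices a zero-coupon bond; with µ(t) ≡ r_0 and σ very small, the estimate should be close to e^{−r_0 T}.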

Variance Reduction Techniques for Simulating SDEs

Simulating SDEs is a computationally intensive task, as we need to do a lot of work for each sample that we generate. As a result, variance reduction techniques are often very useful in this context. We give one example based on stratified sampling and the Brownian bridge. Note that these ideas can be applied very generally to many different models and risk management applications.

Example 14 (The Brownian Bridge and Stratified Sampling)
Consider a short rate model of the form

dr_t = µ(t, r_t) dt + σ(t, r_t) dW_t.

When pricing a derivative that matures at time T using an Euler scheme, it is necessary to generate the path (W_h, W_2h, ..., W_mh = W_T). It will often be the case, however, that the value of W_T will be particularly significant in determining the payoff. As a result, we might want to stratify using the random variable W_T. This is easy to do for the following two reasons: (i) W_T ~ N(0, T), so we can easily generate a sample of W_T, and (ii) we can easily generate (W_h, W_2h, ..., W_{T−h} | W_T) by computing the relevant conditional distributions and then simulating from them. For example, it is straightforward to see that

(W_t | W_s = x, W_v = y) ~ N( ((v−t)x + (t−s)y)/(v−s), (v−t)(t−s)/(v−s) ) for s < t < v  (22)

and we can use this result to generate (W_h | W_0, W_T). More generally, we can use (22) to successively simulate (W_h | W_0, W_T), (W_2h | W_h, W_T), ..., (W_{T−h} | W_{T−2h}, W_T). We can in fact simulate the points on the sample path in any order we like. In particular, to simulate W_v we use (22) and condition on the two closest sample points before and after v, respectively, that have already been sampled. This method of pinning the beginning and end points of the Brownian motion is known as the Brownian bridge.

Exercise 5 If we are working with a multi-dimensional correlated Brownian motion, W_t, (e.g. in the context of a multi-factor model of the short rate), is it still easy to use the Brownian bridge construction where we first generate the random vector W_T?

Allocation of Computational Resources

An important issue that arises when simulating SDEs is the allocation of computational resources. In particular, we need to determine how many sample paths, n, to generate and how many steps, m, to simulate on each sample path. A smaller value of m will result in greater bias and numerical error, whereas a smaller value of n will result in greater statistical noise. If there is a fixed computational budget, then it is important to choose n and m in an optimal manner. We now discuss this issue.

Suppose dX_t = µ(t, X_t) dt + σ(t, X_t) dB_t and that we wish to estimate θ := E[f(X_T)] using an Euler approximation scheme. Let θ̂_n^h be an estimate of θ based upon simulating n sample paths using a total of m = T/h discretization points per path. In particular, we have

θ̂_n^h = (f(X̃_1^h) + ... + f(X̃_n^h))/n

where X̃_i^h is the value of X̃_T^h on the i-th path. Under some technical conditions on the process, X_t, it is known[22] that the Euler scheme has a weak order of convergence equal to 1, so that

|E[f(X_T)] − E[f(X̃_T^h)]| ≤ a m^{−1}  (23)

Footnote 22: See Glasserman (2004) for further details.

for some constant a and all m greater than some fixed m_0. Suppose now that we have a fixed computational budget, C, and that each simulation step costs c. We must therefore have n = C/(mc). We would like to choose the optimal values of m (and therefore n) as a function of C. We do this by minimizing the mean squared error (MSE), which is the sum of the squared bias and the variance, v/n, where v denotes the variance of f(X̃_T^h) on a single path. In particular, (23) implies

MSE ≈ a²/m² + v/n  (24)

for sufficiently large m. Substituting n = C/(mc) into (24), it is easy to see that it is optimal (for sufficiently large C) to take

m ∝ C^{1/3}  (25)
n ∝ C^{2/3}.  (26)

When it comes to estimating θ, (25) and (26) provide guidance as follows. We begin by using n_0 paths and m_0 discretization points per path to compute an initial estimate, θ̂_0, of θ. If we then compute a new estimate, θ̂_1, by setting m_1 = 2m_0, then (25) and (26) suggest we should set n_1 = 4n_0. We may then continue to compute new estimates, θ̂_i, in this manner until the estimates and their associated confidence intervals converge. In general, if we increase m by a factor of 2, then we should increase n by a factor of 4. Although estimating θ in this way requires additional computational resources, it is not usually necessary to perform more than two or three iterations, provided we begin with sufficiently large values of m_0 and n_0.

Implementation Risk

There are other important issues that arise when simulating SDEs. For example, while we have only described the Euler scheme, there are other, more sophisticated discretization schemes that can also be used. They include, for example, predictor-corrector schemes and the Milstein scheme. In a sense that we will not define, these schemes have superior convergence properties to the Euler scheme. However, they are sometimes more difficult to implement, particularly in the multi-dimensional setting.

It is also certainly worth mentioning that the Euler scheme can perform poorly in practice when applied to the Heston stochastic volatility[23] model of Example 12. In particular, depending on the model parameters, α, b and σ, it can perform poorly when applied to the variance process in (19), despite the fact that theoretical convergence is guaranteed. For example, Andersen reports results for pricing an at-the-money 10-year call option when r = q = 0. He takes α = 0.5, V_0 = b = 0.04, σ = 0.1, S_0 = 100 and ρ := Corr(W_t^{(1)}, W_t^{(2)}) = 0.9. Using one million sample paths and a sticky zero or reflection assumption[24], he obtains the estimates displayed in Table 1 for the option price as a function of m, the number of discretization points.

[Table 1: Call Option Price Estimates Using Euler Scheme in Heston's Stochastic Volatility Model. Columns: Time Steps, Sticky Zero, Reflection. The numerical entries did not survive transcription.]

One therefore needs to be very careful when applying an Euler scheme to this SDE. Note that these convergence problems would be easily identified if we followed the procedure outlined in the paragraph immediately following (26). But for this particular process, one should use a better scheme, such as that proposed by Andersen (2007).

Footnote 23: The volatility process in the Heston model is also known as the CIR model in term-structure applications.
Footnote 24: The sticky zero assumption simply means that any time the variance process, V_t, goes negative in the Monte-Carlo, it is replaced by 0. The reflection assumption replaces V_t with |V_t|. In the limit as m → ∞, the variance will stay non-negative with probability 1, so both assumptions are unnecessary in the limit.
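A stripped-down version of such an experiment is sketched below. This is our own code, with arbitrary parameters rather than Andersen's values and far fewer paths; the `fix` argument switches between the sticky zero and reflection repairs of a negative variance, and re-running across several values of m exposes the scheme's sensitivity.

```python
import numpy as np

def heston_call_euler(S0, K, r, V0, alpha, b, sigma, rho, T,
                      m, n=100_000, fix="sticky", seed=0):
    """Euler scheme for the Heston model (18)-(19), simulating log S and V,
    repairing negative variance via 'sticky' (V -> max(V,0)) or
    'reflection' (V -> |V|)."""
    rng = np.random.default_rng(seed)
    h = T / m
    logS = np.full(n, np.log(S0))
    V = np.full(n, V0)
    for _ in range(m):
        z1 = rng.standard_normal(n)
        z2 = rho * z1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)
        logS += (r - 0.5 * V) * h + np.sqrt(V * h) * z1
        V += alpha * (b - V) * h + sigma * np.sqrt(V * h) * z2
        V = np.maximum(V, 0.0) if fix == "sticky" else np.abs(V)
    payoff = np.maximum(np.exp(logS) - K, 0.0)
    return np.exp(-r * T) * payoff.mean()
```

Comparing `fix="sticky"` against `fix="reflection"` for small m illustrates, qualitatively, the kind of discrepancy discussed above: the two repairs can disagree noticeably even though both converge as m grows.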


More information

IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem

IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem Time on my hands: Coin tosses. Problem Formulation: Suppose that I have

More information

Gaussian Conjugate Prior Cheat Sheet

Gaussian Conjugate Prior Cheat Sheet Gaussian Conjugate Prior Cheat Sheet Tom SF Haines 1 Purpose This document contains notes on how to handle the multivariate Gaussian 1 in a Bayesian setting. It focuses on the conjugate prior, its Bayesian

More information

5 Numerical Differentiation

5 Numerical Differentiation D. Levy 5 Numerical Differentiation 5. Basic Concepts This chapter deals with numerical approximations of derivatives. The first questions that comes up to mind is: why do we need to approximate derivatives

More information

Inner Product Spaces

Inner Product Spaces Math 571 Inner Product Spaces 1. Preliminaries An inner product space is a vector space V along with a function, called an inner product which associates each pair of vectors u, v with a scalar u, v, and

More information

Example 1. Consider the following two portfolios: 2. Buy one c(s(t), 20, τ, r) and sell one c(s(t), 10, τ, r).

Example 1. Consider the following two portfolios: 2. Buy one c(s(t), 20, τ, r) and sell one c(s(t), 10, τ, r). Chapter 4 Put-Call Parity 1 Bull and Bear Financial analysts use words such as bull and bear to describe the trend in stock markets. Generally speaking, a bull market is characterized by rising prices.

More information

Mathematical Finance

Mathematical Finance Mathematical Finance Option Pricing under the Risk-Neutral Measure Cory Barnes Department of Mathematics University of Washington June 11, 2013 Outline 1 Probability Background 2 Black Scholes for European

More information

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution SF2940: Probability theory Lecture 8: Multivariate Normal Distribution Timo Koski 24.09.2015 Timo Koski Matematisk statistik 24.09.2015 1 / 1 Learning outcomes Random vectors, mean vector, covariance matrix,

More information

Pricing of an Exotic Forward Contract

Pricing of an Exotic Forward Contract Pricing of an Exotic Forward Contract Jirô Akahori, Yuji Hishida and Maho Nishida Dept. of Mathematical Sciences, Ritsumeikan University 1-1-1 Nojihigashi, Kusatsu, Shiga 525-8577, Japan E-mail: {akahori,

More information

LogNormal stock-price models in Exams MFE/3 and C/4

LogNormal stock-price models in Exams MFE/3 and C/4 Making sense of... LogNormal stock-price models in Exams MFE/3 and C/4 James W. Daniel Austin Actuarial Seminars http://www.actuarialseminars.com June 26, 2008 c Copyright 2007 by James W. Daniel; reproduction

More information

3. Monte Carlo Simulations. Math6911 S08, HM Zhu

3. Monte Carlo Simulations. Math6911 S08, HM Zhu 3. Monte Carlo Simulations Math6911 S08, HM Zhu References 1. Chapters 4 and 8, Numerical Methods in Finance. Chapters 17.6-17.7, Options, Futures and Other Derivatives 3. George S. Fishman, Monte Carlo:

More information

e.g. arrival of a customer to a service station or breakdown of a component in some system.

e.g. arrival of a customer to a service station or breakdown of a component in some system. Poisson process Events occur at random instants of time at an average rate of λ events per second. e.g. arrival of a customer to a service station or breakdown of a component in some system. Let N(t) be

More information

Sensitivity Analysis 3.1 AN EXAMPLE FOR ANALYSIS

Sensitivity Analysis 3.1 AN EXAMPLE FOR ANALYSIS Sensitivity Analysis 3 We have already been introduced to sensitivity analysis in Chapter via the geometry of a simple example. We saw that the values of the decision variables and those of the slack and

More information

Hypothesis Testing for Beginners

Hypothesis Testing for Beginners Hypothesis Testing for Beginners Michele Piffer LSE August, 2011 Michele Piffer (LSE) Hypothesis Testing for Beginners August, 2011 1 / 53 One year ago a friend asked me to put down some easy-to-read notes

More information

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #15 Special Distributions-VI Today, I am going to introduce

More information

INTEREST RATES AND FX MODELS

INTEREST RATES AND FX MODELS INTEREST RATES AND FX MODELS 8. Portfolio greeks Andrew Lesniewski Courant Institute of Mathematical Sciences New York University New York March 27, 2013 2 Interest Rates & FX Models Contents 1 Introduction

More information

MATH 4330/5330, Fourier Analysis Section 11, The Discrete Fourier Transform

MATH 4330/5330, Fourier Analysis Section 11, The Discrete Fourier Transform MATH 433/533, Fourier Analysis Section 11, The Discrete Fourier Transform Now, instead of considering functions defined on a continuous domain, like the interval [, 1) or the whole real line R, we wish

More information

Lecture 7: Continuous Random Variables

Lecture 7: Continuous Random Variables Lecture 7: Continuous Random Variables 21 September 2005 1 Our First Continuous Random Variable The back of the lecture hall is roughly 10 meters across. Suppose it were exactly 10 meters, and consider

More information

Article from: Risk Management. June 2009 Issue 16

Article from: Risk Management. June 2009 Issue 16 Article from: Risk Management June 2009 Issue 16 CHAIRSPERSON S Risk quantification CORNER Structural Credit Risk Modeling: Merton and Beyond By Yu Wang The past two years have seen global financial markets

More information

How To Price A Call Option

How To Price A Call Option Now by Itô s formula But Mu f and u g in Ū. Hence τ θ u(x) =E( Mu(X) ds + u(x(τ θ))) 0 τ θ u(x) E( f(x) ds + g(x(τ θ))) = J x (θ). 0 But since u(x) =J x (θ ), we consequently have u(x) =J x (θ ) = min

More information

Final Mathematics 5010, Section 1, Fall 2004 Instructor: D.A. Levin

Final Mathematics 5010, Section 1, Fall 2004 Instructor: D.A. Levin Final Mathematics 51, Section 1, Fall 24 Instructor: D.A. Levin Name YOU MUST SHOW YOUR WORK TO RECEIVE CREDIT. A CORRECT ANSWER WITHOUT SHOWING YOUR REASONING WILL NOT RECEIVE CREDIT. Problem Points Possible

More information

Chapter 3 RANDOM VARIATE GENERATION

Chapter 3 RANDOM VARIATE GENERATION Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.

More information

General Sampling Methods

General Sampling Methods General Sampling Methods Reference: Glasserman, 2.2 and 2.3 Claudio Pacati academic year 2016 17 1 Inverse Transform Method Assume U U(0, 1) and let F be the cumulative distribution function of a distribution

More information

Variance Reduction. Pricing American Options. Monte Carlo Option Pricing. Delta and Common Random Numbers

Variance Reduction. Pricing American Options. Monte Carlo Option Pricing. Delta and Common Random Numbers Variance Reduction The statistical efficiency of Monte Carlo simulation can be measured by the variance of its output If this variance can be lowered without changing the expected value, fewer replications

More information

Computational Finance Options

Computational Finance Options 1 Options 1 1 Options Computational Finance Options An option gives the holder of the option the right, but not the obligation to do something. Conversely, if you sell an option, you may be obliged to

More information

Chapter 4 Lecture Notes

Chapter 4 Lecture Notes Chapter 4 Lecture Notes Random Variables October 27, 2015 1 Section 4.1 Random Variables A random variable is typically a real-valued function defined on the sample space of some experiment. For instance,

More information

Practice problems for Homework 11 - Point Estimation

Practice problems for Homework 11 - Point Estimation Practice problems for Homework 11 - Point Estimation 1. (10 marks) Suppose we want to select a random sample of size 5 from the current CS 3341 students. Which of the following strategies is the best:

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 14 10/27/2008 MOMENT GENERATING FUNCTIONS

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 14 10/27/2008 MOMENT GENERATING FUNCTIONS MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 14 10/27/2008 MOMENT GENERATING FUNCTIONS Contents 1. Moment generating functions 2. Sum of a ranom number of ranom variables 3. Transforms

More information

INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition)

INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition) INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition) Abstract Indirect inference is a simulation-based method for estimating the parameters of economic models. Its

More information

Math 4310 Handout - Quotient Vector Spaces

Math 4310 Handout - Quotient Vector Spaces Math 4310 Handout - Quotient Vector Spaces Dan Collins The textbook defines a subspace of a vector space in Chapter 4, but it avoids ever discussing the notion of a quotient space. This is understandable

More information

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction CA200 Quantitative Analysis for Business Decisions File name: CA200_Section_04A_StatisticsIntroduction Table of Contents 4. Introduction to Statistics... 1 4.1 Overview... 3 4.2 Discrete or continuous

More information

Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

More information

Finite Differences Schemes for Pricing of European and American Options

Finite Differences Schemes for Pricing of European and American Options Finite Differences Schemes for Pricing of European and American Options Margarida Mirador Fernandes IST Technical University of Lisbon Lisbon, Portugal November 009 Abstract Starting with the Black-Scholes

More information

4. Continuous Random Variables, the Pareto and Normal Distributions

4. Continuous Random Variables, the Pareto and Normal Distributions 4. Continuous Random Variables, the Pareto and Normal Distributions A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random

More information

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4) Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume

More information

Master s Theory Exam Spring 2006

Master s Theory Exam Spring 2006 Spring 2006 This exam contains 7 questions. You should attempt them all. Each question is divided into parts to help lead you through the material. You should attempt to complete as much of each problem

More information

CS 522 Computational Tools and Methods in Finance Robert Jarrow Lecture 1: Equity Options

CS 522 Computational Tools and Methods in Finance Robert Jarrow Lecture 1: Equity Options CS 5 Computational Tools and Methods in Finance Robert Jarrow Lecture 1: Equity Options 1. Definitions Equity. The common stock of a corporation. Traded on organized exchanges (NYSE, AMEX, NASDAQ). A common

More information

The Heston Model. Hui Gong, UCL http://www.homepages.ucl.ac.uk/ ucahgon/ May 6, 2014

The Heston Model. Hui Gong, UCL http://www.homepages.ucl.ac.uk/ ucahgon/ May 6, 2014 Hui Gong, UCL http://www.homepages.ucl.ac.uk/ ucahgon/ May 6, 2014 Generalized SV models Vanilla Call Option via Heston Itô s lemma for variance process Euler-Maruyama scheme Implement in Excel&VBA 1.

More information

6.041/6.431 Spring 2008 Quiz 2 Wednesday, April 16, 7:30-9:30 PM. SOLUTIONS

6.041/6.431 Spring 2008 Quiz 2 Wednesday, April 16, 7:30-9:30 PM. SOLUTIONS 6.4/6.43 Spring 28 Quiz 2 Wednesday, April 6, 7:3-9:3 PM. SOLUTIONS Name: Recitation Instructor: TA: 6.4/6.43: Question Part Score Out of 3 all 36 2 a 4 b 5 c 5 d 8 e 5 f 6 3 a 4 b 6 c 6 d 6 e 6 Total

More information

Hedging Illiquid FX Options: An Empirical Analysis of Alternative Hedging Strategies

Hedging Illiquid FX Options: An Empirical Analysis of Alternative Hedging Strategies Hedging Illiquid FX Options: An Empirical Analysis of Alternative Hedging Strategies Drazen Pesjak Supervised by A.A. Tsvetkov 1, D. Posthuma 2 and S.A. Borovkova 3 MSc. Thesis Finance HONOURS TRACK Quantitative

More information

More Exotic Options. 1 Barrier Options. 2 Compound Options. 3 Gap Options

More Exotic Options. 1 Barrier Options. 2 Compound Options. 3 Gap Options More Exotic Options 1 Barrier Options 2 Compound Options 3 Gap Options More Exotic Options 1 Barrier Options 2 Compound Options 3 Gap Options Definition; Some types The payoff of a Barrier option is path

More information

The Black-Scholes pricing formulas

The Black-Scholes pricing formulas The Black-Scholes pricing formulas Moty Katzman September 19, 2014 The Black-Scholes differential equation Aim: Find a formula for the price of European options on stock. Lemma 6.1: Assume that a stock

More information

MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...

MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators... MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 2009-2016 by Jenny A. Baglivo. All Rights Reserved. Contents 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................

More information

Lectures. Sergei Fedotov. 20912 - Introduction to Financial Mathematics. No tutorials in the first week

Lectures. Sergei Fedotov. 20912 - Introduction to Financial Mathematics. No tutorials in the first week Lectures Sergei Fedotov 20912 - Introduction to Financial Mathematics No tutorials in the first week Sergei Fedotov (University of Manchester) 20912 2010 1 / 1 Lecture 1 1 Introduction Elementary economics

More information

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 18. A Brief Introduction to Continuous Probability

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 18. A Brief Introduction to Continuous Probability CS 7 Discrete Mathematics and Probability Theory Fall 29 Satish Rao, David Tse Note 8 A Brief Introduction to Continuous Probability Up to now we have focused exclusively on discrete probability spaces

More information

WHERE DOES THE 10% CONDITION COME FROM?

WHERE DOES THE 10% CONDITION COME FROM? 1 WHERE DOES THE 10% CONDITION COME FROM? The text has mentioned The 10% Condition (at least) twice so far: p. 407 Bernoulli trials must be independent. If that assumption is violated, it is still okay

More information

The Exponential Distribution

The Exponential Distribution 21 The Exponential Distribution From Discrete-Time to Continuous-Time: In Chapter 6 of the text we will be considering Markov processes in continuous time. In a sense, we already have a very good understanding

More information

2WB05 Simulation Lecture 8: Generating random variables

2WB05 Simulation Lecture 8: Generating random variables 2WB05 Simulation Lecture 8: Generating random variables Marko Boon http://www.win.tue.nl/courses/2wb05 January 7, 2013 Outline 2/36 1. How do we generate random variables? 2. Fitting distributions Generating

More information

Lecture 6 Black-Scholes PDE

Lecture 6 Black-Scholes PDE Lecture 6 Black-Scholes PDE Lecture Notes by Andrzej Palczewski Computational Finance p. 1 Pricing function Let the dynamics of underlining S t be given in the risk-neutral measure Q by If the contingent

More information

Lecture 12: The Black-Scholes Model Steven Skiena. http://www.cs.sunysb.edu/ skiena

Lecture 12: The Black-Scholes Model Steven Skiena. http://www.cs.sunysb.edu/ skiena Lecture 12: The Black-Scholes Model Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794 4400 http://www.cs.sunysb.edu/ skiena The Black-Scholes-Merton Model

More information

Nonparametric adaptive age replacement with a one-cycle criterion

Nonparametric adaptive age replacement with a one-cycle criterion Nonparametric adaptive age replacement with a one-cycle criterion P. Coolen-Schrijner, F.P.A. Coolen Department of Mathematical Sciences University of Durham, Durham, DH1 3LE, UK e-mail: Pauline.Schrijner@durham.ac.uk

More information

1 Prior Probability and Posterior Probability

1 Prior Probability and Posterior Probability Math 541: Statistical Theory II Bayesian Approach to Parameter Estimation Lecturer: Songfeng Zheng 1 Prior Probability and Posterior Probability Consider now a problem of statistical inference in which

More information

CHAPTER 2 Estimating Probabilities

CHAPTER 2 Estimating Probabilities CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

More information

Lecture. S t = S t δ[s t ].

Lecture. S t = S t δ[s t ]. Lecture In real life the vast majority of all traded options are written on stocks having at least one dividend left before the date of expiration of the option. Thus the study of dividends is important

More information

Lecture 6: Discrete & Continuous Probability and Random Variables

Lecture 6: Discrete & Continuous Probability and Random Variables Lecture 6: Discrete & Continuous Probability and Random Variables D. Alex Hughes Math Camp September 17, 2015 D. Alex Hughes (Math Camp) Lecture 6: Discrete & Continuous Probability and Random September

More information

Random variables, probability distributions, binomial random variable

Random variables, probability distributions, binomial random variable Week 4 lecture notes. WEEK 4 page 1 Random variables, probability distributions, binomial random variable Eample 1 : Consider the eperiment of flipping a fair coin three times. The number of tails that

More information

1. (First passage/hitting times/gambler s ruin problem:) Suppose that X has a discrete state space and let i be a fixed state. Let

1. (First passage/hitting times/gambler s ruin problem:) Suppose that X has a discrete state space and let i be a fixed state. Let Copyright c 2009 by Karl Sigman 1 Stopping Times 1.1 Stopping Times: Definition Given a stochastic process X = {X n : n 0}, a random time τ is a discrete random variable on the same probability space as

More information

Stock Price Dynamics, Dividends and Option Prices with Volatility Feedback

Stock Price Dynamics, Dividends and Option Prices with Volatility Feedback Stock Price Dynamics, Dividends and Option Prices with Volatility Feedback Juho Kanniainen Tampere University of Technology New Thinking in Finance 12 Feb. 2014, London Based on J. Kanniainen and R. Piche,

More information

A SNOWBALL CURRENCY OPTION

A SNOWBALL CURRENCY OPTION J. KSIAM Vol.15, No.1, 31 41, 011 A SNOWBALL CURRENCY OPTION GYOOCHEOL SHIM 1 1 GRADUATE DEPARTMENT OF FINANCIAL ENGINEERING, AJOU UNIVERSITY, SOUTH KOREA E-mail address: gshim@ajou.ac.kr ABSTRACT. I introduce

More information

Valuation of the Surrender Option Embedded in Equity-Linked Life Insurance. Brennan Schwartz (1976,1979) Brennan Schwartz

Valuation of the Surrender Option Embedded in Equity-Linked Life Insurance. Brennan Schwartz (1976,1979) Brennan Schwartz Valuation of the Surrender Option Embedded in Equity-Linked Life Insurance Brennan Schwartz (976,979) Brennan Schwartz 04 2005 6. Introduction Compared to traditional insurance products, one distinguishing

More information

Section 5.1 Continuous Random Variables: Introduction

Section 5.1 Continuous Random Variables: Introduction Section 5. Continuous Random Variables: Introduction Not all random variables are discrete. For example:. Waiting times for anything (train, arrival of customer, production of mrna molecule from gene,

More information

Likewise, the payoff of the better-of-two note may be decomposed as follows: Price of gold (US$/oz) 375 400 425 450 475 500 525 550 575 600 Oil price

Likewise, the payoff of the better-of-two note may be decomposed as follows: Price of gold (US$/oz) 375 400 425 450 475 500 525 550 575 600 Oil price Exchange Options Consider the Double Index Bull (DIB) note, which is suited to investors who believe that two indices will rally over a given term. The note typically pays no coupons and has a redemption

More information

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 17 Shannon-Fano-Elias Coding and Introduction to Arithmetic Coding

More information

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution SF2940: Probability theory Lecture 8: Multivariate Normal Distribution Timo Koski 24.09.2014 Timo Koski () Mathematisk statistik 24.09.2014 1 / 75 Learning outcomes Random vectors, mean vector, covariance

More information

Aggregate Loss Models

Aggregate Loss Models Aggregate Loss Models Chapter 9 Stat 477 - Loss Models Chapter 9 (Stat 477) Aggregate Loss Models Brian Hartman - BYU 1 / 22 Objectives Objectives Individual risk model Collective risk model Computing

More information

Notes on Continuous Random Variables

Notes on Continuous Random Variables Notes on Continuous Random Variables Continuous random variables are random quantities that are measured on a continuous scale. They can usually take on any value over some interval, which distinguishes

More information

7: The CRR Market Model

7: The CRR Market Model Ben Goldys and Marek Rutkowski School of Mathematics and Statistics University of Sydney MATH3075/3975 Financial Mathematics Semester 2, 2015 Outline We will examine the following issues: 1 The Cox-Ross-Rubinstein

More information

Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh

Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh Peter Richtárik Week 3 Randomized Coordinate Descent With Arbitrary Sampling January 27, 2016 1 / 30 The Problem

More information