Notes for Math 450 Lecture Notes 3


Notes for Math 450 Lecture Notes 3
Renato Feres

1 Moments of Random Variables

We introduce some of the standard parameters associated to a random variable. The two most common are the expected value and the variance. These are special cases of moments of a probability distribution. Before defining these quantities, it may be helpful to recall some basic concepts associated to random variables.

1.1 Discrete and continuous random variables

A random variable X is said to be discrete if, with probability one, it can take only a finite or countably infinite number of possible values. That is, there is a set {x_1, x_2, ...} ⊂ R such that

Σ_{k=1}^∞ P(X = x_k) = 1.

X is a continuous random variable if there exists a function f_X : R → [0, ∞) (not necessarily continuous) such that, for all x,

P(X ≤ x) = ∫_{-∞}^x f_X(s) ds.

The probability density function, or PDF, f_X(x), must satisfy:

1. f_X(x) ≥ 0 for all x ∈ R;
2. ∫_{-∞}^∞ f_X(x) dx = 1;
3. P(a ≤ X ≤ b) = ∫_a^b f_X(x) dx for any a ≤ b.

If f_X(x) is continuous, it follows from item 3 and the fundamental theorem of calculus that

f_X(x) = lim_{h→0} P(x ≤ X ≤ x + h) / h.

The cumulative distribution function of X is defined (both for continuous and discrete random variables) as

F_X(x) = P(X ≤ x), for all x.

In terms of the probability density function, F_X(x) takes the form

F_X(x) = P(X ≤ x) = ∫_{-∞}^x f_X(z) dz.

Proposition 1.1 (PDF of a linear transformation) Let X be a continuous random variable with PDF f_X(x) and cumulative distribution function F_X(x), and let Y = aX + b with a ≠ 0. Then the PDF of Y is given by

f_Y(y) = (1/|a|) f_X((y - b)/a).

Proof. First assume that a > 0. Since Y ≤ y if and only if X ≤ (y - b)/a, we have

F_Y(y) = P(Y ≤ y) = P(X ≤ (y - b)/a) = F_X((y - b)/a).

Differentiating both sides, we find f_Y(y) = (1/a) f_X((y - b)/a). The case a < 0 is left as an exercise.

This proposition is a special case of the following.

Proposition 1.2 (PDF of a transformation of X) Let X be a continuous random variable with PDF f_X(x) and y = g(x) a differentiable one-to-one function with inverse x = h(y). Define a new random variable Y = g(X). Then Y is a continuous random variable with PDF f_Y(y) given by

f_Y(y) = f_X(h(y)) |h'(y)|.

The higher dimensional generalization of the proposition was already mentioned in the second set of lecture notes.

1.2 Expectation

The most basic parameter associated to a random variable is its expected value or mean. Fix a probability space (S, F, P) and let X : S → R be a random variable.

Definition 1.1 (Expectation) The expectation or mean value of the random variable X is defined as

E[X] = Σ_{i=1}^∞ x_i P(X = x_i)   if X is discrete,
E[X] = ∫_{-∞}^∞ x f_X(x) dx       if X is continuous.

Example 1.1 (A game of dice) A game consists in tossing a die and receiving a payoff X equal to $n for n pips. It is natural to define the fair price to play one round of the game as being the expected value of X. If you could play the game for less than E[X], you would make a sure profit by playing it long enough, and if you pay more you are sure to lose money in the long run. The fair price is then

E[X] = Σ_{i=1}^6 i/6 = 21/6 = $3.50.

Example 1.2 (Waiting in line) Let us suppose that the waiting time to be served at the post office at a particular location and time of day is known to follow an exponential distribution with parameter λ = 6 (in units 1/hour). What is the expected time of wait? We have now a continuous random variable T with probability density function f_T(t) = λ e^{-λt} for t ≥ 0. The expected value is easily calculated to be

E[T] = ∫_0^∞ t λ e^{-λt} dt = 1/λ.

Therefore, the mean time of wait is one-sixth of an hour, or 10 minutes.

It is a bit inconvenient to have to distinguish the continuous and discrete cases every time we refer to the expected value of a random variable. For this reason, we need a uniform notation that represents all cases. We will use the notation for the Lebesgue integral, introduced in the appendix of the previous set of notes. (You do not need to know about the Lebesgue integral. We are only using the notation.) So we will often denote the expected value of a random variable X, of any type, by

E[X] = ∫_S X(s) dP(s).

Sometimes dP(s) is written P(ds). The notation should be understood as follows. Suppose that we decompose the range of values of X into intervals (x_i, x_i + h], where h is a small step-size. Then

E[X] ≈ Σ_i x_i P(X ∈ (x_i, x_i + h]).

For discrete random variables, the same integral represents the sum in Definition 1.1.

Here are a few simple properties of expectations.

Proposition 1.3 Let X and Y be random variables on the probability space (S, F, P). Then:

1. If X ≥ 0 then E[X] ≥ 0.

2. For any real number a, E[aX] = a E[X].

3. E[X + Y] = E[X] + E[Y].

4. If X is constant equal to a, then E[a] = a.

5. E[XY]² ≤ E[X²] E[Y²], with equality if and only if X and Y are linearly dependent, i.e., there are constants a, b, not both zero, such that P(aX + bY = 0) = 1. (This is the Cauchy-Schwarz inequality.)

6. If X and Y are independent and both E[|X|] and E[|Y|] are finite, then E[XY] = E[X] E[Y].

It is not difficult to obtain from the definition that if Y = g(X) for some function g(x) and X is a continuous random variable with probability density function f_X(x), then

E[g(X)] = ∫_S g(X(s)) dP(s) = ∫_{-∞}^∞ g(x) f_X(x) dx.

1.3 Variance

The variance of a random variable X refines our knowledge of the probability distribution of X by giving a broad measure of how X is dispersed around its mean.

Definition 1.2 (Variance) Let (S, F, P) be a probability space and consider a random variable X : S → R with expectation m = E[X]. (We assume that m exists and is finite.) We define the variance of X as the mean square of the difference X − m; that is,

Var(X) = E[(X − m)²] = ∫_S (X(s) − m)² dP(s).

The standard deviation of X is defined as σ(X) = √Var(X).

Example 1.3 Let D be the determinant of the matrix

D = det( X  Y
         Z  T ),

where X, Y, Z, T are independent random variables uniformly distributed in [0, 1]. We wish to find E[D] and Var(D). The probability space of this problem is S = [0, 1]^4, F the σ-algebra generated by 4-dimensional parallelepipeds, and P the 4-dimensional volume obtained by integrating dV = dx dy dz dt. Thus

E[D] = ∫_S (xt − zy) dx dy dz dt = 0.

The variance is

Var(D) = ∫_S (xt − zy)² dx dy dz dt = 7/72.
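The moments in Example 1.3 can also be checked numerically. The following Matlab sketch (an addition to the notes; the sample size is an arbitrary choice) estimates E[D] and Var(D) by simulating many matrices with U(0,1) entries; the sample variance should be close to 7/72 ≈ 0.0972.

% Monte Carlo check of Example 1.3 (sketch).
n = 100000;                    % number of simulated matrices (arbitrary)
X = rand(n,1); Y = rand(n,1);  % independent U(0,1) entries
Z = rand(n,1); T = rand(n,1);
D = X.*T - Z.*Y;               % determinants of the 2x2 matrices
meanD = mean(D)                % should be close to 0
varD  = var(D)                 % should be close to 7/72 = 0.0972...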

Some of the general properties of variance are enumerated in the next proposition. They can be derived from the definitions by simple calculations. The details are left as exercises.

Proposition 1.4 Let X, Y be random variables on a probability space (S, F, P).

1. Var(X) = E[X²] − E[X]².

2. Var(aX) = a² Var(X), where a is any real constant.

3. If X and Y have finite variance and are independent, then Var(X + Y) = Var(X) + Var(Y).

The variance of a sum of any number of independent random variables now follows from the above results. The next proposition implies that the standard deviation of the arithmetic mean of independent random variables X_1, ..., X_n decreases like 1/√n.

Proposition 1.5 Let X_1, X_2, ..., X_n be independent random variables having the same standard deviation σ. Denote their sum by S_n = X_1 + ... + X_n. Then

Var(S_n/n) = σ²/n.

Mean and variance are examples of moments and central moments of probability distributions. These are defined as follows.

Definition 1.3 (Moments) The moment of order k = 1, 2, 3, ... of a random variable X is defined as

E[X^k] = ∫_S X(s)^k dP(s).

The central moment of order k of the random variable with mean µ is

E[(X − µ)^k] = ∫_S (X(s) − µ)^k dP(s).

The meaning of the central moments, and the variance in particular, is easier to interpret using Chebyshev's inequality. Broadly speaking, this inequality says that if the central moments are small, then the random variable cannot deviate much from its mean.

Theorem 1.1 (Chebyshev inequality) Let (S, F, P) be a probability space, X : S → R a random variable, and ε > 0 a fixed number.

1. Let X ≥ 0 (with probability 1) and let k ∈ N. Then P(X ≥ ε) ≤ E[X^k]/ε^k.

2. Let X be a random variable with finite expectation m. Then P(|X − m| ≥ ε) ≤ Var(X)/ε².

3. Let X be a random variable with finite expectation m and finite standard deviation σ > 0. For any positive c > 0 we have P(|X − m| ≥ cσ) ≤ 1/c².

Proof. The third inequality follows from the second by taking ε = cσ, and the second follows from the first by substituting |X − m| for X. Thus we only need to prove the first. This is done by noting that

E[X^k] = ∫_S X(s)^k dP(s) ≥ ∫_{{s ∈ S : X(s) ≥ ε}} X(s)^k dP(s) ≥ ∫_{{s ∈ S : X(s) ≥ ε}} ε^k dP(s) = ε^k P(X ≥ ε).

So we get P(X ≥ ε) ≤ E[X^k]/ε^k as claimed.

Example 1.4 Chebyshev's inequality, in the form of inequality 3 in the theorem, implies that if X is a random variable with finite mean m and finite variance σ², then the probability that X lies in the interval (m − 3σ, m + 3σ) is at least 1 − 1/3² = 8/9.

Example 1.5 (Tosses of a fair coin) We make N = 1000 tosses of a fair coin and denote by S_N the number of heads. Notice that S_N = X_1 + X_2 + ... + X_N, where X_i is 1 if the i-th toss obtains head, and 0 if tail. We assume that the X_i are independent and P(X_i = 0) = P(X_i = 1) = 1/2. Then E[S_N] = N/2 = 500 and Var(S_N) = N/4 = 250. From the second inequality in Theorem 1.1, P(450 ≤ S_N ≤ 550) is at least 1 − 250/50² = 0.9. A better estimate of the dispersion around the mean will be provided by the central limit theorem, discussed later.
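Example 1.5 can be explored by simulation. The sketch below is an addition to the notes; it assumes the values reconstructed above (N = 1000 tosses, interval 450 to 550), and the number of repetitions is an arbitrary choice. The empirical probability should come out much closer to 1 than the Chebyshev lower bound of 0.9.

% Empirical check of Example 1.5 (sketch).
N = 1000;                          % tosses per experiment
reps = 10000;                      % number of repeated experiments (arbitrary)
S = sum(rand(N,reps) < 0.5);       % head counts, one entry per experiment
p_hat = mean(S >= 450 & S <= 550)  % empirical P(450 <= S_N <= 550)
% Chebyshev guarantees roughly 0.9; the actual value is much closer to 1.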

2 The Laws of Large Numbers

The frequentist interpretation of probability rests on the intuitive idea that if we perform a large number of independent trials of an experiment, yielding numerical outcomes x_1, x_2, ..., then the averages (x_1 + ... + x_n)/n converge to some value x̄ as n grows. For example, if we toss a fair coin n times and let x_i be 0 or 1 when the coin gives, respectively, head or tail, then the running averages should converge to the relative frequency of tails, which is 0.5. We discuss now two theorems that make this idea precise.

2.1 The weak law of large numbers

Given a sequence X_1, X_2, ... of random variables, we write S_n = X_1 + ... + X_n.

Theorem 2.1 (Weak law of large numbers) Let (S, F, P) be a probability space and let X_1, X_2, ... be a sequence of independent random variables with means m_i and variances σ_i². We assume that the means are finite and there is a constant L such that σ_i² ≤ L for all i. Then, for every ε > 0,

lim_{n→∞} P( |S_n/n − E[S_n]/n| ≥ ε ) = 0.

If, in particular, E[X_i] = m for all i, then

lim_{n→∞} P( |S_n/n − m| ≥ ε ) = 0.

Proof. The independence of the random variables implies

Var(S_n) = Var(X_1) + ... + Var(X_n) ≤ nL.

Chebyshev's (second) inequality applied to S_n then gives

P( |S_n/n − E[S_n]/n| ≥ ε ) = P( |S_n − E[S_n]| ≥ nε ) ≤ Var(S_n)/(nε)² ≤ L/(nε²).

This proves the theorem.

For example, let X_1, X_2, ... be a sequence of independent, identically distributed random variables with two outcomes: 1 with probability p and 0 with probability 1 − p. Then the weak law of large numbers says that the arithmetic mean S_n/n converges to p = E[X_i] in the sense that, for any ε > 0, the probability that S_n/n lies outside the interval [p − ε, p + ε] goes to zero as n goes to ∞.
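The convergence asserted by the weak law is easy to visualize. The following sketch (an addition to the notes; p and n are arbitrary choices) plots the running averages S_n/n for a single sequence of Bernoulli(p) trials; the curve settles near p.

% Running averages for Bernoulli(p) trials (illustrative sketch).
p = 0.3;                      % success probability (arbitrary)
n = 5000;                     % number of trials
X = (rand(1,n) < p);          % X_i = 1 with probability p, 0 otherwise
Sn_over_n = cumsum(X)./(1:n); % running averages S_n/n
plot(1:n, Sn_over_n), grid
xlabel('n'), ylabel('S_n/n')  % the curve approaches p = 0.3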

2.2 The strong law of large numbers

The weak law of large numbers, applied to a sequence X_i ∈ {0, 1} of coin tosses, says that S_n/n must lie in an arbitrarily small interval around 1/2 with high probability (arbitrarily close to 1) if n is taken big enough. A stronger statement would be to say that, with probability one, a sequence of coin tosses yields a sum S_n such that S_n/n actually converges to 1/2. To explain the meaning of the stronger claim, let us be more explicit and view the random variables as functions X_i : S → R on the same probability space (S, F, P). Then, for each s ∈ S we can consider the sample sequence X_1(s), X_2(s), ..., as well as the arithmetic averages S_n(s)/n, and ask whether S_n(s)/n (an ordinary sequence of numbers) actually converges to 1/2. The strong law of large numbers states that the set of s for which this holds is an event of probability 1. This is a much more subtle result than the weak law, and we will be content with simply stating the general theorem.

Theorem 2.2 (Strong law of large numbers) Let (S, F, P) be a probability space and let X_1, X_2, ... be random variables defined on S with finite means and variances satisfying

Σ_{i=1}^∞ Var(X_i)/i² < ∞.

Then, there is an event E ∈ F of probability 1 such that for all s ∈ E,

S_n/n − E[S_n]/n → 0 as n → ∞.

In particular, if in addition all the means are equal to m, then for all s in a subset of S of probability 1,

lim_{n→∞} S_n(s)/n = m.

3 The Central Limit Theorem

The reason why the normal distribution arises so often is the central limit theorem. We state this theorem here without proof, although experimental evidence for its validity will be given in a number of examples. Let (S, F, P) be a probability space and X_1, X_2, ... be independent random variables defined on S. Assume that the X_i have a common distribution with finite expectation m and finite nonzero variance σ². Define the sum S_n = X_1 + X_2 + ... + X_n.

Theorem 3.1 (The central limit theorem) If X_1, X_2, ... are independent random variables with mean µ and variance σ², then the random variable

Z_n = (S_n − nµ)/(σ√n)

converges in distribution to a standard normal random variable. In other words, as n → ∞,

P(Z_n ≤ z) → Φ(z) = (1/√(2π)) ∫_{-∞}^z e^{-u²/2} du.

In a typical application of the central limit theorem, one considers a random variable X with mean µ and variance σ², and a sequence X_1, X_2, ... of independent realizations of X. Let X̄_n be the sample mean, X̄_n = (X_1 + ... + X_n)/n. Then the limiting distribution of the random variable Z_n = (X̄_n − µ)/(σ/√n) is the standard normal distribution.

We saw in Lecture Notes 2 that if random variables X and Y are independent and have probability density functions f_X and f_Y, then the sum X + Y has probability density function equal to the convolution f_X * f_Y. Therefore, one way to observe the convergence to the normal distribution claimed in the central limit theorem is to consider the convolution powers f^{*n} = f * ... * f of the common probability density function f with itself n times and see that it approaches a normal distribution. (Subtracting µ and dividing by σ/√n changes the probability density function in the simple way described in Proposition 1.1.)

Figure 1: Convolution powers of the function f(x) = 1 over [−1, 1]. By the central limit theorem, after centering and re-scaling (not done in the figure), f^{*n} approaches a normal distribution. (The figure's panels show the successive convolution powers; the axis labels of the original figure are omitted here.)
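The curves in Figure 1 can be reproduced numerically. The sketch below (an addition to the notes) discretizes the function f(x) = 1 on [−1, 1] used in the figure and approximates its convolution powers with Matlab's conv. As in the figure, f is not normalized to be a density, so only the shape of the curves matters; after a few convolutions the result is already bell-shaped.

% Approximate convolution powers of f(x) = 1 on [-1,1] (sketch).
dx = 0.01;
x  = -1:dx:1;
f  = ones(size(x));
g  = f;
for k = 2:4                      % compute f*f, f*f*f and f*f*f*f
    g = conv(g, f)*dx;           % discrete approximation of the convolution
end
y = linspace(-4, 4, length(g));  % the 4-fold convolution is supported on [-4,4]
plot(y, g), grid                 % already close to a bell-shaped curve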

Example 3.1 (Die tossing) Consider the experiment of tossing a fair die n times. Let X_i be the number obtained in the i-th toss and S_n = X_1 + ... + X_n. The X_i are independent and have a common discrete distribution with mean µ = 3.5 and σ² = 35/12. Assuming n = 1000, by the central limit theorem S_n has approximately the normal distribution with mean µ(S_n) = 3500 and standard deviation σ(S_n) = √(1000 · 35/12), which is approximately 54. Therefore, if we simulate the experiment of tossing a die 1000 times, repeat the experiment a number of times (say 500) and plot a histogram of the result, what we obtain should be approximated by the function

f(x) = (1/(σ√(2π))) e^{-(1/2)((x − µ)/σ)²},

where µ = 3500 and σ ≈ 54.

Figure 2: Comparison between the sample distribution given by the stem plot and the normal distribution for the experiment of tossing a die 1000 times and counting the total number of pips.

Example 3.2 (Die tossing II) We would like to compute the probability that after tossing a die 1000 times, one obtains more than 150 sixes. Here, we consider the random variables X_i, i = 1, ..., 1000, taking values in {0, 1}, with P(X_i = 1) = 1/6. (X_i = 1 represents the event of getting a 6 in the i-th toss.) Writing S_n = X_1 + ... + X_n, we wish to compute the probability P(S_1000 > 150). Each X_i has mean p = 1/6 and standard deviation √(p(1 − p)). By the central limit theorem, we approximate the probability distribution of S_n by a normal distribution with mean µ = 1000p and variance σ² = 1000 p(1 − p). This is approximately µ ≈ 166.7 and σ ≈ 11.8. Now, the distribution of (S_n − µ)/σ is

approximately the standard normal, so we can write

P(S_n > 150) = P((S_n − µ)/σ > (150 − µ)/σ) ≈ P((S_n − µ)/σ > −1.41) ≈ (1/√(2π)) ∫_{−1.41}^∞ e^{−z²/2} dz ≈ 0.92.

The integral above was evaluated numerically by a simple Riemann sum over the interval [−1.41, 10] and step-size 0.1. We conclude that the probability of obtaining at least 150 sixes in 1000 tosses is approximately 0.92.

4 Some Simulation Techniques

Up until now we have mostly done simulations of random variables with a finite number of possible values. In this section we explore a few ideas for simulating continuous random variables. Suppose we have a continuous random variable X with probability density function f(x) and we wish to evaluate the expectation E[g(X)], for some function g(x). This requires evaluating the integral

E[g(X)] = ∫_{−∞}^∞ g(x) f(x) dx.

If the integral is not tractable by analytical or standard numerical methods, one can approach it by simulating realizations x_1, x_2, ..., x_n of X and, provided that the variance of g(X) is finite, one can apply the law of large numbers to obtain the approximation

E[g(X)] ≈ (1/n) Σ_{i=1}^n g(x_i).

This is the basic idea behind Monte-Carlo integration.

It may happen that we cannot simulate realizations of X, but we can simulate realizations y_1, y_2, ..., y_n of a random variable Y with probability density function h(x), which is related to X in that h(x) is not 0 unless f(x) is 0. In this case we can write

E[g(X)] = ∫ g(x) f(x) dx = ∫ (g(x) f(x)/h(x)) h(x) dx = E[g(Y) f(Y)/h(Y)] ≈ (1/n) Σ_{i=1}^n g(y_i) f(y_i)/h(y_i).

The above procedure is known as importance sampling.
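As a concrete illustration of importance sampling (an added sketch; the target, the sampling density and all parameters below are made up for illustration and are not taken from the notes), suppose we want E[g(X)] = E[X²] = 1 for a standard normal X, but we choose to sample from a normal distribution with standard deviation 2 instead:

% Importance sampling sketch: estimate E[X^2] = 1 for X ~ N(0,1),
% sampling from Y ~ N(0, 2^2) instead.
n  = 100000;
y  = 2*randn(1,n);                       % realizations of Y, with density h
f  = @(x) exp(-x.^2/2)/sqrt(2*pi);       % target density of X
h  = @(x) exp(-x.^2/8)/(2*sqrt(2*pi));   % sampling density of Y
g  = @(x) x.^2;
estimate = mean(g(y).*f(y)./h(y))        % should be close to 1

The weights f(y_i)/h(y_i) correct for the fact that the samples were drawn from h rather than f; the method works because h is nonzero wherever f is nonzero.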

We need now ways to simulate realizations of random variables. This is typically not an easy task, but a few general techniques are available. We describe some below.

4.1 Uniform random number generation

We have already looked at simulating uniform random numbers in Lecture Notes 1. We review the main idea here. This is usually done by number theoretic methods, the simplest of which are the linear congruential algorithms. Such an algorithm begins by setting an integer seed, u_0, and then generating a sequence of new integer values by some deterministic rule of the type

u_{n+1} = K u_n + b (mod M),

where M and K are integers. Since only finitely many different numbers occur, the modulus M should be chosen as large as possible. To prevent cycling with a period less than M, the multiplier K should be taken relatively prime to M. Typically b is set to 0, in which case the pseudo-random number generator is called a multiplicative congruential generator. A good choice of parameters is K = 7^5 and M = 2^31 − 1. Then x = u/M gives a pseudo-random number in [0, 1] that simulates a uniformly distributed random variable. We may return to this topic later and discuss statistical tests for evaluating the quality of such pseudo-random numbers. For now, we will continue to take for granted that this is a good way to simulate uniform random numbers over [0, 1].

If we want to simulate a random point in the square [0, 1] × [0, 1], we can naturally do it by picking a pair of independent random variables X_1, X_2 with the uniform distribution on [0, 1]. (Similarly for cubes [0, 1]^n in any dimension n.)

Example 4.1 (Area of a disc by Monte Carlo) As a simple illustration of the Monte Carlo integration method, suppose we wish to find the area of a disc of radius 1. The disc is inscribed in the square S = [−1, 1] × [−1, 1]. If we pick a point from S at random with the uniform distribution, then the probability that it will be from the disc is p = π/4, which is the ratio of the area of the disc to the area of the square S. The following program simulates this experiment (5000 random points) and estimates a = 4p = π. One run gave a value of a close to π.

rand('seed',121)
n=5000;
X=2*rand(n,2)-1;
a=4*sum(X(:,1).^2 + X(:,2).^2<=1)/n

Figure 3: Simulation of 1000 random points on the square [−1, 1]² with the uniform distribution. To approximate the ratio of the area of the disc over the area of the square we compute the fraction of points that fall on the disc.

The above example should prompt the question: How do we estimate the error involved in, say, our calculation of π, and how do we determine the number of random points needed for a given precision? First, consider the probability space S = [−1, 1]² with probability measure given by

P(E) = (1/4) ∫∫_E dx dy,

and the random variable D : S → {0, 1} which is 1 for a point in the disc and 0 for a point in the complement of the disc. The expected value of D is µ = π/4 and the variance is easily calculated to be σ² = µ(1 − µ) = π(4 − π)/16. If we draw n independent points on the square, and call the outcomes D_1, D_2, ..., D_n, then the fraction of points in the disc is given by the random variable

D̄_n = (D_1 + ... + D_n)/n.

As we have already seen (Proposition 1.5), D̄_n must have mean value µ and standard deviation σ/√n. Fix a positive number K. One way to estimate the error in our calculation is to ask for the probability that |D̄_n − µ| is bigger than the error Kσ/√n. Equivalently, we ask for the probability P(|Z_n| ≥ K), where

Z_n = (D̄_n − µ)/(σ/√n).

This probability can now be estimated using the central limit theorem. Recall that the probability distribution density of Z_n, for big n, is very nearly a

standard normal distribution. Thus

P(|D̄_n − µ| ≥ Kσ/√n) ≈ (2/√(2π)) ∫_K^∞ e^{−z²/2} dz.

If we take K = 3, the integral on the right-hand side is approximately 0.0027. Using n = 5000, we obtain

P(|D̄_n − µ| ≥ 0.017) ≈ 0.0027.

In other words, the probability that our simulated D̄_n, for n = 5000, does not lie in an interval around the true value µ of radius 0.017 is approximately 0.0027. This assumes, of course, that our random number generator is ideal. Deviations from this error bound can serve as a quality test for the generator.

4.2 Transformation methods

We indicate by U ~ U(0, 1) a random variable that has the uniform distribution over [0, 1]. Our problem now is to simulate realizations of another random variable X with probability density function f(x), using U ~ U(0, 1). Let

F(x) = ∫_{−∞}^x f(s) ds

denote the cumulative distribution function of X. Notice that F : R → (0, 1). Assume that F is invertible, and denote by G : (0, 1) → R the inverse function G = F^{−1}. Now set X = F^{−1}(U).

Proposition 4.1 (Inverse distribution method) Let U ~ U(0, 1) and G : (0, 1) → R the inverse of the cumulative distribution function F(x) of a random variable with PDF f(x). Set X = G(U). Then X has cumulative distribution function F(x).

Proof.

P(X ≤ x) = P(F^{−1}(U) ≤ x) = P(U ≤ F(x)) = F_U(F(x)) = F(x).

Example 4.2 The random variable X = a + (b − a)U clearly has the uniform distribution on [a, b] if U ~ U(0, 1). This follows from the proposition since

F(x) = (x − a)/(b − a),

which has inverse function F^{−1}(u) = a + (b − a)u.
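As another small illustration of the inverse distribution method (an added example, not from the notes), take f(x) = 2x on [0, 1]. Then F(x) = x², so F^{−1}(u) = √u and X = √U has the desired density. A minimal sketch:

% Inverse distribution method for f(x) = 2x on [0,1] (added example).
n = 50000;
U = rand(1,n);
X = sqrt(U);                           % X = F^{-1}(U) with F(x) = x^2
[counts, centers] = hist(X, 40);
dx = centers(2) - centers(1);
bar(centers, counts/(n*dx)), hold on   % normalized histogram
plot(centers, 2*centers, 'r')          % exact density f(x) = 2x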

15 Example 4.3 (Exponential random variables) If U U(, 1) and λ >, then X = 1 λ log(u) has an exponential distribution with parameter λ. In fact, an exponential random variable has PDF f(x) = λe λx and its cumulative distribution function is easily obtained by explicit integration: F (x) = 1 e λx. Therefore, F 1 (u) = 1 log(1 u). λ But 1 U U(, 1) if U U(, 1), so we have the claim Figure 4: Stem plot of relative frequencies of 5 independent realizations of an exponential random variable with parameter 1. Superposed to it is the graph of frequencies given by the exact density e λ. function y=exponential(lambda,n) %Simulates n independent realizations of a %random variable with the exponential %distribution with parameter lambda. y=-log(rand(1,n))/lambda; 15

Here are the commands used to produce Figure 4.

y=exponential(1,5000);
[n,xout]=hist(y,40);
stem(xout,n/5000)
grid
hold on
dx=xout(2)-xout(1);
fdx=exp(-xout)*dx;
plot(xout,fdx)

4.3 Lookup methods

This is the discrete version of the inverse transformation method. Suppose one is interested in simulating a discrete random variable X with sample space S = {0, 1, 2, ...} or a subset of it. Write p_k = P(X = k). Some, possibly infinitely many, of the p_k may be zero. Now define

q_k = P(X ≤ k) = Σ_{i=0}^k p_i.

Let U ~ U(0, 1) and write

X = min{k : q_k ≥ U}.

Then X has the desired probability distribution. In fact, we must necessarily have q_{k−1} < U ≤ q_k for some k, and so

P(X = k) = P(U ∈ (q_{k−1}, q_k]) = q_k − q_{k−1} = p_k.

(A Matlab sketch of this procedure is given after the next subsection.)

4.4 Scaling

We have already seen that if Y = aX + b for a non-zero a, then

f_Y(y) = (1/|a|) f_X((y − b)/a).

Thus, for example, if X is exponentially distributed with parameter 1, then Y = X/λ is exponentially distributed with parameter λ. Similarly, if we can simulate a random variable Z with the standard normal distribution, then X = σZ + µ will be a normal random variable with mean µ and standard deviation σ.
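Returning to the lookup method of Section 4.3, the following sketch (an addition to the notes) simulates draws from a finite probability mass function given as a vector p. It uses the cumulative sums q_k and picks the smallest index with q_k ≥ U; note that, unlike the notes' description, the values here are 1, ..., r rather than starting at 0.

function y = lookupmethod(p, m)
%Sketch of the lookup method of Section 4.3 (added example).
%Simulates m draws from the pmf p = [p1 ... pr] on the values 1,...,r.
q = cumsum(p);                 % q(k) = p(1) + ... + p(k)
y = zeros(1, m);
for i = 1:m
    U = rand;
    y(i) = find(q >= U, 1);    % smallest k with q(k) >= U
end

For example, lookupmethod([1 2 3]/6, 1000) simulates 1000 tosses of a three-sided die loaded in the proportions 1:2:3.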

4.5 The uniform rejection method

The methods of this and the next subsection are examples of the rejection sampler method. Suppose we want to simulate a random variable with PDF f(x) such that f(x) is zero outside of the interval [a, b] and f(x) ≤ L for all x. Choose X ~ U(a, b) and Y ~ U(0, L) independently. If Y < f(X), accept X as the simulated value we want. If the acceptance condition is not satisfied, try again enough times until it holds. Then take that X for which Y < f(X) as the output of the algorithm and call it X*. This procedure is referred to as the uniform rejection method for density f(x).

Proposition 4.2 (Uniform rejection method) The random variable X* produced by the uniform rejection method for density f(x) has probability density function f(x).

Proof. Let A represent the region in [a, b] × [0, L] consisting of points (x, y) such that y < f(x). We call A the acceptance region. As above, we denote by (X, Y) a random variable uniformly distributed on [a, b] × [0, L]. Let F(x) = P(X* ≤ x) denote the cumulative distribution function of X*. We wish to show that F(x) = ∫_a^x f(s) ds. This is a consequence of the following calculation, which uses the continuous version of the total probability formula and the key fact: P((X, Y) ∈ A | X = s) = f(s)/L.

F(x) = P(X* ≤ x)
     = P(X ≤ x | (X, Y) ∈ A)
     = P({X ≤ x} ∩ {(X, Y) ∈ A}) / P((X, Y) ∈ A)
     = [ (1/(b − a)) ∫_a^x P((X, Y) ∈ A | X = s) ds ] / [ (1/(b − a)) ∫_a^b P((X, Y) ∈ A | X = s) ds ]
     = [ ∫_a^x (f(s)/L) ds ] / [ ∫_a^b (f(s)/L) ds ]
     = ∫_a^x f(s) ds.

It is clear that the efficiency of the rejection method will depend on the probability that a random point (X, Y) will be accepted, i.e., will fall in the

18 acceptance region A. This probability can be estimated as follows: P (accept) = P ((X, Y ) A) = 1 b a b a 1 = (b a)l 1 = (b a)l. P ((X, Y ) A X = s)ds b a f(s)ds If this number is too small, the procedure will be inefficient. Example 4.4 (Uniform rejection method) Consider the probability density function f(x) = 1 2 sin(x) over the interval [, π]. We wish to simulate a random variable X with PDF f(x) using, first, the uniform rejection method. The following program does that. function x=samplefromsine(n) %Simulates n independent realizations of a random %variable with PDF (1/2)sin(x) over the interval %[,pi]. x=[]; for i=1:n U=; Y=1/2; while Y>=(1/2)*sin(U) U=pi*rand; Y=(1/2)*rand; end x=[x U]; end The probability of acceptance is 2/π, or approximately.64. The following Matlab commands can be used to obtain figure 5: y=samplefromsine(5); [n,xout]=hist(y,4); stem(xout,n/5) grid hold on 18

19 dx=xout(2)-xout(1); fdx=(1/2)*sin(xout)*dx; plot(xout,fdx) Figure 5: Stem plot of the relative frequencies of 5 independent realizations a simulated random variable with PDF.5 sin(x) over [, π]. The exact frequencies are superposed for comparison. 4.6 The envelope method One limitation of the uniform rejection method is the requirement that the PDF f(x) be on the complement of a finite interval [a, b]. A more general procedure, called the envelope method for f(x) can sometimes be used when the uniform rejection method does not apply. Suppose that we wish to simulate a random variable with PDF f(x) and that we already know how to simulate a second random variable Y with PDF g(x) having the property that f(x) ag(x) for some positive a and all x. Note that a 1 since the total integral of both f(x) and g(x) is 1. Now consider the following algorithm. Draw a realization of Y with the distribution density g(y) and then draw a realization of U with the uniform distribution U(, ag(y)). Repeat the procedure until a pair (Y, U) such that U < f(y ) is obtained. Then set X equal to the obtained value of Y. In other words, simulate a value from the distribution g(y) and accept this value with probability f(y)/(ag(y)), otherwise reject and try again. The method will work more efficiently if the acceptance rate is high. The overall acceptance probability is P (U < f(y )). It is not difficult to calculate this probability as we did in the case of the uniform rejection method. (Simply 19

20 apply the integral form of the total probability formula.) The result is P (accept) = 1 a. Proposition 4.3 (The envelope method) The envelope method for f(x) described above simulates a random variable X with probability distribution f(x). Proof. The argument is essentially the same as for the uniform rejection method. Note now that P (U f(y ) Y = s) = f(s)/(ag(s)). With this in mind, we have: F (x) = P (X x) = P (Y x U f(y )) P ({Y x} {U f(y )}) = P (U f(y )) P ({Y x} {U f(y )} Y = s)g(s)ds = P (U f(y ) Y = s)g(s)ds x P (U f(y ) Y = s)g(s)ds = P (U f(y ) Y = s)g(s)ds = = x f(s) a x ds f(s) a ds f(s)ds. Example 4.5 (Envelope method) This is the same as the previous example, but we now approach the problem via the envelope method. We wish to simulate a random variable X with PDF (1/2) sin(x) over [, π]. We first simulate a random variable Y with probability density g(y), where g(y) = { 4 π y 2 4 π 2 (π y) if y [, π/2] if y [π/2, π]. Notice that f(x) ag(x) for a = π 2 /8. Therefore, the envelope method will have probability of acceptance 1/a =.81. To simulate the random variable Y, note that g(x) = (h h)(x), where h(x) = 2/π over [, π/2]. Therefore, we can take Y = V 1 +V 2, where V i are identically distributed uniform random variables over [, π/2]. function x=samplefromsine2(n) 2

%Simulates n independent realizations of a random
%variable with PDF (1/2)sin(x) over the interval
%[0,pi], using the envelope method.
x=[];
for i=1:n
    U=1/2; Y=0;
    while U>=(1/2)*sin(Y)
        Y=(pi/2)*sum(rand(1,2));
        U=(pi^2/8)*((2/pi)-(4/pi^2)*abs(Y-pi/2))*rand;
    end
    x=[x Y];
end

5 Standard Probability Distributions

We study here a number of the more commonly occurring probability distributions. They are associated with basic types of random experiments that often serve as building blocks for more complicated probability models. Among the most important for our later study are the normal, the exponential, and the Poisson distributions.

5.1 The discrete uniform distribution

A random variable X is discrete uniform on the numbers 1, 2, ..., n, written X ~ DU(n), if it takes values in the set S = {1, 2, ..., n} and each of the possible values is equally likely to occur, that is,

P(X = k) = 1/n,  k ∈ S.

The cumulative distribution function of X is, therefore,

P(X ≤ k) = k/n,  k ∈ S.

The expectation of a discrete uniform random variable X is easily calculated:

E[X] = Σ_{k=1}^n k/n = n(n + 1)/(2n) = (n + 1)/2.

The variance is similarly calculated. Its value is

Var(X) = (n² − 1)/12.

The following is a simple way to simulate a DU(n) random variable:

function y=discreteuniform(n,m)
%Simulates m independent samples of a DU(n) random variable.
y=[];
for i=1:m
    y=[y ceil(n*rand)];
end

5.2 The binomial distribution

Given a positive integer n and an integer k between 0 and n, recall that the binomial coefficient is defined by

C(n, k) = n!/(k!(n − k)!).

It gives the number of ways to pick k elements in a set of n elements. We often read C(n, k) as "n choose k". The binomial distribution is the distribution of the number of successful outcomes in a series of n independent trials, each with a probability p of success and 1 − p of failure. If the total number of successes is denoted X, we write X ~ B(n, p) to indicate that X is a binomial random variable for n independent trials and success probability p. Thus, if Z_1, ..., Z_n are independent random variables taking values in {0, 1}, with P(Z_i = 1) = p and P(Z_i = 0) = 1 − p, then X = Z_1 + ... + Z_n is a B(n, p) random variable.

The sample space for a binomial random variable X is S = {0, 1, 2, ..., n}. The probability of k successes followed by n − k failures is p^k (1 − p)^{n−k}. Indeed, this is the probability of any sequence of n outcomes with k success trials, independent of the order in which they occur. There are C(n, k) such sequences, so the probability of k successes is

P(X = k) = C(n, k) p^k (1 − p)^{n−k}.

The expectation and variance of the binomial distribution are easily obtained:

E[X] = np,  Var(X) = np(1 − p).

Example 5.1 (Urn problem) An urn contains N balls, of which K are black and N − K are red. We draw with replacement n balls and count the number X of black balls drawn. Let p = K/N. Then X ~ B(n, p).

function y=binomial(n,p,m)
%Simulates drawing m independent samples of a
%binomial random variable B(n,p).
y=sum(rand(n,m)<=p);

5.3 The multinomial distribution

Consider the following random experiment. An urn contains balls of colors c_1, c_2, ..., c_r, which can be drawn with probabilities p_1, p_2, ..., p_r. Suppose we draw n balls with replacement and register the number X_i, for i = 1, 2, ..., r, of times that a ball of color c_i was drawn. The vector X = (X_1, X_2, ..., X_r) is then said to have the multinomial distribution, and we write X ~ M(n, p_1, ..., p_r). Note that X_1 + ... + X_r = n. Binomial random variables correspond to the special case r = 2. More explicitly, the multinomial distribution assigns probabilities

P(X_1 = k_1, ..., X_r = k_r) = (n!/(k_1! ... k_r!)) p_1^{k_1} ... p_r^{k_r}.

Note that if X = (X_1, ..., X_r) ~ M(n, p_1, ..., p_r), then each X_i can be interpreted as the number of successes in n trials, each of which has probability p_i of success and 1 − p_i of failure. Therefore, X_i is a binomial random variable B(n, p_i).

Example 5.2 Suppose that 100 independent observations are taken from a uniform distribution on [0, 1]. We partition the interval into 10 equal subintervals (bins), and record the numbers X_1, ..., X_10 of observations that fall in each bin. The information is then represented as a histogram, or bar graph, in which the bar over the bin labeled by i equals X_i. Then, the histogram itself can be viewed as a random variable with the multinomial distribution, where n = 100 and p_i = 1/10 for i = 1, 2, ..., 10.

The following program simulates one sample draw of a random variable with the multinomial distribution.

function y=multinomial(n,p)
%Simulates drawing a sample vector y=[y1,... yr]
%with the multinomial distribution M(n,p),
%where p=[p1... pr] is a probability vector.
r=length(p);
x=rand(n,1);
a=0;

A=zeros(n,1);
for i=1:r
    A=A+i*(a<=x & x<a+p(i));
    a=a+p(i);
end
y=zeros(1,r);
for j=1:r
    y(j)=sum(A==j);
end

5.4 The geometric distribution

As in the binomial distribution, consider a sequence of trials with a success/fail outcome and probability p of success. Let X be the number of independent trials until the first success is encountered. In other words, X is the waiting time until the first success. Then X is said to have the geometric distribution, denoted X ~ Geom(p). The sample space is S = {1, 2, 3, ...}. In order to have X = k, there must be a sequence of k − 1 failures followed by one success. Therefore,

P(X = k) = (1 − p)^{k−1} p.

Another way to describe a geometric random variable X is as follows. Let X_1, X_2, ... be independent random variables with values in {0, 1} such that P(X_i = 1) = p and P(X_i = 0) = 1 − p. Then

X = min{n ≥ 1 : X_n = 1} ~ Geom(p).

In fact, as the X_i are independent, we have

P(X = n) = P({X_1 = 0} ∩ {X_2 = 0} ∩ ... ∩ {X_{n−1} = 0} ∩ {X_n = 1})
         = P(X_1 = 0) P(X_2 = 0) ... P(X_{n−1} = 0) P(X_n = 1)
         = (1 − p)^{n−1} p.

The expectation of a geometrically distributed random variable is calculated as follows:

E[X] = Σ_{i=1}^∞ i P(X = i) = Σ_{i=1}^∞ i (1 − p)^{i−1} p = p/(1 − (1 − p))² = 1/p.

Similarly,

E[X²] = Σ_{i=1}^∞ i² P(X = i) = Σ_{i=1}^∞ i² (1 − p)^{i−1} p = p (1 + (1 − p))/(1 − (1 − p))³ = (2 − p)/p²,

from which we obtain the variance:

Var(X) = E[X²] − E[X]² = (2 − p)/p² − 1/p² = (1 − p)/p².

We have used above the following formulas:

Σ_{i=1}^∞ a^{i−1} = 1/(1 − a),  Σ_{i=1}^∞ i a^{i−1} = 1/(1 − a)²,  Σ_{i=1}^∞ i² a^{i−1} = (1 + a)/(1 − a)³,

as well as the formulas

Σ_{i=1}^n i = n(n + 1)/2,  Σ_{i=1}^n i² = n(n + 1)(2n + 1)/6.

Example 5.3 (Waiting for a six) How long should we expect to have to wait to get a 6 in a sequence of die tosses? Let X denote the number of tosses until 6 appears for the first time. Then the probability that X = k is

P(X = k) = (5/6)^{k−1} (1/6).

In other words we have k − 1 failures, each with probability 5/6, until a success, with probability 1/6. The expected value of X is

Σ_{k=1}^∞ k P(X = k) = Σ_{k=1}^∞ k (5/6)^{k−1} (1/6) = 6.

So on average we need to wait 6 tosses to get a 6.

The following program simulates one sample draw of a random variable with the Geom(p) distribution.

function y=geometric(p)
%Simulates one draw of a geometric
%random variable with parameter p.
a=0;
y=0;
while a==0
    y=y+1;
    a=(rand<p);
end
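The function above can be used to check Example 5.3 numerically: averaging many simulated waiting times for p = 1/6 should give a value close to 6. A minimal sketch (an addition to the notes; the number of repetitions is arbitrary):

% Empirical check of Example 5.3 using the geometric(p) function above.
m = 10000; p = 1/6;
w = zeros(1, m);
for i = 1:m
    w(i) = geometric(p);      % waiting time until the first six
end
mean(w)                       % should be close to E[X] = 6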

5.5 The negative binomial distribution

A random variable X has the negative binomial distribution, also called the Pascal distribution, denoted X ~ NB(n, p), if there exists an integer n ≥ 1 and a real number p ∈ (0, 1) such that

P(X = n + k) = C(n + k − 1, k) p^n (1 − p)^k,  k = 0, 1, 2, ...

The negative binomial distribution has the following interpretation.

Proposition 5.1 Let X_1, ..., X_n be independent Geom(p) random variables. Then X = X_1 + ... + X_n has the negative binomial distribution with parameters n and p.

Therefore, to simulate a negative binomial random variable all we need is to simulate n independent geometric random variables, then add them up. It also follows from this proposition that

E[X] = n/p,  Var(X) = n(1 − p)/p².

function y=negbinomial(n,p)
%Simulates one draw of a negative binomial
%random variable with parameters n and p.
y=0;
for i=1:n
    a=0; u=0;
    while a==0
        u=u+1;
        a=(rand<p);
    end
    y=y+u;
end

5.6 The Poisson distribution

A Poisson random variable X with parameter λ, denoted X ~ Po(λ), is a random variable with sample space S = {0, 1, 2, ...} such that

P(X = k) = (λ^k/k!) e^{−λ},  k = 0, 1, 2, ...

This is a very ubiquitous distribution and we will encounter it many times in the course. One way to think about the Poisson distribution is as the limit of a

binomial distribution B(n, p) as n → ∞, p → 0, while λ = np = E[X] remains constant. In fact, replacing p by λ/n in the binomial distribution gives

P(X = k) = C(n, k) (λ/n)^k (1 − λ/n)^{n−k}
         = (n!/(k!(n − k)!)) (λ/n)^k (1 − λ/n)^{n−k}
         = (λ^k/k!) · [n(n − 1)(n − 2) ... (n − k + 1)/n^k] · (1 − λ/n)^n (1 − λ/n)^{−k}
         → (λ^k/k!) e^{−λ}  as n → ∞.

Notice that we have used the limit (1 − λ/n)^n → e^{−λ}.

The expectation and variance of a Poisson random variable are easily calculated from the definition or by the limit of the corresponding quantities for the binomial distribution. The result is:

E[X] = λ,  Var(X) = λ.

One noteworthy property of Poisson random variables is that, if X ~ Po(λ) and Y ~ Po(µ) are independent Poisson random variables, then Z = X + Y ~ Po(λ + µ).

A numerical example may help clarify the meaning of the Poisson distribution. Consider the interval [0, 1] partitioned into a large number, n, of subintervals of equal length: [0, 1/n), [1/n, 2/n), ..., [(n − 1)/n, 1]. To each subinterval we randomly assign a value 1 with a small probability λ/n (for a fixed λ) and 0 with probability 1 − λ/n. Let X be the number of 1s. Then, for large n, the random variable X is approximately Poisson with parameter λ. The following Matlab script illustrates this procedure. It produces samples of a Poisson random variable with parameter λ = 3 over the interval [0, 1] and a graph that shows the positions where an event occurs.

%Approximate Poisson random variable, X, with parameter lambda.
lambda=3;
n=500;
p=lambda/n;
a=(rand(1,n)<p);
x=1/n:1/n:1;
X=sum(a)
stem(x,a)

28 Figure 6: Poisson distributed events over the interval [, 1] for λ = 3. A sequence of times of occurrences of random events is said to be a Poisson process with rate λ if the number of observations, N t, in any interval of length t is N t Po(λt) and the number of events in disjoint intervals are independent of one another. This is a simple model for discrete events occurring continuously in time. The following is a function script for the arrival times over [, T ] of a Poisson process with parameter λ. function a=poisson(lambda,t) %Imput - lambda arrival rate % - T time interval, [,T] %Output - a arrival times in interval [,T] for i=1:1 z(i,1)=(1/lambda)*log(1/(1-rand(1,1))); %interarrival times if i==1 t(i,1)=z(i); else t(i,1)=t(i-1)+z(i,1); end if t(i)>t break end end M=length(t)-1; a=t(1:m); 28

5.7 The hypergeometric distribution

A random variable X is said to have a hypergeometric distribution if there exist positive integers r, n and m such that for any k = 0, 1, 2, ..., m we have

P(X = k) = C(r, k) C(n − r, m − k) / C(n, m).

The binomial, the multinomial and the Poisson distributions arise when one wants to count the number of successes in situations that generally correspond to drawing from a population with replacement. The hypergeometric distribution arises when the experiment involves drawing without replacement. With p = r/n and q = 1 − p, the expectation and variance of a random variable X having the hypergeometric distribution are given by

E[X] = mp,  Var(X) = mpq (n − m)/(n − 1).

Example 5.4 (Drawing without replacement) An urn contains n balls, r of which are black and n − r are red. We draw from the urn without replacement m balls and denote by X the number of black balls among them. Then X has the hypergeometric distribution with parameters r, n and m.

Example 5.5 (Capture/recapture) The capture/recapture method is sometimes used to estimate the size of a wildlife population. Suppose that 100 animals are captured, tagged, and released. On a later occasion, 20 animals are captured, and it is found that 4 of them are tagged. How large is the population? We assume that there are n animals in the population, of which 100 are tagged. If the 20 animals captured later are taken in such a way that all the n-choose-20 possible groups are equally likely, then the probability that 4 of them are tagged is

L(n) = C(100, 4) C(n − 100, 16) / C(n, 20).

The number n cannot be precisely determined from the given information, but it can be estimated using the maximum likelihood method. The idea is to estimate the value of n as the value that makes the observed outcome (X = 4 in this example) most probable. In other words, we estimate the population size to be the value n that maximizes L(n). It is left as an exercise to check that L(n)/L(n − 1) > 1 if and only if n < 500, so L(n) stops growing at n = 500. Therefore, the maximum is attained for n = 500. This number serves as our estimate of the population size in the sense that it is the value that maximizes the likelihood of the outcome X = 4.
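Hypergeometric samples are easy to simulate directly by drawing without replacement, as in Example 5.4. The following sketch is an addition to the notes (the function name is made up); it uses randperm to shuffle the urn and counts the black balls among the first m drawn.

function y = hypergeom_sample(n, r, m, N)
%Sketch (added example): simulates N draws of a hypergeometric random
%variable, the number of black balls when m balls are drawn without
%replacement from an urn with r black and n-r red balls.
urn = [ones(1,r) zeros(1,n-r)];   % 1 = black, 0 = red
y = zeros(1, N);
for i = 1:N
    idx = randperm(n);            % a random ordering of the n balls
    y(i) = sum(urn(idx(1:m)));    % black balls among the first m drawn
end

For instance, mean(hypergeom_sample(50, 20, 10, 10000)) should be close to m·r/n = 4.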

5.8 The uniform distribution

A random variable X has a uniform distribution over the range [a, b], written X ~ U(a, b), if the PDF is given by

f_X(x) = 1/(b − a) if a ≤ x ≤ b, and 0 otherwise.

If x ∈ [a, b], then

F_X(x) = ∫_{−∞}^x f_X(y) dy = ∫_a^x f_X(y) dy = (x − a)/(b − a).

Therefore

F_X(x) = 0 if x < a,  (x − a)/(b − a) if a ≤ x ≤ b,  1 if x > b.

The expectation and variance are easily calculated:

E[X] = (a + b)/2,  Var(X) = (b − a)²/12.

5.9 The exponential distribution

A random variable X has an exponential distribution with parameter λ > 0, written X ~ Exp(λ), if it has the PDF

f_X(x) = 0 if x < 0,  λ e^{−λx} if x ≥ 0.

The cumulative distribution function is, therefore, given by

F_X(x) = 0 if x < 0,  1 − e^{−λx} if x ≥ 0.

Expectation and variance are given by

E[X] = 1/λ,  Var(X) = 1/λ².

Exponential random variables often arise as random times. We will see them very often. The following propositions contain some of their most notable properties.

The first property can be interpreted as follows. Suppose that something has a random life span which is exponentially distributed with parameter λ. (For example, the atom of a radioactive element.) Then, having survived for a time t, the probability of surviving for an additional time s is the same probability it had initially to survive for a time s. Thus the system does not keep any memory of the passage of time. To put it differently, if an entity has a life span that is exponentially distributed, its death cannot be due to an aging mechanism since, having survived by time t, the chances of it surviving an extra time s are the same as the chances that it would have survived to time s from the very beginning. More precisely, we have the following proposition.

Proposition 5.2 (Memoryless property) If X ~ Exp(λ), then for any s, t ≥ 0 we have

P(X > s + t | X > t) = P(X > s).

Proof.

P(X > s + t | X > t) = P({X > s + t} ∩ {X > t}) / P(X > t)
                     = P(X > s + t) / P(X > t)
                     = (1 − P(X ≤ s + t)) / (1 − P(X ≤ t))
                     = (1 − F_X(s + t)) / (1 − F_X(t))
                     = (1 − (1 − e^{−λ(s+t)})) / (1 − (1 − e^{−λt}))
                     = e^{−λs}
                     = 1 − (1 − e^{−λs})
                     = 1 − F_X(s)
                     = 1 − P(X ≤ s)
                     = P(X > s).

The next proposition states that the time to the first event of a Poisson process with rate λ is exponentially distributed with parameter λ.

Proposition 5.3 Consider a Poisson process with rate λ. Let T be the time to the first event (after 0). Then T ~ Exp(λ).

32 Proof. Let N t be the number of events in the interval (, t] (for given fixed t > ). Then N t Po(λt). Consider the cumulative distribution function of T : F T (t) = P (T t) = 1 P (T > t) = 1 P (N t = ) = 1 (λt) e λt! = 1 e λt. This is the distribution function of an Exp(λ) random variable, so T Exp(λ). So the time of the first event of a Poisson process is an exponential random variable. Using the independence properties of the Poisson process, it should be clear (more details later) that the time between any two such events has the same exponential distribution. Thus the times between events of the Poisson process are exponential. There is another way of thinking about the Poisson process that this result suggests. For a small time h we have P (T h) h = 1 e λh h = 1 (1 λh) h + O(h2 ) h λ as h. So for very small h, P (T h) is approximately λh and due to the independence property of the Poisson process, this is the probability for any time interval of length h. The Poisson process can therefore be thought of as a process with constant event hazard λ, where the hazard is essentially a measure of event density on the time axis. The exponential distribution with parameter λ can therefore also be reinterpreted as the time to an event of constant hazard λ. The next proposition describes the distribution of the minimum of a collection of independent exponential random variables. Proposition 5.4 Let X i Exp(λ i ), i = 1, 2,..., n, be independent random variables, and define X = min{x 1, X 2,..., X n }. Then X Exp(λ ), where λ = λ 1 + λ λ n. 32

33 Proof. First note that for X Exp(λ) we have P (X > x) = e λx. Then P (X > x) = P (min i {X i } > x) = P ({X 1 > x} {X 2 > x} {X n > x}) n = P (X i > x) = i=1 n i=1 e λix = e x(λ1+ +λn) = e λx. Proposition 5.5 Suppose that X Exp(λ) and Y Exp(µ) are independent random variables. Then P (X < Y ) = λ/(λ + µ). Proof. P (X < Y ) = = = = λ λ + µ. P (X < Y Y = y)f(y)dy P (X < y)f(y)dy (1 e λy )µe µy dy The next result gives the likelihood of a particular exponential random variable of an independent collection being the smallest. Proposition 5.6 Let X i Exp(λ i ), i = 1, 2,..., n be independent random variables and let J be the index of the smallest of the X i. Then J is a discrete random variable with probability mass function where λ = λ λ n. P (J = i) = λ i λ, i = 1, 2,..., n, Proof. For each j, define the random variable Y = min k j {X k } and set λ j = 33

λ − λ_j (the sum of the rates of the other variables). Then

P(J = j) = P(X_j < min_{k≠j} X_k) = P(X_j < Y) = λ_j/(λ_j + (λ − λ_j)) = λ_j/λ.

From the formula for a linear transformation of a random variable we immediately have:

Proposition 5.7 Let X ~ Exp(λ). Then for α > 0, Y = αX has distribution Y ~ Exp(λ/α).

5.10 The Erlang distribution

A continuous random variable X taking values in [0, ∞) is said to have the Erlang distribution if it has PDF

f(x) = λ (λx)^{n−1} e^{−λx} / (n − 1)!.

It can be shown that if T_1, T_2, ..., T_n are independent random variables with a common exponential distribution with parameter λ, then S_n = T_1 + T_2 + ... + T_n has the Erlang distribution with parameters n and λ. It follows from this claim that the expectation and variance of an Erlang random variable are given by

E[X] = n/λ,  Var(X) = n/λ².

5.11 The normal distribution

The normal, or Gaussian, distribution is one of the most important distributions in probability theory. One reason for this is that sums of random variables often approximately follow a normal distribution.

Definition 5.1 A random variable X has a normal distribution with parameters µ and σ², written X ~ N(µ, σ²), if it has probability density function

f_X(x) = (1/(σ√(2π))) exp{ −(1/2)((x − µ)/σ)² }

for −∞ < x < ∞ and σ > 0.

Note that the PDF is symmetric about x = µ, so the median and mean of the distribution are both µ. Checking that the density integrates to 1 requires the well-known integral

∫_{−∞}^∞ e^{−αx²} dx = √(π/α),  α > 0.

We leave the calculation of this and of the variance as an exercise. The result is

E[X] = µ,  Var(X) = σ².

The random variable Z is said to have the standard normal distribution if Z ~ N(0, 1). Therefore, the density of Z, which is usually denoted φ(z), is given by

φ(z) = (1/√(2π)) exp{ −z²/2 }

for −∞ < z < ∞. The cumulative distribution function of a standard normal random variable is denoted Φ(z), and is given by

Φ(z) = ∫_{−∞}^z φ(x) dx.

There is no simple analytic expression for Φ(z) in terms of elementary functions.

Consider Z ~ N(0, 1) and let X = µ + σZ, for σ > 0. Then X ~ N(µ, σ²). Indeed, we know that f_X(x) = (1/σ) φ((x − µ)/σ), from which the claim follows. Conversely, if X ~ N(µ, σ²), then

Z = (X − µ)/σ ~ N(0, 1).

It is also easily shown that the cumulative distribution function satisfies

F_X(x) = Φ((x − µ)/σ),

and so the cumulative probabilities for any normal random variable can be calculated using the tables for the standard normal distribution.

The sum of independent normal random variables is also a normal random variable. This is shown in the following proposition.

Proposition 5.8 If X_1 ~ N(µ_1, σ_1²) and X_2 ~ N(µ_2, σ_2²) are independent normal random variables, then Y = X_1 + X_2 is also normal and Y ~ N(µ_1 + µ_2, σ_1² + σ_2²).

The elementary proof will be left as an exercise. Therefore, any linear combination of independent normal random variables is also a normal random variable. The mean and variance of the resulting random variable can then be calculated from the proposition.
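The scaling and sum properties of normal random variables can be verified numerically with randn, which produces standard normal samples. A small sketch (an addition to the notes; the parameter values are arbitrary):

% Numerical check of the scaling and sum properties of normals (sketch).
N = 100000;
mu1 = 1; s1 = 2; mu2 = -3; s2 = 1;
X1 = mu1 + s1*randn(1,N);        % X1 ~ N(mu1, s1^2)
X2 = mu2 + s2*randn(1,N);        % X2 ~ N(mu2, s2^2), independent of X1
Y  = X1 + X2;
mean(Y)                          % close to mu1 + mu2 = -2
var(Y)                           % close to s1^2 + s2^2 = 5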


Notes on Continuous Random Variables Notes on Continuous Random Variables Continuous random variables are random quantities that are measured on a continuous scale. They can usually take on any value over some interval, which distinguishes

More information

IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem

IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem Time on my hands: Coin tosses. Problem Formulation: Suppose that I have

More information

1 if 1 x 0 1 if 0 x 1

1 if 1 x 0 1 if 0 x 1 Chapter 3 Continuity In this chapter we begin by defining the fundamental notion of continuity for real valued functions of a single real variable. When trying to decide whether a given function is or

More information

Important Probability Distributions OPRE 6301

Important Probability Distributions OPRE 6301 Important Probability Distributions OPRE 6301 Important Distributions... Certain probability distributions occur with such regularity in real-life applications that they have been given their own names.

More information

Example. A casino offers the following bets (the fairest bets in the casino!) 1 You get $0 (i.e., you can walk away)

Example. A casino offers the following bets (the fairest bets in the casino!) 1 You get $0 (i.e., you can walk away) : Three bets Math 45 Introduction to Probability Lecture 5 Kenneth Harris aharri@umich.edu Department of Mathematics University of Michigan February, 009. A casino offers the following bets (the fairest

More information

Chapter 5. Random variables

Chapter 5. Random variables Random variables random variable numerical variable whose value is the outcome of some probabilistic experiment; we use uppercase letters, like X, to denote such a variable and lowercase letters, like

More information

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 18. A Brief Introduction to Continuous Probability

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 18. A Brief Introduction to Continuous Probability CS 7 Discrete Mathematics and Probability Theory Fall 29 Satish Rao, David Tse Note 8 A Brief Introduction to Continuous Probability Up to now we have focused exclusively on discrete probability spaces

More information

Master s Theory Exam Spring 2006

Master s Theory Exam Spring 2006 Spring 2006 This exam contains 7 questions. You should attempt them all. Each question is divided into parts to help lead you through the material. You should attempt to complete as much of each problem

More information

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction CA200 Quantitative Analysis for Business Decisions File name: CA200_Section_04A_StatisticsIntroduction Table of Contents 4. Introduction to Statistics... 1 4.1 Overview... 3 4.2 Discrete or continuous

More information

WHERE DOES THE 10% CONDITION COME FROM?

WHERE DOES THE 10% CONDITION COME FROM? 1 WHERE DOES THE 10% CONDITION COME FROM? The text has mentioned The 10% Condition (at least) twice so far: p. 407 Bernoulli trials must be independent. If that assumption is violated, it is still okay

More information

Lecture 7: Continuous Random Variables

Lecture 7: Continuous Random Variables Lecture 7: Continuous Random Variables 21 September 2005 1 Our First Continuous Random Variable The back of the lecture hall is roughly 10 meters across. Suppose it were exactly 10 meters, and consider

More information

Generating Random Variables and Stochastic Processes

Generating Random Variables and Stochastic Processes Monte Carlo Simulation: IEOR E4703 c 2010 by Martin Haugh Generating Random Variables and Stochastic Processes In these lecture notes we describe the principal methods that are used to generate random

More information

Statistics 100A Homework 8 Solutions

Statistics 100A Homework 8 Solutions Part : Chapter 7 Statistics A Homework 8 Solutions Ryan Rosario. A player throws a fair die and simultaneously flips a fair coin. If the coin lands heads, then she wins twice, and if tails, the one-half

More information

Normal distribution. ) 2 /2σ. 2π σ

Normal distribution. ) 2 /2σ. 2π σ Normal distribution The normal distribution is the most widely known and used of all distributions. Because the normal distribution approximates many natural phenomena so well, it has developed into a

More information

THE CENTRAL LIMIT THEOREM TORONTO

THE CENTRAL LIMIT THEOREM TORONTO THE CENTRAL LIMIT THEOREM DANIEL RÜDT UNIVERSITY OF TORONTO MARCH, 2010 Contents 1 Introduction 1 2 Mathematical Background 3 3 The Central Limit Theorem 4 4 Examples 4 4.1 Roulette......................................

More information

Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment 2-3, Probability and Statistics, March 2015. Due:-March 25, 2015.

Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment 2-3, Probability and Statistics, March 2015. Due:-March 25, 2015. Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment -3, Probability and Statistics, March 05. Due:-March 5, 05.. Show that the function 0 for x < x+ F (x) = 4 for x < for x

More information

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL STATIsTICs 4 IV. RANDOm VECTORs 1. JOINTLY DIsTRIBUTED RANDOm VARIABLEs If are two rom variables defined on the same sample space we define the joint

More information

Monte Carlo Methods in Finance

Monte Carlo Methods in Finance Author: Yiyang Yang Advisor: Pr. Xiaolin Li, Pr. Zari Rachev Department of Applied Mathematics and Statistics State University of New York at Stony Brook October 2, 2012 Outline Introduction 1 Introduction

More information

Introduction to Probability

Introduction to Probability Introduction to Probability EE 179, Lecture 15, Handout #24 Probability theory gives a mathematical characterization for experiments with random outcomes. coin toss life of lightbulb binary data sequence

More information

6 Scalar, Stochastic, Discrete Dynamic Systems

6 Scalar, Stochastic, Discrete Dynamic Systems 47 6 Scalar, Stochastic, Discrete Dynamic Systems Consider modeling a population of sand-hill cranes in year n by the first-order, deterministic recurrence equation y(n + 1) = Ry(n) where R = 1 + r = 1

More information

Generating Random Variables and Stochastic Processes

Generating Random Variables and Stochastic Processes Monte Carlo Simulation: IEOR E4703 Fall 2004 c 2004 by Martin Haugh Generating Random Variables and Stochastic Processes 1 Generating U(0,1) Random Variables The ability to generate U(0, 1) random variables

More information

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers)

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers) Probability and Statistics Vocabulary List (Definitions for Middle School Teachers) B Bar graph a diagram representing the frequency distribution for nominal or discrete data. It consists of a sequence

More information

( ) is proportional to ( 10 + x)!2. Calculate the

( ) is proportional to ( 10 + x)!2. Calculate the PRACTICE EXAMINATION NUMBER 6. An insurance company eamines its pool of auto insurance customers and gathers the following information: i) All customers insure at least one car. ii) 64 of the customers

More information

3 Some Integer Functions

3 Some Integer Functions 3 Some Integer Functions A Pair of Fundamental Integer Functions The integer function that is the heart of this section is the modulo function. However, before getting to it, let us look at some very simple

More information

Practice with Proofs

Practice with Proofs Practice with Proofs October 6, 2014 Recall the following Definition 0.1. A function f is increasing if for every x, y in the domain of f, x < y = f(x) < f(y) 1. Prove that h(x) = x 3 is increasing, using

More information

Practice problems for Homework 11 - Point Estimation

Practice problems for Homework 11 - Point Estimation Practice problems for Homework 11 - Point Estimation 1. (10 marks) Suppose we want to select a random sample of size 5 from the current CS 3341 students. Which of the following strategies is the best:

More information

An Introduction to Basic Statistics and Probability

An Introduction to Basic Statistics and Probability An Introduction to Basic Statistics and Probability Shenek Heyward NCSU An Introduction to Basic Statistics and Probability p. 1/4 Outline Basic probability concepts Conditional probability Discrete Random

More information

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1. MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column

More information

Chapter 3: DISCRETE RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS. Part 3: Discrete Uniform Distribution Binomial Distribution

Chapter 3: DISCRETE RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS. Part 3: Discrete Uniform Distribution Binomial Distribution Chapter 3: DISCRETE RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS Part 3: Discrete Uniform Distribution Binomial Distribution Sections 3-5, 3-6 Special discrete random variable distributions we will cover

More information

This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions.

This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions. Algebra I Overview View unit yearlong overview here Many of the concepts presented in Algebra I are progressions of concepts that were introduced in grades 6 through 8. The content presented in this course

More information

1 The Brownian bridge construction

1 The Brownian bridge construction The Brownian bridge construction The Brownian bridge construction is a way to build a Brownian motion path by successively adding finer scale detail. This construction leads to a relatively easy proof

More information

Random variables P(X = 3) = P(X = 3) = 1 8, P(X = 1) = P(X = 1) = 3 8.

Random variables P(X = 3) = P(X = 3) = 1 8, P(X = 1) = P(X = 1) = 3 8. Random variables Remark on Notations 1. When X is a number chosen uniformly from a data set, What I call P(X = k) is called Freq[k, X] in the courseware. 2. When X is a random variable, what I call F ()

More information

Homework 4 - KEY. Jeff Brenion. June 16, 2004. Note: Many problems can be solved in more than one way; we present only a single solution here.

Homework 4 - KEY. Jeff Brenion. June 16, 2004. Note: Many problems can be solved in more than one way; we present only a single solution here. Homework 4 - KEY Jeff Brenion June 16, 2004 Note: Many problems can be solved in more than one way; we present only a single solution here. 1 Problem 2-1 Since there can be anywhere from 0 to 4 aces, the

More information

Homework set 4 - Solutions

Homework set 4 - Solutions Homework set 4 - Solutions Math 495 Renato Feres Problems R for continuous time Markov chains The sequence of random variables of a Markov chain may represent the states of a random system recorded at

More information

Section 1.3 P 1 = 1 2. = 1 4 2 8. P n = 1 P 3 = Continuing in this fashion, it should seem reasonable that, for any n = 1, 2, 3,..., = 1 2 4.

Section 1.3 P 1 = 1 2. = 1 4 2 8. P n = 1 P 3 = Continuing in this fashion, it should seem reasonable that, for any n = 1, 2, 3,..., = 1 2 4. Difference Equations to Differential Equations Section. The Sum of a Sequence This section considers the problem of adding together the terms of a sequence. Of course, this is a problem only if more than

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Math 541: Statistical Theory II Lecturer: Songfeng Zheng Maximum Likelihood Estimation 1 Maximum Likelihood Estimation Maximum likelihood is a relatively simple method of constructing an estimator for

More information

1.1 Introduction, and Review of Probability Theory... 3. 1.1.1 Random Variable, Range, Types of Random Variables... 3. 1.1.2 CDF, PDF, Quantiles...

1.1 Introduction, and Review of Probability Theory... 3. 1.1.1 Random Variable, Range, Types of Random Variables... 3. 1.1.2 CDF, PDF, Quantiles... MATH4427 Notebook 1 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 2009-2016 by Jenny A. Baglivo. All Rights Reserved. Contents 1 MATH4427 Notebook 1 3 1.1 Introduction, and Review of Probability

More information

MATH 425, PRACTICE FINAL EXAM SOLUTIONS.

MATH 425, PRACTICE FINAL EXAM SOLUTIONS. MATH 45, PRACTICE FINAL EXAM SOLUTIONS. Exercise. a Is the operator L defined on smooth functions of x, y by L u := u xx + cosu linear? b Does the answer change if we replace the operator L by the operator

More information

e.g. arrival of a customer to a service station or breakdown of a component in some system.

e.g. arrival of a customer to a service station or breakdown of a component in some system. Poisson process Events occur at random instants of time at an average rate of λ events per second. e.g. arrival of a customer to a service station or breakdown of a component in some system. Let N(t) be

More information

6.041/6.431 Spring 2008 Quiz 2 Wednesday, April 16, 7:30-9:30 PM. SOLUTIONS

6.041/6.431 Spring 2008 Quiz 2 Wednesday, April 16, 7:30-9:30 PM. SOLUTIONS 6.4/6.43 Spring 28 Quiz 2 Wednesday, April 6, 7:3-9:3 PM. SOLUTIONS Name: Recitation Instructor: TA: 6.4/6.43: Question Part Score Out of 3 all 36 2 a 4 b 5 c 5 d 8 e 5 f 6 3 a 4 b 6 c 6 d 6 e 6 Total

More information

4. Continuous Random Variables, the Pareto and Normal Distributions

4. Continuous Random Variables, the Pareto and Normal Distributions 4. Continuous Random Variables, the Pareto and Normal Distributions A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random

More information

Vector and Matrix Norms

Vector and Matrix Norms Chapter 1 Vector and Matrix Norms 11 Vector Spaces Let F be a field (such as the real numbers, R, or complex numbers, C) with elements called scalars A Vector Space, V, over the field F is a non-empty

More information

CHAPTER 6: Continuous Uniform Distribution: 6.1. Definition: The density function of the continuous random variable X on the interval [A, B] is.

CHAPTER 6: Continuous Uniform Distribution: 6.1. Definition: The density function of the continuous random variable X on the interval [A, B] is. Some Continuous Probability Distributions CHAPTER 6: Continuous Uniform Distribution: 6. Definition: The density function of the continuous random variable X on the interval [A, B] is B A A x B f(x; A,

More information

UNIT I: RANDOM VARIABLES PART- A -TWO MARKS

UNIT I: RANDOM VARIABLES PART- A -TWO MARKS UNIT I: RANDOM VARIABLES PART- A -TWO MARKS 1. Given the probability density function of a continuous random variable X as follows f(x) = 6x (1-x) 0

More information

Constrained optimization.

Constrained optimization. ams/econ 11b supplementary notes ucsc Constrained optimization. c 2010, Yonatan Katznelson 1. Constraints In many of the optimization problems that arise in economics, there are restrictions on the values

More information

Lecture 13: Martingales

Lecture 13: Martingales Lecture 13: Martingales 1. Definition of a Martingale 1.1 Filtrations 1.2 Definition of a martingale and its basic properties 1.3 Sums of independent random variables and related models 1.4 Products of

More information

Random Variate Generation (Part 3)

Random Variate Generation (Part 3) Random Variate Generation (Part 3) Dr.Çağatay ÜNDEĞER Öğretim Görevlisi Bilkent Üniversitesi Bilgisayar Mühendisliği Bölümü &... e-mail : cagatay@undeger.com cagatay@cs.bilkent.edu.tr Bilgisayar Mühendisliği

More information

The Steepest Descent Algorithm for Unconstrained Optimization and a Bisection Line-search Method

The Steepest Descent Algorithm for Unconstrained Optimization and a Bisection Line-search Method The Steepest Descent Algorithm for Unconstrained Optimization and a Bisection Line-search Method Robert M. Freund February, 004 004 Massachusetts Institute of Technology. 1 1 The Algorithm The problem

More information

A few useful MATLAB functions

A few useful MATLAB functions A few useful MATLAB functions Renato Feres - Math 350 Fall 2012 1 Uniform random numbers The Matlab command rand is based on a popular deterministic algorithm called multiplicative congruential method.

More information

ECE302 Spring 2006 HW3 Solutions February 2, 2006 1

ECE302 Spring 2006 HW3 Solutions February 2, 2006 1 ECE302 Spring 2006 HW3 Solutions February 2, 2006 1 Solutions to HW3 Note: Most of these solutions were generated by R. D. Yates and D. J. Goodman, the authors of our textbook. I have added comments in

More information

Principle of Data Reduction

Principle of Data Reduction Chapter 6 Principle of Data Reduction 6.1 Introduction An experimenter uses the information in a sample X 1,..., X n to make inferences about an unknown parameter θ. If the sample size n is large, then

More information

Probability Generating Functions

Probability Generating Functions page 39 Chapter 3 Probability Generating Functions 3 Preamble: Generating Functions Generating functions are widely used in mathematics, and play an important role in probability theory Consider a sequence

More information

Lecture 3: Finding integer solutions to systems of linear equations

Lecture 3: Finding integer solutions to systems of linear equations Lecture 3: Finding integer solutions to systems of linear equations Algorithmic Number Theory (Fall 2014) Rutgers University Swastik Kopparty Scribe: Abhishek Bhrushundi 1 Overview The goal of this lecture

More information

STT315 Chapter 4 Random Variables & Probability Distributions KM. Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables

STT315 Chapter 4 Random Variables & Probability Distributions KM. Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables Discrete vs. continuous random variables Examples of continuous distributions o Uniform o Exponential o Normal Recall: A random

More information

Linear Algebra Notes for Marsden and Tromba Vector Calculus

Linear Algebra Notes for Marsden and Tromba Vector Calculus Linear Algebra Notes for Marsden and Tromba Vector Calculus n-dimensional Euclidean Space and Matrices Definition of n space As was learned in Math b, a point in Euclidean three space can be thought of

More information

Review of Random Variables

Review of Random Variables Chapter 1 Review of Random Variables Updated: January 16, 2015 This chapter reviews basic probability concepts that are necessary for the modeling and statistical analysis of financial data. 1.1 Random

More information

Continued Fractions and the Euclidean Algorithm

Continued Fractions and the Euclidean Algorithm Continued Fractions and the Euclidean Algorithm Lecture notes prepared for MATH 326, Spring 997 Department of Mathematics and Statistics University at Albany William F Hammond Table of Contents Introduction

More information

Feb 28 Homework Solutions Math 151, Winter 2012. Chapter 6 Problems (pages 287-291)

Feb 28 Homework Solutions Math 151, Winter 2012. Chapter 6 Problems (pages 287-291) Feb 8 Homework Solutions Math 5, Winter Chapter 6 Problems (pages 87-9) Problem 6 bin of 5 transistors is known to contain that are defective. The transistors are to be tested, one at a time, until the

More information

2. Discrete random variables

2. Discrete random variables 2. Discrete random variables Statistics and probability: 2-1 If the chance outcome of the experiment is a number, it is called a random variable. Discrete random variable: the possible outcomes can be

More information

Exploratory Data Analysis

Exploratory Data Analysis Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction

More information

Inner Product Spaces

Inner Product Spaces Math 571 Inner Product Spaces 1. Preliminaries An inner product space is a vector space V along with a function, called an inner product which associates each pair of vectors u, v with a scalar u, v, and

More information

Section 6.1 Joint Distribution Functions

Section 6.1 Joint Distribution Functions Section 6.1 Joint Distribution Functions We often care about more than one random variable at a time. DEFINITION: For any two random variables X and Y the joint cumulative probability distribution function

More information

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 13. Random Variables: Distribution and Expectation

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 13. Random Variables: Distribution and Expectation CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 3 Random Variables: Distribution and Expectation Random Variables Question: The homeworks of 20 students are collected

More information

Nonparametric adaptive age replacement with a one-cycle criterion

Nonparametric adaptive age replacement with a one-cycle criterion Nonparametric adaptive age replacement with a one-cycle criterion P. Coolen-Schrijner, F.P.A. Coolen Department of Mathematical Sciences University of Durham, Durham, DH1 3LE, UK e-mail: Pauline.Schrijner@durham.ac.uk

More information

Descriptive Statistics

Descriptive Statistics Y520 Robert S Michael Goal: Learn to calculate indicators and construct graphs that summarize and describe a large quantity of values. Using the textbook readings and other resources listed on the web

More information

Lecture 13 - Basic Number Theory.

Lecture 13 - Basic Number Theory. Lecture 13 - Basic Number Theory. Boaz Barak March 22, 2010 Divisibility and primes Unless mentioned otherwise throughout this lecture all numbers are non-negative integers. We say that A divides B, denoted

More information

Algebra Unpacked Content For the new Common Core standards that will be effective in all North Carolina schools in the 2012-13 school year.

Algebra Unpacked Content For the new Common Core standards that will be effective in all North Carolina schools in the 2012-13 school year. This document is designed to help North Carolina educators teach the Common Core (Standard Course of Study). NCDPI staff are continually updating and improving these tools to better serve teachers. Algebra

More information

Introduction to Queueing Theory and Stochastic Teletraffic Models

Introduction to Queueing Theory and Stochastic Teletraffic Models Introduction to Queueing Theory and Stochastic Teletraffic Models Moshe Zukerman EE Department, City University of Hong Kong Copyright M. Zukerman c 2000 2015 Preface The aim of this textbook is to provide

More information

Probabilistic Strategies: Solutions

Probabilistic Strategies: Solutions Probability Victor Xu Probabilistic Strategies: Solutions Western PA ARML Practice April 3, 2016 1 Problems 1. You roll two 6-sided dice. What s the probability of rolling at least one 6? There is a 1

More information

The Ideal Class Group

The Ideal Class Group Chapter 5 The Ideal Class Group We will use Minkowski theory, which belongs to the general area of geometry of numbers, to gain insight into the ideal class group of a number field. We have already mentioned

More information

Numerical Methods for Option Pricing

Numerical Methods for Option Pricing Chapter 9 Numerical Methods for Option Pricing Equation (8.26) provides a way to evaluate option prices. For some simple options, such as the European call and put options, one can integrate (8.26) directly

More information