# Joint Probability Distributions and Random Samples (Devore Chapter Five)

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Joint Probability Distributions and Random Samples (Devore Chapter Five) Probability and Statistics for Engineers Winter Contents 1 Joint Probability Distributions Two Discrete Random Variables Independence of Random Variables Two Continuous Random Variables Collection of Formulas Example of Double Integration Expected Values, Covariance and Correlation 9 3 Statistics Constructed from Random Variables Random Samples What is a Statistic? Mean and Variance of a Random Sample Sample Variance Explained at Last Linear Combinations of Random Variables Linear Combination of Normal Random Variables The Central Limit Theorem 19 5 Summary of Properties of Sums of Random Variables 22 Copyright 2011, John T. Whelan, and all that 1

2 Tuesday 25 January Joint Probability Distributions Consider a scenario with more than one random variable. For concreteness, start with two, but methods will generalize to multiple ones. 1.1 Two Discrete Random Variables Call the rvs X and Y. The generalization of the pmf is the joint probability mass function, which is the probability that X takes some value x and Y takes some value y: p(x, y) = P ((X = x) (Y = y)) (1.1) Since X and Y have to take on some values, all of the entries in the joint probability table have to sum to 1: p(x, y) = 1 (1.2) x y We can collect the values into a table: Example: problem 5.1: y p(x, y) x This means that for example there is a 2% chance that x = 1 and y = 3. Each combination of values for X and Y is an outcome that occurs with a certain probability. We can combine those into events; e.g., the event (X 1) (Y 1) consists of the outcomes in which (X, Y ) is (0, 0), (0, 1), (1, 0), and (1, 1). If we call this set of (X, Y ) combinations A, the probability of the event is the sum of all of the probabilities for the outcomes in A: P ((X, Y ) A) = p(x, y) (1.3) So, specifically in this case, (x,y) A P ((X 1) (Y 1)) = =.42 (1.4) The events need not correspond to rectangular regions in the table. For instance, the event X < Y corresponds to (X, Y ) combinations of (0, 1), (0, 2), and (1, 2), so P (X < Y ) = =.12 (1.5) Another event you can consider is X = x for some x, regardless of the value of Y. For example, P (X = 1) = =.34 (1.6) 2

3 But of course P (X = x) is just the pmf for X alone; when we obtain it from a joint pmf, we call it a marginal pmf: p X (x) = P (X = x) = p(x, y) (1.7) y and likewise p Y (y) = P (Y = y) = x p(x, y) (1.8) For the example above, we can sum the columns to get the marginal pmf p Y (y): y p Y (y) or sum the rows to get the marginal pmf p X (x): x p X (x) They re apparently called marginal pmfs because you can write the sums of columns and rows in the margins: y p(x, y) p X (x) x p Y (y) Independence of Random Variables Recall that two events A and B are called independent if (and only if) P (A B) = P (A)P (B). That definition extends to random variables: Two random variables X and Y are independent if and only if the events X = x and Y = y are independent for all choices of x and y, i.e., if p(x, y) = p X (x)p Y (y) for all x and y. We can check if this is true for our example. For instance, p X (2)p Y (2) = (.50)(.38) =.19 while p(2, 2) =.30 so p(2, 2) p X (2)p Y (2), which means that X and Y are not independent. (If X and Y were independent, we d have to check that by checking p(x, y) = p X (x)p Y (y) for each possible combination of x and y.) 1.2 Two Continuous Random Variables We now consider the case of two continuous rvs. It s not really convenient to use the cdf like we did for one variable, but we can extend the definition of the pdf by considering the probability that X and Y lie in a tiny box centered on (x, y) with sides x and y. This probability will go to zero as either x or y goes to zero, but if we divide by x y we 3

4 get something which remains finite. This is the joint pdf : P (( x x f(x, y) = lim < X < x + ) ( x 2 2 y y < Y < y + )) y 2 2 x, y 0 x y (1.9) We can construct from this the probability that (X, Y ) will lie within a certain region P ((X, Y ) A) f(x, y) x y (1.10) (x,y) A What is that, exactly? First, consider special case of a rectangular region where a < x < b and c < y < d. Divide it into M pieces in the x direction and N pieces in the y direction: Now, so M N P ((a < X < b) (c < Y < d)) f(x i, y j ) x y (1.11) N f(x i, y j ) y d j=1 c j=1 f(x i, y) dy (1.12) 4

5 P ((a < X < b) (c < Y < d)) M d c f(x i, y) dy x b a ( d c ) f(x, y) dy dx (1.13) This is a double integral. We integrate with respect to y and then integrate with respect to x. Note that there was nothing special about the order we approximated the sums as integrals. We could also have taken the limit as the x spacing went to zero first, and then found 5

6 P ((a < X < b) (c < Y < d)) N j=1 d c M f(x i, y j ) x y ( b a ) f(x, y) dx dy N j=1 b a f(x, y j ) dx y (1.14) As long as we integrate over the correct region of the x, y plane, it doesn t matter in what order we do it. All of these approximations get better and better as x and y go to zero, so the expression involving the integrals is actually exact: P ((a < X < b) (c < Y < d)) = b a ( d c ) f(x, y) dy dx = d c ( b a ) f(x, y) dx dy (1.15) The normalization condition is that the probability density for all values of X and Y has to integrate up to 1: Practice Problems 5.1, 5.3, 5.9, 5.13, 5.19, 5.21 f(x, y) dx dy = 1 (1.16) 6

7 Thursday 27 January Collection of Formulas Discrete Continuous x P([X=x± 2 ] [Y =y± Definition p(x, y) = P ([X = x] [Y = y]) f(x, y) y 2 ]) x y Normalization x y p(x, y) = 1 f(x, y) dx dy = 1 Region A P ((X, Y ) A) = p(x, y) P ((X, Y ) A) = f(x, y) dx dy (x,y) A (x,y) A Marginal p X (x) = y p(x, y) f X(x) = f(x, y) dy X & Y indep iff p(x, y) = p X (x)p Y (y) for all x, y f(x, y) = f X (x)f Y (y) for all x, y 1.4 Example of Double Integration Last time, we calculated the probability that a pair of continuous random variables X and Y lie within a rectangular region. Now let s consider how we d integrate to get the probability that (X, Y ) lie in a less simple region, specifically X < Y. For simplicity, assume that the joint pdf f(x, y) is non-zero only for 0 x 1 and 0 y 1. So we want to integrate P (X < Y ) = f(x, y) dx dy (1.17) x<y The region we want to integrate over looks like this: If we do the y integral first, that means that, for each value of x, we need work out the range of y values. This means slicing up the triangle along lines of constant x and integrating in the y direction for each of those: 7

8 The conditions for a given x are y 0, y > x, and y 1. So the minimum possible y is x and the maximum is 1, and the integral is 1 ( 1 ) P (X < Y ) = f(x, y) dy dx (1.18) 0 We integrate x over the full range of x, from 0 to 1, because there is a strip present for each of those xs. Note that the limits of the y integral depend on x, which is fine because the y integral is inside the x integral. If, on the other hand, we decided to do the x integral first, we d be slicing up into slices of constant x and considering the range of x values for each possible y: x 8

9 Now, given a value of y, the restrictions on x are x 0, x < y, and x 1, so the minimum x is 0 and the maximum is y, which makes the integral 1 ( y ) P (X < Y ) = f(x, y) dx dy (1.19) 0 The integral over y is over the whole range from 0 to 1 because there is a constant-y strip for each of those values. Note that the limits on the integrals are different depending on which order the integrals are done. The fact that we get the same answer (which should be apparent from the fact that we re covering all the points of the same region) is apparently called Fubini s theorem. 2 Expected Values, Covariance and Correlation We can extend further concepts to the realm of multiple random variables. For instance, the expected value of any function h(x, Y ) which can be constructed from random variables X and Y is taken by multiplying the value of the function corresponding to each outcome by the probability of that outcome: E(h(X, Y )) = h(x, y) p(x, y) (2.1) x y In the case of continuous rvs, we replace the pmf with the pdf and the sums with integrals: E(h(X, Y )) = 9 0 h(x, y) f(x, y) dx dy (2.2)

10 (All of the properties we ll show in this section work for both discrete and continuous distributions, but we ll do the explicit demonstrations in one case; the other version can be produced via a straightforward conversion.) In particular, we can define the mean value of each of the random variables as before and likewise the variance: µ X = E(X) (2.3a) µ Y = E(Y ) (2.3b) σ 2 X = V (X) = E((X µ X ) 2 ) = E(X 2 ) µ 2 X σ 2 Y = V (Y ) = E((Y µ Y ) 2 ) = E(Y 2 ) µ 2 Y (2.4a) (2.4b) Note that each of these is a function of only one variable, and we can calculate the expected value of such a function without having to use the full joint probability distribution, since e.g., E(h(X)) = h(x) p(x, y) = h(x) p(x, y) = h(x) p X (x). (2.5) x y x y x Which makes sense, since the marginal pmf p X (x) = P (X = x) is just the probability distribution we d assign to X if we didn t know or care about Y. A useful expected value which is constructed from a function of both variables is the covariance. Instead of squaring (X µ X ) or (Y µ Y ), we multiply them by each other: Cov(X, Y ) = E([X µ X ][Y µ Y ]) = E(XY ) µ X µ Y (2.6) It s worth going explicitly through the derivation of the shortcut formula: E([X µ X ][Y µ Y ]) = E(XY µ X Y µ Y X + µ X µ Y ) = E(XY ) µ X E(Y ) µ Y E(X) + µ X µ Y (2.7) = E(XY ) µ X µ Y µ Y µ X + µ X µ Y We can show that the covariance of two independent random variables is zero: E(XY ) = x y p(x, y) = x y p X (x) p Y (y) = ( ( )) x p X (x) y p Y (y) x y x y x y = x p X (x)µ Y = µ Y x p X (x) = µ X µ Y if X and Y are independent x x (2.8) The converse is, however, not true. We can construct a probability distribution in which X and Y are not independent but their covariance is zero: 10

11 y p(x, y) p X (x) x p Y (y) From the form of the marginal pmfs, we can see µ X = 0 = µ Y, and if we calculate E(XY ) = x y p(x, y) (2.9) x y we see that for each x, y combination for which p(x, y) 0, either x or y is zero, and so Cov(X, Y ) = E(XY ) = 0. Unlike variance, covariance can be positive or negative. For example, consider two rvs with the joint pmf y p(x, y) p X (x) x p Y (y) Since we have the same marginal pmfs as before, µ X µ Y = 0, and Cov(X, Y ) = E(XY ) = x y p(x, y) = ( 1)( 1)(.2) + (1)(1)(.2) =.4 (2.10) x y The positive covariance means X tends to be positive when Y is positive and negative when Y is negative. On the other hand if the pmf is y p(x, y) p X (x) x p Y (y) which again has µ X µ Y = 0, the covariance is Cov(X, Y ) = E(XY ) = x y p(x, y) = ( 1)(1)(.2) + (1)( 1)(.2) =.4 (2.11) x y One drawback of covariance is that, like variance, it depends on the units of the quantities considered. If the numbers above were in feet, the covariance would really be.4 ft 2 ; if you 11

12 converted X and Y into inches, so that 1 ft became 12 in, the covariance would become 57.6 (actually 57.6 in 2 ). But there wouldn t really be any increase in the degree to which X and Y are correlated; the change of units would just have spread things out. The same thing would happen to the variances of each variable; the covariance is measuring the spread of each variable along with the correlation between them. To isolate a measure of how correlated the variables are, we can divide by the product of their standard deviations and define something called the correlation coëfficient: ρ X,Y = Cov(X, Y ) σ X σ Y (2.12) To see why the product of the standard deviations is the right thing, suppose that X and Y have different units, e.g., X is in inches and Y is in seconds. Then µ X is in inches, σx 2 is in inches squared, and σ X is in inches. Similar arguments show that µ Y and σ Y are in seconds, and thus Cov(X, Y ) is in inches times seconds, so dividing by σ X σ Y makes all of the units cancel out. Exercise: work out σ X, σ Y, and ρ X,Y for each of these examples. Practice Problems 5.25, 5.27, 5.31, 5.33, 5.37,

13 Tuesday 1 February Statistics Constructed from Random Variables 3.1 Random Samples We ve done explicit calculations so far on pairs of random variables, but of course everything we ve done extends to the general situation where we have n random variables, which we ll call X 1, X 2,... X n. Depending on whether the rvs are discrete or continuous, there is a joint pmf or pdf p(x 1, x 2,..., x n ) or f(x 1, x 2,... x n ) (3.1) from which probabilities and expected values can be calculated in the usual way. A special case of this is if the n random variables are independent. Then the pmf or pdf factors, e.g., for the case of a continuous rv, X 1, X 2,... X n independent means f(x 1, x 2,... x n ) = f X1 (x 1 )f X2 (x 2 ) f Xn (x n ) (3.2) An even more special case is when each of the rvs follows the same distribution, which we can then write as e.g., f(x 1 ) rather than f X1 (x 1 ). Then we say the n rvs are independent and identically distributed or iid. Again, writing this explicitly for the continuous case, X 1, X 2,... X n iid means f(x 1, x 2,... x n ) = f(x 1 )f(x 2 ) f(x n ) (3.3) This is a useful enough concept that it has a name, and we call it a random sample of size n drawn from the distribution with pdf f(x). For example, if we roll a die fifteen times, we can make statements about those fifteen numbers taken together. 3.2 What is a Statistic? Often we ll want to combine the n numbers X 1, X 2,... X n into some quantity that tells us something about the sample. For example, we might be interested in the average of the numbers, or their sum, or the maximum among them. Any such quantity constructed out of a set of random variables is called a statistic. (It s actually a little funny to see statistic used in the singular, since one thinks of the study of statistics as something like physics or mathematics, and we don t talk about a physic or a mathematic.) A statistic constructed from random variables is itself a random variable. 13

14 3.3 Mean and Variance of a Random Sample Remember that if we had a specific set of n numbers x 1, x 2,... x n, we could define the sample mean and sample variance as x = 1 n s 2 = 1 n 1 n x i (3.4a) n (x i x) 2 (3.4b) On the other hand, if we have a single random variable X, its mean and variance are calculated with the expected value: µ X = E(X) (3.5a) σ 2 X = E([X µ X ] 2 ) (3.5b) So, given a set of n random variables, we could combine them in the same ways we combine sample points, and define the mean and variance of those n numbers X = 1 n X i n (3.6a) S 2 = 1 n (X i X) 2 n 1 (3.6b) Each of these is a statistic, and each is a random variable, which we stress by writing it with a capital letter. We could then ask about quantities derived from the probability distributions of the statistics, for example µ X = E(X) (3.7a) (σ X ) 2 = V (X) = E([X µ X ] 2 ) (3.7b) µ S 2 = E(S 2 ) (3.7c) (σ S 2) 2 = V (S 2 ) = E([S 2 µ S 2] 2 ) (3.7d) Of special interest is the case where the {X i } are an iid random sample. We can actually work the means and variances just by using the fact that the expected value is a linear operator. So ( ) 1 n µ X = E(X) = E X i = 1 n E(X i ) = 1 n µ X = µ X (3.8) n n n which is kind of what we d expect: the average of n iid random variables has an expected value which is equal to the mean of the underlying distribution. But we can also consider 14

15 what the variance of the mean is. I.e., on average, how far away will the mean calculated from a random sample be from the mean of the underlying distribution? ( ) [ ] 2 (σ X ) 2 1 n = V X i = E 1 n X i µ X (3.9) n n It s easy to get a little lost squaring the sum, but we can see the essential point of what happens in the case where n = 2: ( [1 ] ) ( 2 [1 (σ X ) 2 = E 2 (X 1 + X 2 ) µ X = E 2 (X 1 µ X ) + 1 ] ) 2 2 (X 2 µ X ) ( 1 = E 4 (X 1 µ X ) (X 1 µ X )(X 2 µ X ) + 1 ) (3.10) 4 (X 2 µ X ) 2 Since the expected value operation is linear, this is (σ X ) 2 = 1 4 E([X 1 µ X ] 2 ) E([X 1 µ X ][X 2 µ X ]) E([X 2 µ X ] 2 ) = 1 4 V (X 1) Cov(X 1, X 2 ) V (X 2) (3.11) But since X 1 and X 2 are independent random variables, their covariance is zero, and (σ X ) 2 = 1 4 σ2 X σ2 X = 1 2 σ2 X (3.12) The same thing works for n iid random variables: the cross terms are all covariances which 1 equal zero, and we get n copies of σ 2 n 2 X so that (σ X ) 2 = 1 n σ2 X (3.13) This means that if you have a random sample of size n, the sample mean will be a better estimate of the underlying population mean the larger n is. (There are assorted anecdotal examples of this in Devore.) Note that these statements can also be made about the sum T o = X 1 + X X n = n X (3.14) rather than the mean, and they are perhaps easier to remember: µ To = E(T o ) = nµ X (3.15a) (σ To ) 2 = V (T o ) = nσ 2 X (3.15b) 15

16 3.4 Sample Variance Explained at Last We now have the machinery needed to show that µ S 2 = σx 2. This result is what validates 1 using the factor of rather than 1 in the definition of sample variance way back at the n 1 n beginning of the quarter, so it s worth doing as a demonstration of the power of the formalism of random samples. The sample variance S 2 = 1 n (X i X) 2 (3.16) n 1 is a random variable. Its mean value is ( ) E(S 2 1 n ) = E (X i X) 2 n 1 If we write = 1 n 1 n E ( [X i X] 2) (3.17) X i X = (X i µ X ) (X µ X ) (3.18) we can look at the expected value associated with each term in the sum: E ( [X i X] 2) ( [(Xi = E µ X ) (X µ X ) ] ) 2 = E ( [(X i µ X )] 2) 2E ( [X i µ X ] [ ]) ( [X ] 2 ) (3.19) X µ X + E µx The first and last terms are just the variance of X i and X, respectively: E ( [(X i µ X )] 2) = V (X i ) = σ 2 X (3.20) E ( [(X µx ) ] 2 ) = V (X) = σ2 X n (3.21) To evaluate the middle term, we can expand out X to get E ( [X i µ X ] [ X µ X ]) = 1 n n E ([X i µ X ] [X j µ X ]) (3.22) Since the rvs in the sample are independent, the covariance between different rvs is zero: { 0 i j E ([X i µ X ] [X j µ X ]) = Cov(X i, X j ) = σx 2 i = j (3.23) j=1 Since there is only one term in the sum for which j = i, n E ([X i µ X ] [X j µ X ]) = σx 2 (3.24) j=1 16

17 and Putting it all together we get and thus Q.E.D.! Practice Problems 5.37, 5.39, 5.41, 5.45, 5.49 E ( [X i µ X ] [ X µ X ]) = σ 2 X n E ( [X i X] 2) = σ 2 X 2 σ2 X n + σ2 X n = E(S 2 ) = 1 n 1 Thursday 3 February Linear Combinations of Random Variables n (3.25) ( 1 1 ) σx 2 = n 1 n n σ2 X (3.26) n 1 n σ2 X = σ 2 X (3.27) These results about the means of a random sample are special cases of general results from section 5.5, where we consider a general linear combination of any n rvs (not necessarily iid) and show that and Y = a 1 X 1 + a 2 X 2 + a n X n = n a i X i (3.28) µ Y = E(Y ) = a 1 E(X 1 ) + a 2 E(X 2 ) + + a n E(X n ) = a 1 µ 1 + a 2 µ a n µ n (3.29) if X 1,..., X n independent, V (Y ) = a 1 2 V (X 1 ) + a 2 2 V (X 2 ) + + a n 2 V (X n ) (3.30) The first result follows from linearity of the expected value; the second can be illustrated for n = 2 as before: V (a 1 X 1 + a 2 X 2 ) = E ( [(a 1 X 1 + a 2 X 2 ) µ Y ] 2) = E ( [(a 1 X 1 + a 2 X 2 ) (a 1 µ 1 + a 2 µ 2 )] 2) = E ( [a 1 (X 1 µ 1 ) + a 2 (X 2 µ 2 )] 2) = E ( a 1 2 (X 1 µ 1 ) 2 + 2a 1 a 2 (X 1 µ 1 )(X 2 µ 2 ) + a 2 2 (X 2 µ 2 ) 2) = a 1 2 V (X 1 ) + 2a 1 a 2 Cov(X 1, X 2 ) + a 2 2 V (X 2 ) and then the cross term vanishes if X 1 and X 2 are independent. 17 (3.31)

18 3.6 Linear Combination of Normal Random Variables To specify the distribution of a linear combination of random variables, beyond its mean and variance, you have to know something about the distributions of the individual variables. One useful result is that a linear combination of independent normally-distributed random variables is itself normally distributed. We ll show this for the special case of the sum of two independent standard normal random variables: Y = Z 1 + Z 2 (3.32) How do we get the pdf p(y )? Well, for one rv, we can fall back on our trick of going via the cumulative distribution function F (y) = P (Y y) = P (Z 1 + Z 2 y) (3.33) Now, Z 1 + Z 2 y is just an event which defines a region in the z 1, z 2 plane, so we can evaluate its probability as F (y) = z 1 +z 2 y f(z 1, z 2 ) = y z2 e z 1 2 /2 2 /2 e z 2 dz 1 dz 2 (3.34) 2π 2π Where the limits of integration for the z 1 integral come from the condtion z 1 y z 2. Since z 1 and z 2 are just integration variables, let s rename them to w and z to reduce the amount of writing we ll have to do: y z e w2 /2 e z2 /2 /2 F (y) = dw dz = Φ (y z) e z2 dz (3.35) 2π 2π 2π We ve used the usual definition of the standard normal cdf to do the w integral, but now we re sort of stuck with the z integral. But if we just want the pdf, we can differentiate with respect to y, using d e (y z)2/2 Φ (y z) = (3.36) dy 2π so that f(y) = F (y) = The argument of the exponential is (y z) 2 z 2 If we define a new integration variable u by 2 e (y z)2 /2 e z2 /2 ( ) 1 (y z) 2 dz = 2π 2π 2π exp z 2 dz (3.37) 2 = z 2 + yz 1 ( 2 y2 = z y ) y2 1 2 y2 (3.38) u 2 = z y 2 (3.39) 18

19 so that dz = du/ 2, the integral becomes ) 1 F (y) = ( 2π exp u2 2 y2 du = 4 1 = e y2 /(2[ 2] 2 ) 2 2π 1 e y2 /4 2 2π 1 2π e u2 /2 du (3.40) which is just the pdf for a normal random variable with mean 0+0 = 0 and variance 1+1 = 2. The demonstration a linear combination of standard normal rvs, or for the sum of normal rvs with different means, is similar, but there s a little more algebra to keep track of the different σs. 4 The Central Limit Theorem We have shown that if you add a bunch of independent random variables, the resulting statistic has a mean which is the sum of the individual means and a variance which is the sum of the individual variances. There is remarkable theorem which means that if you add up enough iid random variables, the mean and variance are all you need to know. This is known as the Central Limit Theorem: If X 1, X 2,... X n are independent random variables each from the same distribution with mean µ and variance σx 2, their sum T o = X 1 + X X n has a distribution which is approximately a normal distribution with mean nµ and variance nσx 2, with the approximation being better the larger n is. The same can also be said for the sample mean X = T o /n, but now the mean is µ and the variance is σx 2 /n. As a rule of thumb, the central limit theorem applies for n 30. We have actually already used the central limit theorem when approximating a binomial distribution with the corresponding normal distribution. A binomial rv can be thought of as the sum of n Bernoulli random variables. 19

20 You may also have seen the central limit theorem in action if you ve considered the pmf for the results of rolling several six-sided dice and adding the results. With one die, the results are uniformly distributed: If we roll two dice and add the results, we get a non-uniform distribution, with results close to 7 being most likely, and the probability distribution declining linearly from there: 20

21 If we add three dice, the distribution begins to take on the shape of a bell curve and in fact such a random distribution is used to approximate the distribution of human physical properties in some role-playing games: Adding more and more dice produces histograms that look more and more like a normal distribution: 21

22 5 Summary of Properties of Sums of Random Variables Property When is it true? E(T o ) = n E(X i) Always V (T o ) = n V (X i) When {X i } independent Exact, when {X T o normally distributed i } normally distributed Approximate, when n 30 (Central Limit Theorem) Practice Problems 5.55, 5.57, 5.61, 5.65, 5.67,

### Random Variables, Expectation, Distributions

Random Variables, Expectation, Distributions CS 5960/6960: Nonparametric Methods Tom Fletcher January 21, 2009 Review Random Variables Definition A random variable is a function defined on a probability

### Joint Probability Distributions and Random Samples. Week 5, 2011 Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage

5 Joint Probability Distributions and Random Samples Week 5, 2011 Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage Two Discrete Random Variables The probability mass function (pmf) of a single

### MATH 201. Final ANSWERS August 12, 2016

MATH 01 Final ANSWERS August 1, 016 Part A 1. 17 points) A bag contains three different types of dice: four 6-sided dice, five 8-sided dice, and six 0-sided dice. A die is drawn from the bag and then rolled.

### Jointly Distributed Random Variables

Jointly Distributed Random Variables COMP 245 STATISTICS Dr N A Heard Contents 1 Jointly Distributed Random Variables 1 1.1 Definition......................................... 1 1.2 Joint cdfs..........................................

### Review Exam Suppose that number of cars that passes through a certain rural intersection is a Poisson process with an average rate of 3 per day.

Review Exam 2 This is a sample of problems that would be good practice for the exam. This is by no means a guarantee that the problems on the exam will look identical to those on the exam but it should

### Bivariate Distributions

Chapter 4 Bivariate Distributions 4.1 Distributions of Two Random Variables In many practical cases it is desirable to take more than one measurement of a random observation: (brief examples) 1. What is

### Probability & Statistics Primer Gregory J. Hakim University of Washington 2 January 2009 v2.0

Probability & Statistics Primer Gregory J. Hakim University of Washington 2 January 2009 v2.0 This primer provides an overview of basic concepts and definitions in probability and statistics. We shall

### 2.8 Expected values and variance

y 1 b a 0 y 0.0 0.2 0.4 0.6 0.8 1.0 1.2 0 a b (a) Probability density function x 0 a b (b) Cumulative distribution function x Figure 2.3: The probability density function and cumulative distribution function

### MT426 Notebook 3 Fall 2012 prepared by Professor Jenny Baglivo. 3 MT426 Notebook 3 3. 3.1 Definitions... 3. 3.2 Joint Discrete Distributions...

MT426 Notebook 3 Fall 2012 prepared by Professor Jenny Baglivo c Copyright 2004-2012 by Jenny A. Baglivo. All Rights Reserved. Contents 3 MT426 Notebook 3 3 3.1 Definitions............................................

### Definition The covariance of X and Y, denoted by cov(x, Y ) is defined by. cov(x, Y ) = E(X µ 1 )(Y µ 2 ).

Correlation Regression Bivariate Normal Suppose that X and Y are r.v. s with joint density f(x y) and suppose that the means of X and Y are respectively µ 1 µ 2 and the variances are 1 2. Definition The

### Covariance and Correlation. Consider the joint probability distribution f XY (x, y).

Chapter 5: JOINT PROBABILITY DISTRIBUTIONS Part 2: Section 5-2 Covariance and Correlation Consider the joint probability distribution f XY (x, y). Is there a relationship between X and Y? If so, what kind?

### 3 Multiple Discrete Random Variables

3 Multiple Discrete Random Variables 3.1 Joint densities Suppose we have a probability space (Ω, F,P) and now we have two discrete random variables X and Y on it. They have probability mass functions f

### ST 371 (VIII): Theory of Joint Distributions

ST 371 (VIII): Theory of Joint Distributions So far we have focused on probability distributions for single random variables. However, we are often interested in probability statements concerning two or

### Monte Carlo Method: Probability

John (ARC/ICAM) Virginia Tech... Math/CS 4414: The Monte Carlo Method: PROBABILITY http://people.sc.fsu.edu/ jburkardt/presentations/ monte carlo probability.pdf... ARC: Advanced Research Computing ICAM:

### Continuous Distributions

MAT 2379 3X (Summer 2012) Continuous Distributions Up to now we have been working with discrete random variables whose R X is finite or countable. However we will have to allow for variables that can take

### MULTIVARIATE PROBABILITY DISTRIBUTIONS

MULTIVARIATE PROBABILITY DISTRIBUTIONS. PRELIMINARIES.. Example. Consider an experiment that consists of tossing a die and a coin at the same time. We can consider a number of random variables defined

### Conditional expectation

A Conditional expectation A.1 Review of conditional densities, expectations We start with the continuous case. This is sections 6.6 and 6.8 in the book. Let X, Y be continuous random variables. We defined

### POL 571: Expectation and Functions of Random Variables

POL 571: Expectation and Functions of Random Variables Kosuke Imai Department of Politics, Princeton University March 10, 2006 1 Expectation and Independence To gain further insights about the behavior

### 3.6: General Hypothesis Tests

3.6: General Hypothesis Tests The χ 2 goodness of fit tests which we introduced in the previous section were an example of a hypothesis test. In this section we now consider hypothesis tests more generally.

### Discrete and Continuous Random Variables. Summer 2003

Discrete and Continuous Random Variables Summer 003 Random Variables A random variable is a rule that assigns a numerical value to each possible outcome of a probabilistic experiment. We denote a random

### Continuous Random Variables

Continuous Random Variables COMP 245 STATISTICS Dr N A Heard Contents 1 Continuous Random Variables 2 11 Introduction 2 12 Probability Density Functions 3 13 Transformations 5 2 Mean, Variance and Quantiles

### 6. Jointly Distributed Random Variables

6. Jointly Distributed Random Variables We are often interested in the relationship between two or more random variables. Example: A randomly chosen person may be a smoker and/or may get cancer. Definition.

### Random Variables of The Discrete Type

Random Variables of The Discrete Type Definition: Given a random experiment with an outcome space S, a function X that assigns to each element s in S one and only one real number x X(s) is called a random

### Expectation Discrete RV - weighted average Continuous RV - use integral to take the weighted average

PHP 2510 Expectation, variance, covariance, correlation Expectation Discrete RV - weighted average Continuous RV - use integral to take the weighted average Variance Variance is the average of (X µ) 2

### Joint Distributions. Tieming Ji. Fall 2012

Joint Distributions Tieming Ji Fall 2012 1 / 33 X : univariate random variable. (X, Y ): bivariate random variable. In this chapter, we are going to study the distributions of bivariate random variables

### This is a good time to refresh your memory on double-integration. We will be using this skill in the upcoming lectures.

Chapter 5: JOINT PROBABILITY DISTRIBUTIONS Part 1: Sections 5-1.1 to 5-1.4 For both discrete and continuous random variables we will discuss the following... Joint Distributions (for two or more r.v. s)

### Change of Continuous Random Variable

Change of Continuous Random Variable All you are responsible for from this lecture is how to implement the Engineer s Way (see page 4) to compute how the probability density function changes when we make

### L10: Probability, statistics, and estimation theory

L10: Probability, statistics, and estimation theory Review of probability theory Bayes theorem Statistics and the Normal distribution Least Squares Error estimation Maximum Likelihood estimation Bayesian

### For a partition B 1,..., B n, where B i B j = for i. A = (A B 1 ) (A B 2 ),..., (A B n ) and thus. P (A) = P (A B i ) = P (A B i )P (B i )

Probability Review 15.075 Cynthia Rudin A probability space, defined by Kolmogorov (1903-1987) consists of: A set of outcomes S, e.g., for the roll of a die, S = {1, 2, 3, 4, 5, 6}, 1 1 2 1 6 for the roll

### Statistiek (WISB361)

Statistiek (WISB361) Final exam June 29, 2015 Schrijf uw naam op elk in te leveren vel. Schrijf ook uw studentnummer op blad 1. The maximum number of points is 100. Points distribution: 23 20 20 20 17

### Topic 4: Multivariate random variables. Multiple random variables

Topic 4: Multivariate random variables Joint, marginal, and conditional pmf Joint, marginal, and conditional pdf and cdf Independence Expectation, covariance, correlation Conditional expectation Two jointly

### WEEK #22: PDFs and CDFs, Measures of Center and Spread

WEEK #22: PDFs and CDFs, Measures of Center and Spread Goals: Explore the effect of independent events in probability calculations. Present a number of ways to represent probability distributions. Textbook

### Statistics 100 Binomial and Normal Random Variables

Statistics 100 Binomial and Normal Random Variables Three different random variables with common characteristics: 1. Flip a fair coin 10 times. Let X = number of heads out of 10 flips. 2. Poll a random

### Examination 110 Probability and Statistics Examination

Examination 0 Probability and Statistics Examination Sample Examination Questions The Probability and Statistics Examination consists of 5 multiple-choice test questions. The test is a three-hour examination

### The Expected Value of X

3.3 Expected Values The Expected Value of X Definition Let X be a discrete rv with set of possible values D and pmf p(x). The expected value or mean value of X, denoted by E(X) or μ X or just μ, is 2 Example

### ECE302 Spring 2006 HW7 Solutions March 11, 2006 1

ECE32 Spring 26 HW7 Solutions March, 26 Solutions to HW7 Note: Most of these solutions were generated by R. D. Yates and D. J. Goodman, the authors of our textbook. I have added comments in italics where

### EE 322: Probabilistic Methods for Electrical Engineers. Zhengdao Wang Department of Electrical and Computer Engineering Iowa State University

EE 322: Probabilistic Methods for Electrical Engineers Zhengdao Wang Department of Electrical and Computer Engineering Iowa State University Discrete Random Variables 1 Introduction to Random Variables

### Chapters 5. Multivariate Probability Distributions

Chapters 5. Multivariate Probability Distributions Random vectors are collection of random variables defined on the same sample space. Whenever a collection of random variables are mentioned, they are

### We have discussed the notion of probabilistic dependence above and indicated that dependence is

1 CHAPTER 7 Online Supplement Covariance and Correlation for Measuring Dependence We have discussed the notion of probabilistic dependence above and indicated that dependence is defined in terms of conditional

### Lesson 5 Chapter 4: Jointly Distributed Random Variables

Lesson 5 Chapter 4: Jointly Distributed Random Variables Department of Statistics The Pennsylvania State University 1 Marginal and Conditional Probability Mass Functions The Regression Function Independence

### Summary of Probability

Summary of Probability Mathematical Physics I Rules of Probability The probability of an event is called P(A), which is a positive number less than or equal to 1. The total probability for all possible

### Lecture Note 7 Random Sample. MIT Spring 2006 Herman Bennett

Lecture Note 7 Random Sample MIT 14.30 Spring 2006 Herman Bennett 17 Definitions 17.1 Random Sample Let X 1,..., X n be mutually independent RVs such that f Xi (x) = f Xj (x) i = j. Denote f Xi (x) = f(x).

### Chapter 3: Discrete Random Variable and Probability Distribution. January 28, 2014

STAT511 Spring 2014 Lecture Notes 1 Chapter 3: Discrete Random Variable and Probability Distribution January 28, 2014 3 Discrete Random Variables Chapter Overview Random Variable (r.v. Definition Discrete

### Worked examples Multiple Random Variables

Worked eamples Multiple Random Variables Eample Let X and Y be random variables that take on values from the set,, } (a) Find a joint probability mass assignment for which X and Y are independent, and

### Lecture Notes 1. Brief Review of Basic Probability

Probability Review Lecture Notes Brief Review of Basic Probability I assume you know basic probability. Chapters -3 are a review. I will assume you have read and understood Chapters -3. Here is a very

### Covariance and Correlation

Covariance and Correlation ( c Robert J. Serfling Not for reproduction or distribution) We have seen how to summarize a data-based relative frequency distribution by measures of location and spread, such

### 3. Continuous Random Variables

3. Continuous Random Variables A continuous random variable is one which can take any value in an interval (or union of intervals) The values that can be taken by such a variable cannot be listed. Such

### Continuous Random Variables and Probability Distributions. Stat 4570/5570 Material from Devore s book (Ed 8) Chapter 4 - and Cengage

4 Continuous Random Variables and Probability Distributions Stat 4570/5570 Material from Devore s book (Ed 8) Chapter 4 - and Cengage Continuous r.v. A random variable X is continuous if possible values

### University of California, Berkeley, Statistics 134: Concepts of Probability

University of California, Berkeley, Statistics 134: Concepts of Probability Michael Lugo, Spring 211 Exam 2 solutions 1. A fair twenty-sided die has its faces labeled 1, 2, 3,..., 2. The die is rolled

### 10 BIVARIATE DISTRIBUTIONS

BIVARIATE DISTRIBUTIONS After some discussion of the Normal distribution, consideration is given to handling two continuous random variables. The Normal Distribution The probability density function f(x)

### Lecture 7: Continuous Random Variables

Lecture 7: Continuous Random Variables 21 September 2005 1 Our First Continuous Random Variable The back of the lecture hall is roughly 10 meters across. Suppose it were exactly 10 meters, and consider

### Section 6.1 Joint Distribution Functions

Section 6.1 Joint Distribution Functions We often care about more than one random variable at a time. DEFINITION: For any two random variables X and Y the joint cumulative probability distribution function

### Contents. 3 Survival Distributions: Force of Mortality 37 Exercises Solutions... 51

Contents 1 Probability Review 1 1.1 Functions and moments...................................... 1 1.2 Probability distributions...................................... 2 1.2.1 Bernoulli distribution...................................

### Joint Exam 1/P Sample Exam 1

Joint Exam 1/P Sample Exam 1 Take this practice exam under strict exam conditions: Set a timer for 3 hours; Do not stop the timer for restroom breaks; Do not look at your notes. If you believe a question

### To give it a definition, an implicit function of x and y is simply any relationship that takes the form:

2 Implicit function theorems and applications 21 Implicit functions The implicit function theorem is one of the most useful single tools you ll meet this year After a while, it will be second nature to

### Probabilities for machine learning

Probabilities for machine learning Roland Memisevic February 2, 2006 Why probabilities? One of the hardest problems when building complex intelligent systems is brittleness. How can we keep tiny irregularities

### FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL STATIsTICs 4 IV. RANDOm VECTORs 1. JOINTLY DIsTRIBUTED RANDOm VARIABLEs If are two rom variables defined on the same sample space we define the joint

### EE 302 Division 1. Homework 5 Solutions.

EE 32 Division. Homework 5 Solutions. Problem. A fair four-sided die (with faces labeled,, 2, 3) is thrown once to determine how many times a fair coin is to be flipped: if N is the number that results

### Lecture 8: Continuous random variables, expectation and variance

Lecture 8: Continuous random variables, expectation and variance Lejla Batina Institute for Computing and Information Sciences Digital Security Version: spring 2012 Lejla Batina Version: spring 2012 Wiskunde

### Topic 2: Scalar random variables. Definition of random variables

Topic 2: Scalar random variables Discrete and continuous random variables Probability distribution and densities (cdf, pmf, pdf) Important random variables Expectation, mean, variance, moments Markov and

### Joint distributions Math 217 Probability and Statistics Prof. D. Joyce, Fall 2014

Joint distributions Math 17 Probability and Statistics Prof. D. Joyce, Fall 14 Today we ll look at joint random variables and joint distributions in detail. Product distributions. If Ω 1 and Ω are sample

### Lecture 3: Continuous distributions, expected value & mean, variance, the normal distribution

Lecture 3: Continuous distributions, expected value & mean, variance, the normal distribution 8 October 2007 In this lecture we ll learn the following: 1. how continuous probability distributions differ

### P (x) 0. Discrete random variables Expected value. The expected value, mean or average of a random variable x is: xp (x) = v i P (v i )

Discrete random variables Probability mass function Given a discrete random variable X taking values in X = {v 1,..., v m }, its probability mass function P : X [0, 1] is defined as: P (v i ) = Pr[X =

### Continuous Probability Distributions (120 Points)

Economics : Statistics for Economics Problem Set 5 - Suggested Solutions University of Notre Dame Instructor: Julio Garín Spring 22 Continuous Probability Distributions 2 Points. A management consulting

### Random Variables. Chapter 2. Random Variables 1

Random Variables Chapter 2 Random Variables 1 Roulette and Random Variables A Roulette wheel has 38 pockets. 18 of them are red and 18 are black; these are numbered from 1 to 36. The two remaining pockets

### Lecture 6: Discrete & Continuous Probability and Random Variables

Lecture 6: Discrete & Continuous Probability and Random Variables D. Alex Hughes Math Camp September 17, 2015 D. Alex Hughes (Math Camp) Lecture 6: Discrete & Continuous Probability and Random September

### Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume

### 5. Conditional Expected Value

Virtual Laboratories > 3. Expected Value > 1 2 3 4 5 6 5. Conditional Expected Value As usual, our starting point is a random experiment with probability measure P on a sample space Ω. Suppose that X is

### the number of organisms in the squares of a haemocytometer? the number of goals scored by a football team in a match?

Poisson Random Variables (Rees: 6.8 6.14) Examples: What is the distribution of: the number of organisms in the squares of a haemocytometer? the number of hits on a web site in one hour? the number of

### Mathematical Expectation

Mathematical Expectation Properties of Mathematical Expectation I The concept of mathematical expectation arose in connection with games of chance. In its simplest form, mathematical expectation is the

### Math 431 An Introduction to Probability. Final Exam Solutions

Math 43 An Introduction to Probability Final Eam Solutions. A continuous random variable X has cdf a for 0, F () = for 0 <

### Normal distribution Approximating binomial distribution by normal 2.10 Central Limit Theorem

1.1.2 Normal distribution 1.1.3 Approimating binomial distribution by normal 2.1 Central Limit Theorem Prof. Tesler Math 283 October 22, 214 Prof. Tesler 1.1.2-3, 2.1 Normal distribution Math 283 / October

### Expected Value. Let X be a discrete random variable which takes values in S X = {x 1, x 2,..., x n }

Expected Value Let X be a discrete random variable which takes values in S X = {x 1, x 2,..., x n } Expected Value or Mean of X: E(X) = n x i p(x i ) i=1 Example: Roll one die Let X be outcome of rolling

### 4. Joint Distributions of Two Random Variables

4. Joint Distributions of Two Random Variables 4.1 Joint Distributions of Two Discrete Random Variables Suppose the discrete random variables X and Y have supports S X and S Y, respectively. The joint

### Chapter 5: Joint Probability Distributions. Chapter Learning Objectives. The Joint Probability Distribution for a Pair of Discrete Random

Chapter 5: Joint Probability Distributions 5-1 Two or More Random Variables 5-1.1 Joint Probability Distributions 5-1.2 Marginal Probability Distributions 5-1.3 Conditional Probability Distributions 5-1.4

### Example. Two fair dice are tossed and the two outcomes recorded. As is familiar, we have

Lectures 9-10 jacques@ucsd.edu 5.1 Random Variables Let (Ω, F, P ) be a probability space. The Borel sets in R are the sets in the smallest σ- field on R that contains all countable unions and complements

### Question: What is the probability that a five-card poker hand contains a flush, that is, five cards of the same suit?

ECS20 Discrete Mathematics Quarter: Spring 2007 Instructor: John Steinberger Assistant: Sophie Engle (prepared by Sophie Engle) Homework 8 Hints Due Wednesday June 6 th 2007 Section 6.1 #16 What is the

### 4: Probability. What is probability? Random variables (RVs)

4: Probability b binomial µ expected value [parameter] n number of trials [parameter] N normal p probability of success [parameter] pdf probability density function pmf probability mass function RV random

### Precalculus idea: A picture is worth a thousand

Six Pillars of Calculus by Lorenzo Sadun Calculus is generally viewed as a difficult subject, with hundreds of formulas to memorize and many applications to the real world. However, almost all of calculus

### Joint Continuous Distributions

Joint Continuous Distributions Statistics 11 Summer 6 Copright c 6 b Mark E. Irwin Joint Continuous Distributions Not surprisingl we can look at the joint distribution of or more continuous RVs. For eample,

### Exercises with solutions (1)

Exercises with solutions (). Investigate the relationship between independence and correlation. (a) Two random variables X and Y are said to be correlated if and only if their covariance C XY is not equal

### Practice Exam 1: Long List 18.05, Spring 2014

Practice Eam : Long List 8.05, Spring 204 Counting and Probability. A full house in poker is a hand where three cards share one rank and two cards share another rank. How many ways are there to get a full-house?

### An-Najah National University Faculty of Engineering Industrial Engineering Department. Course : Quantitative Methods (65211)

An-Najah National University Faculty of Engineering Industrial Engineering Department Course : Quantitative Methods (65211) Instructor: Eng. Tamer Haddad 2 nd Semester 2009/2010 Chapter 5 Example: Joint

### Chapter 4. Multivariate Distributions

1 Chapter 4. Multivariate Distributions Joint p.m.f. (p.d.f.) Independent Random Variables Covariance and Correlation Coefficient Expectation and Covariance Matrix Multivariate (Normal) Distributions Matlab

### , where f(x,y) is the joint pdf of X and Y. Therefore

INDEPENDENCE, COVARIANCE AND CORRELATION M384G/374G Independence: For random variables and, the intuitive idea behind " is independent of " is that the distribution of shouldn't depend on what is. This

### Estimation with Minimum Mean Square Error

C H A P T E R 8 Estimation with Minimum Mean Square Error INTRODUCTION A recurring theme in this text and in much of communication, control and signal processing is that of making systematic estimates,

Using Your TI-NSpire Calculator: Binomial Probability Distributions Dr. Laura Schultz Statistics I This handout describes how to use the binompdf and binomcdf commands to work with binomial probability

### Continuous Distributions, Mainly the Normal Distribution

Continuous Distributions, Mainly the Normal Distribution 1 Continuous Random Variables STA 281 Fall 2011 Discrete distributions place probability on specific numbers. A Bin(n,p) distribution, for example,

### MATH Fundamental Mathematics II.

MATH 10032 Fundamental Mathematics II http://www.math.kent.edu/ebooks/10032/fun-math-2.pdf Department of Mathematical Sciences Kent State University December 29, 2008 2 Contents 1 Fundamental Mathematics

### Some notes on the Poisson distribution

Some notes on the Poisson distribution Ernie Croot October 2, 2008 1 Introduction The Poisson distribution is one of the most important that we will encounter in this course it is right up there with the

### Discrete Random Variables; Expectation Spring 2014 Jeremy Orloff and Jonathan Bloom

Discrete Random Variables; Expectation 18.05 Spring 2014 Jeremy Orloff and Jonathan Bloom This image is in the public domain. http://www.mathsisfun.com/data/quincunx.html http://www.youtube.com/watch?v=9xubhhm4vbm

### Joint Distribution and Correlation

Joint Distribution and Correlation Michael Ash Lecture 3 Reminder: Start working on the Problem Set Mean and Variance of Linear Functions of an R.V. Linear Function of an R.V. Y = a + bx What are the properties

### Continuous random variables

Continuous random variables So far we have been concentrating on discrete random variables, whose distributions are not continuous. Now we deal with the so-called continuous random variables. A random

### ST 371 (IV): Discrete Random Variables

ST 371 (IV): Discrete Random Variables 1 Random Variables A random variable (rv) is a function that is defined on the sample space of the experiment and that assigns a numerical variable to each possible

### Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #15 Special Distributions-VI Today, I am going to introduce

### Probability density function : An arbitrary continuous random variable X is similarly described by its probability density function f x = f X

Week 6 notes : Continuous random variables and their probability densities WEEK 6 page 1 uniform, normal, gamma, exponential,chi-squared distributions, normal approx'n to the binomial Uniform [,1] random

### Definition: Suppose that two random variables, either continuous or discrete, X and Y have joint density

HW MATH 461/561 Lecture Notes 15 1 Definition: Suppose that two random variables, either continuous or discrete, X and Y have joint density and marginal densities f(x, y), (x, y) Λ X,Y f X (x), x Λ X,

### Statistics - Written Examination MEC Students - BOVISA

Statistics - Written Examination MEC Students - BOVISA Prof.ssa A. Guglielmi 26.0.2 All rights reserved. Legal action will be taken against infringement. Reproduction is prohibited without prior consent.