Ch. 13.3: More about Probability

Complementary Probabilities

Given any event, E, of some sample space, U, of a random experiment, we can always talk about the complement, Ē, of that event: this is the set of outcomes not favourable to E. For example, if the experiment is rolling 2 fair dice, and E = { rolling a 7 }, then Ē = { not rolling a 7 }. This illustrates that taking the complement of an event is the same as taking the negation of a statement. Also, it is easy to see that:

E ∩ Ē = ∅ and E ∪ Ē = U.

Now recall from our Ch. 13.1 notes some of the basic identities in probability theory:

P(E ∪ F) = P(E) + P(F) − P(E ∩ F), for any events E, F.
P(U) = 1 (i.e., the probability of some outcome is always 100%).
P(∅) = 0 (i.e., the probability of an impossible event is always 0%).

If we let F = Ē in the first equation, we have:

P(E ∪ Ē) = P(E) + P(Ē) − P(E ∩ Ē)
P(U) = P(E) + P(Ē) − P(∅)
1 = P(E) + P(Ē) − 0

This gives us the following identity:

Property of Complementary Probabilities
P(Ē) = 1 − P(E), for any event E.
(Note that P(E) = 1 − P(Ē) is also true.)
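As an optional check, this property can be verified by brute force for the two-dice example above. The following Python sketch (the variable names and the brute-force approach are my own, not part of the derivation) enumerates all 36 rolls and compares P(Ē) with 1 − P(E):

    from fractions import Fraction
    from itertools import product

    # All 36 equally-likely outcomes of rolling two fair dice.
    outcomes = list(product(range(1, 7), repeat=2))

    # E = "rolling a 7" (the two faces sum to 7); everything else is in E-bar.
    E = [o for o in outcomes if sum(o) == 7]
    p_E = Fraction(len(E), len(outcomes))                      # 6/36 = 1/6
    p_E_bar = Fraction(len(outcomes) - len(E), len(outcomes))  # 30/36 = 5/6

    print(p_E, p_E_bar, p_E_bar == 1 - p_E)                    # 1/6 5/6 True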
Example 1: Suppose 2 cards are randomly drawn from a full deck of 52, without replacement.
(a) What is the probability that they are not both Aces?
(b) What is the probability that we draw at least one Ace?

ANSWER: Since the drawing is random, we can assume that all possible combinations of 2 cards chosen from 52 are equally-likely. Thus, our sample space has |U| = ₅₂C₂ = 1326 possible outcomes.

(a) Let A be the event of drawing 2 Aces. Since there are 4 Aces in a deck, there are |A| = ₄C₂ = 6 possible outcomes in event A. Thus, by the property of complementary probabilities:

P(not drawing 2 Aces) = P(Ā) = 1 − P(A) = 1 − |A|/|U| = 1 − 6/1326 = 220/221 ≈ 0.995475.

(b) Let D be the event of drawing at least 1 Ace. Then D̄ is the event of drawing no Aces. If we exclude the possibility of an Ace, then we must choose 2 from 52 − 4 = 48 cards in the deck. Thus, |D̄| = ₄₈C₂ = 1128, and:

P(drawing at least 1 Ace) = P(D) = 1 − P(D̄) = 1 − |D̄|/|U| = 1 − 1128/1326 = 33/221 ≈ 0.149.
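If you want to double-check these numbers by computer, Python's math.comb produces the same binomial counts; the short sketch below (variable names are mine) reproduces both answers as exact fractions:

    from fractions import Fraction
    from math import comb

    total = comb(52, 2)                        # |U| = 1326 possible 2-card hands

    # (a) complement of "both cards are Aces"
    p_both_aces = Fraction(comb(4, 2), total)  # 6/1326
    print(1 - p_both_aces)                     # 220/221

    # (b) complement of "no Aces at all"
    p_no_aces = Fraction(comb(48, 2), total)   # 1128/1326
    print(1 - p_no_aces)                       # 33/221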
Example 2: Suppose we toss a fair coin 10 times in a row.
(a) What is the probability of tossing at least 1 heads?
(b) What is the probability of tossing exactly 1 heads?
(c) What is the probability of tossing at least 2 heads?

ANSWER: Each toss results in 2 possible equally-likely outcomes: heads (H) or tails (T). Since there are 10 tosses in a row, our sample space has |U| = 2¹⁰ = 1024 equally-likely outcomes (the order of tosses is important here!).

(a) If we do not toss at least 1 heads, then we toss zero heads. There is only 1 possible outcome for this event: the outcome in which we toss 10 tails in a row,

T T T ... T T
(1st toss, 2nd toss, 3rd toss, ..., 9th toss, 10th toss)

Thus,

P(toss at least 1 heads) = 1 − P(toss 0 heads) = 1 − 1/1024 = 1023/1024 ≈ 0.999023.

(b) If we toss exactly 1 heads, then all our tosses are tails, except for one:

T H T ... T T
(the single heads may land in any of the 10 positions)

There are 10 possible positions in which this one heads can occur. Thus, this event has 10 possible outcomes.

P(toss exactly 1 heads) = 10/1024 = 5/512 ≈ 0.0098.
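Before moving on to part (c), parts (a) and (b) can also be confirmed by listing all 1024 toss sequences by computer. This Python sketch is only an illustration (the brute-force approach and the names in it are my own):

    from fractions import Fraction
    from itertools import product

    # All 2^10 = 1024 equally-likely sequences of heads/tails.
    tosses = list(product("HT", repeat=10))

    at_least_one_H = sum(1 for t in tosses if t.count("H") >= 1)
    exactly_one_H = sum(1 for t in tosses if t.count("H") == 1)

    print(Fraction(at_least_one_H, len(tosses)))  # 1023/1024
    print(Fraction(exactly_one_H, len(tosses)))   # 5/512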
(c) If we do not toss at least 2 heads, then we would have tossed either 1 heads or 0 heads. These events are mutually-exclusive, ergo,

P(toss at least 2 heads) = 1 − P(toss 1 heads or toss 0 heads)
                         = 1 − ( P(toss 1 heads) + P(toss 0 heads) )
                         = 1 − ( 10/1024 + 1/1024 )
                         = 1013/1024 ≈ 0.989258.

Odds

When discussing games of chance (such as in a casino or at a horse track), probabilities of events are often given in terms of odds. For instance, you might hear that the odds of winning a single-number bet on a single roulette spin is 1 to 37. What this means is that in the long run, you will lose the bet 37 times more often than win it. It is important to note here that the ratio 1/37 is not the probability of winning. The probability of winning the bet is 1/(1 + 37) = 1/38.

In terms of probabilities, the odds for an event are defined by the following ratios:

Odds of an Event
Given any event E of a random experiment,
Odds in favour of E = P(E)/P(Ē)
Odds against E = P(Ē)/P(E)

Odds are almost always quoted as a ratio of 2 positive integers, not as a percent or decimal number.
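Both ratios are easy to compute directly from P(E). The helpers below are a small Python sketch of the definitions (the function names are my own, not standard library functions); note that Fraction automatically reduces the ratio to lowest terms:

    from fractions import Fraction

    def odds_in_favour(p):
        """P(E)/P(E-bar), returned as a reduced ratio (s, f)."""
        r = Fraction(p) / (1 - Fraction(p))
        return r.numerator, r.denominator

    def odds_against(p):
        """P(E-bar)/P(E), returned as a reduced ratio (f, s)."""
        s, f = odds_in_favour(p)
        return f, s

    # The single-number roulette bet discussed above: P(E) = 1/38.
    print(odds_in_favour(Fraction(1, 38)))  # (1, 37)  -> "1 to 37"
    print(odds_against(Fraction(1, 38)))    # (37, 1)  -> "37 to 1"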
Example 3: What are the odds in favour of winning a bet of red on a single roulette spin?

ANSWER: There are 38 possible equally-likely outcomes in a single spin. In 36/2 = 18 of these outcomes, the ball stops on a red slot. Letting E be the event of landing on red, the odds in favour of E is then:

P(E)/P(Ē) = P(E)/(1 − P(E)) = (18/38)/(1 − 18/38) = 18/20.

We would say the odds in favour of winning the bet are 18 to 20.

Notice that this answer is just |E|/|Ē|. If our experiment always has a finite number of equally-likely outcomes, then the odds in favour of any event E is always |E|/|Ē|. In most situations, it is better to think of odds in terms of this ratio.

If we're given the odds in favour of an event, we can also derive the probability:

Odds of an Event
If the odds in favour of E = s/f, then...
P(E) = s/(s + f) and P(Ē) = f/(s + f).

Example 4: You're at a racetrack, and the horse Sir Fisher is given odds 9 to 5 against winning 1st place. What is the probability that Sir Fisher comes in first, assuming the tote board is accurate?

ANSWER: The odds in favour of Sir Fisher winning 1st place is 5 to 9. Ergo,

P(1st place) = 5/(5 + 9) = 5/14 ≈ 0.357143.
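As a quick check of Example 4 (again just a sketch, with a function name of my own choosing), the conversion from odds back to a probability is one line in Python:

    from fractions import Fraction

    def prob_from_odds_in_favour(s, f):
        """If the odds in favour of E are s to f, then P(E) = s/(s + f)."""
        return Fraction(s, s + f)

    # Sir Fisher: 9 to 5 against means 5 to 9 in favour.
    p = prob_from_odds_in_favour(5, 9)
    print(p, float(p))  # 5/14 0.35714285714285715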
In reality, this is a very subjective probability. At a real racetrack, the tote board odds really measure the number of bets being placed against a given horse. The accuracy of these odds as a measure of winning-likelihood depends on the public's knowledge and betting tendencies.

Conditional Probability

Another fundamental definition in the theory of probability is conditional probability. In the real world, probabilities are often computed given some prior known information. When given new information about a situation, our level of uncertainty may change. Here's a simple example to illustrate this notion more precisely:

Example 5: Suppose you and a friend are playing a game of dice. Your friend rolls a pair of fair dice behind a black screen; you can't see what the outcome is, but you have to guess the exact number rolled.

An appropriate sample space for the experiment of rolling 2 dice is, as usual:

Sample Space, U
(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)

where all 36 outcomes are equally-likely. What is the probability of rolling a 6? We count 5 possible outcomes in which a 6 is obtained, therefore:

P(rolling a 6) = 5/36.
Now, let's make it more interesting: suppose that after she rolled the dice, your friend tells you that it's a double. What now is the probability of it being a 6, given this new information?

We can cut down the size of our sample space to only allow outcomes that are doubles:

New Sample Space
(1,1) (2,2) (3,3) (4,4) (5,5) (6,6)

It seems reasonable to assume that these outcomes are still equally likely, since the mechanics of the roll were not altered. Thus, since only one of these outcomes is a 6, we can say:

P(rolling a 6 given rolling a double) = (number of outcomes for rolling a 6 AND a double)/(number of outcomes for rolling a double) = 1/6.

This is what we mean by conditional probability. In symbols, let's let E = { rolling a 6 }, and F = { rolling a double }. Then we may rewrite this as:

P(E | F) = |E ∩ F| / |F| = 1/6.

Here the symbol | is translated as "given". This is a convenient definition of conditional probability when dealing with simple random experiments with finitely-many equally-likely outcomes. Note if we divide top and bottom by the size of our original sample space, |U|, we get the equation:

P(E | F) = P(E ∩ F) / P(F).
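This counting recipe is easy to mechanize. The Python sketch below (my own construction, not part of the notes) builds the 36-outcome sample space, forms E and F as sets, and checks that both versions of the formula give 1/6:

    from fractions import Fraction
    from itertools import product

    U = list(product(range(1, 7), repeat=2))  # all 36 ordered rolls

    E = {o for o in U if sum(o) == 6}         # rolling a 6
    F = {o for o in U if o[0] == o[1]}        # rolling a double

    # P(E | F) = |E ∩ F| / |F|
    print(Fraction(len(E & F), len(F)))       # 1/6

    # Dividing top and bottom by |U| gives P(E ∩ F) / P(F) -- the same number.
    print(Fraction(len(E & F), len(U)) / Fraction(len(F), len(U)))  # 1/6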
This may be taken as the general definition of conditional probability:

Conditional Probability
For any random experiment, the probability of an event E given some non-impossible event F is defined as:
P(E | F) = P(E ∩ F) / P(F).
(Note that we must have F ≠ ∅.)

Example 6: Consider the same experiment of rolling 2 fair dice. Compute the following probabilities by counting outcomes in the sample space U, listed above:

(a) P(rolling a double | rolled a 6) = 1/5.
(b) P(not rolling a 6 | rolled a double) = 5/6.
(c) P(rolling a 7 | not rolling a double) = 6/30 = 1/5.
(d) P(rolling a 2 | rolled an odd) = 0.
(e) P(rolling a double | rolled a 14) = undefined (because rolling a 14 is impossible).

This example helps illustrate some simple properties of conditional probability:

In general, P(E | F) ≠ P(F | E).
P(E | F) = 0 = P(F | E) if and only if E and F are mutually exclusive.
P(E | F) is undefined whenever F is impossible.
P(Ē | F) = 1 − P(E | F) (property of complementary events).

From a more philosophical point of view, it can be argued that all probabilities are conditional, since all probabilities are calculated on the basis of some information.
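Looking back at Example 6, the same counting idea reproduces parts (a)–(e) mechanically. The helper cond below is my own (not a standard function), and it returns None in the undefined case:

    from fractions import Fraction
    from itertools import product

    U = list(product(range(1, 7), repeat=2))

    def cond(event, given):
        """P(event | given) by counting; None when the given event is impossible."""
        G = [o for o in U if given(o)]
        if not G:
            return None                       # conditional probability undefined
        return Fraction(sum(1 for o in G if event(o)), len(G))

    is_double = lambda o: o[0] == o[1]
    total = lambda n: (lambda o: sum(o) == n)

    print(cond(is_double, total(6)))                   # (a) 1/5
    print(cond(lambda o: sum(o) != 6, is_double))      # (b) 5/6
    print(cond(total(7), lambda o: not is_double(o)))  # (c) 1/5
    print(cond(total(2), lambda o: sum(o) % 2 == 1))   # (d) 0
    print(cond(is_double, total(14)))                  # (e) None (undefined)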
Here is an example to illustrate this point of view:

Example 7: Among all drivers in the United States, suppose the probability of never being involved in a vehicular accident is about 0.7, and the probability of being involved in at least 2 accidents in your lifetime is 0.11.¹ Suppose Mark was just recently in a car accident. What is the probability that he will be involved in a second accident at some point in his life?

ANSWER: If Mark had never been in a vehicular accident before, we might say that the chances of that happening were 1 − 0.7 = 0.3. However, we have some information about his history, which changes the situation.

For convenience, let's let X be the random variable counting the total number of vehicle accidents Mark will encounter in his life. We know that X is never less than 0, and Mark has already been in an accident, so X ≥ 1. Given this information, we'd like to know the probability that X ≥ 2:

P(X ≥ 2 | X ≥ 1) = P(X ≥ 2 ∩ X ≥ 1) / P(X ≥ 1)
                 = P(X ≥ 2) / P(X ≥ 1)
                 = P(X ≥ 2) / (1 − P(X = 0))
                 = 0.11 / (1 − 0.7)
                 ≈ 0.367.

We see that Mark's probability of being in a car accident has increased from 0.3 to 0.367. Clearly, this probability is very subjective. If we knew more information about Mark (such as his age, gender, where, when, and how often he drives, etc.) as well as more detailed national statistics, we could come up with a better measure of Mark's risk. In fact, in this example, we left open one very absurd possibility: that Mark actually died in his last accident! If we eliminated this possibility from our sample space, we might expect his risk to increase slightly more.

One of the great things about conditional probability is that it gives us a formula for calculating probabilities of intersections. If we multiply both sides in the definition by P(F), we have:

P(E ∩ F) = P(F ∩ E) = P(E | F) · P(F).

¹ These values are arbitrary.
(Remember that the order of sets in an intersection does not matter, i.e., E ∩ F = F ∩ E always.) As it typically happens, computing P(E | F) directly is often much easier than computing P(E ∩ F), so this formula is in fact very useful. The next example illustrates this:

Example 8: Suppose you randomly draw 2 cards in sequence and without replacement from a full deck of 52. Compute the following probabilities:
(a) P(drawing 2 Aces).
(b) P(2nd draw is an Ace).

ANSWER: For convenience, define A₁ = { 1st card is an Ace }, and A₂ = { 2nd card is an Ace }.

(a) Consider both events in their chronological order: A₁ is determined first, A₂ is determined second. For the first draw, there are 52 cards total, 4 of which are Aces. Thus:

P(A₁) = 4/52.

Now, assuming that the event A₁ occurs (i.e., the first draw was an Ace), we are left with 51 cards total, 3 of which are Aces. Thus:

P(A₂ | A₁) = 3/51.

Putting this together, we have:

P(drawing 2 Aces) = P(A₁ ∩ A₂) = P(A₂ | A₁) · P(A₁) = (3/51) · (4/52) = 1/221.

(b) We now wish to find P(A₂). However, if we try to compute this like we did with A₁ above, we have a problem: the probability seems to depend on whether or not the first draw was an Ace! If the first draw was an Ace, then we have P(A₂ | A₁) = 3/51, as above. But if the first draw was not an Ace, then P(A₂ | Ā₁) = 4/51, since there are still 4 Aces left in the pile. We can get around this dilemma by first rewriting A₂ in terms of A₁:

A₂ = (A₂ ∩ A₁) ∪ (A₂ ∩ Ā₁)
Try describing this equation in words to see that it makes sense. In doing so, you should also convince yourself that (A₂ ∩ A₁) and (A₂ ∩ Ā₁) are mutually exclusive events. Ergo, we can write:

P(A₂) = P(A₂ ∩ A₁) + P(A₂ ∩ Ā₁)
      = P(A₂ | A₁) · P(A₁) + P(A₂ | Ā₁) · P(Ā₁)
      = (3/51) · (4/52) + (4/51) · (48/52)
      = 1/221 + 16/221
      = 17/221 = 4/52.

We see, then, that P(A₂) = P(A₁), despite the fact that the first draw affects the outcome of the second draw. Problems like this are most easily visualized using a probability tree:

start U
├─ A₁ (4/52)
│    ├─ A₂ (3/51):   P(A₁ ∩ A₂) = (4/52) · (3/51)  = 1/221
│    └─ Ā₂ (48/51):  P(A₁ ∩ Ā₂) = (4/52) · (48/51) = 16/221
└─ Ā₁ (48/52)
     ├─ A₂ (4/51):   P(Ā₁ ∩ A₂) = (48/52) · (4/51)  = 16/221
     └─ Ā₂ (47/51):  P(Ā₁ ∩ Ā₂) = (48/52) · (47/51) = 188/221

In a tree such as this, each node represents an event, and the number on each branch is a conditional probability of the right event given the left event. To compute the probability of an intersection, such as A₁ ∩ A₂, simply multiply all probabilities down the path U → A₁ → A₂.
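As a final check (a brute-force sketch of my own, not part of the notes), enumerating every ordered pair of card positions confirms both the tree's first leaf and the perhaps surprising conclusion that P(A₂) = P(A₁):

    from fractions import Fraction
    from itertools import permutations

    # A deck abstracted to "Ace or not": 4 Aces and 48 other cards.
    deck = ["A"] * 4 + ["x"] * 48
    draws = list(permutations(range(52), 2))      # 52*51 ordered 2-card draws

    A1 = {d for d in draws if deck[d[0]] == "A"}  # 1st card is an Ace
    A2 = {d for d in draws if deck[d[1]] == "A"}  # 2nd card is an Ace

    n = Fraction(1, len(draws))
    print(len(A1 & A2) * n)              # P(A1 ∩ A2) = 1/221
    print(len(A2) * n, Fraction(4, 52))  # both print 1/13, i.e. 17/221 = 4/52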