Introduction to Probability




3 Introduction to Probability

Given a fair coin, what frequency of tails can we expect in a sequence of 10 coin tosses? Tossing a coin is an example of a chance experiment, namely a process which results in one and only one outcome from a set of mutually exclusive outcomes, where the outcome cannot be predicted with certainty. A chance experiment can be real or conceptual. Other examples of a chance experiment are: throwing a fair die 10 times and recording the number of times a prime number (namely 2, 3 or 5) is obtained, selecting 5 students at random and recording whether they are male or female, or randomly drawing a sample of voters from the U.S. population.

3.1 SAMPLE SPACES AND EVENTS

The most basic outcomes of a chance experiment are called elementary outcomes or sample points. Any theory involves idealizations, and our first idealization concerns the elementary outcomes of an experiment. For example, when a coin is tossed, it does not necessarily fall head (H) or tail (T), for it can stand on its edge or roll away. Still, we agree that H and T are the only elementary outcomes. The sample space is the set of all elementary outcomes of a chance experiment. An outcome that can be decomposed into a set of elementary outcomes is called an event.

The simplest sample spaces are the finite ones, that is, those consisting of only a finite number of points. If the number of points is small, these spaces are easy to visualize.

Example 3.1 Consider the chance experiment of tossing 3 coins or, equivalently, tossing the same coin 3 times. The sample space of this experiment is easily constructed by noticing that the first coin toss has two possible outcomes, H and T. Given the result of the first coin toss, the second also has H and T as possible outcomes. Given the results of the first two coin tosses, the third also has H and T as possible outcomes. The outcome tree of this experiment and its sample points are listed in Table 4. Taken together, these sample points comprise the sample space. The event "at least 2 heads" consists of the sample points HHH, HHT, HTH, THH. □
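The enumeration in Example 3.1 is easy to reproduce programmatically. Below is a minimal Python sketch (not part of the original notes) that builds the 3-toss sample space the way the outcome tree does, one toss at a time, and extracts the event "at least 2 heads".

```python
# Enumerate the sample space of Example 3.1: each of the 3 tosses
# contributes the two outcomes H and T, so the space has 2**3 = 8 points.
from itertools import product

sample_space = ["".join(tosses) for tosses in product("HT", repeat=3)]
at_least_2_heads = [e for e in sample_space if e.count("H") >= 2]

print(sample_space)      # ['HHH', 'HHT', 'HTH', 'HTT', 'THH', 'THT', 'TTH', 'TTT']
print(at_least_2_heads)  # ['HHH', 'HHT', 'HTH', 'THH']
```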

Table 4: Outcome tree and sample space of the chance experiment of tossing 3 coins. Each row traces one path through the tree.

First toss   Second toss   Third toss   Sample point
H            H             H            HHH
H            H             T            HHT
H            T             H            HTH
H            T             T            HTT
T            H             H            THH
T            H             T            THT
T            T             H            TTH
T            T             T            TTT

Many important sample spaces are not finite. Some of them contain countably many points, and some may even contain uncountably many points.

Example 3.2 Consider the chance experiment of tossing a coin until a head turns up. The points of this sample space are: H, TH, TTH, TTTH, ... This sample space contains countably many points. □

Example 3.3 Consider the chance experiment of picking a real number from the interval $(0, 1)$. This sample space contains uncountably many points. □

3.2 RELATIONS AMONG EVENTS

Let $S$ be a sample space, $e$ an elementary outcome and $E$ an event, that is, a set of elementary outcomes. Because the notions of elementary outcome and event are the same as those of point and point set in set theory, standard concepts and results from set theory also apply to probability theory. Thus, $\emptyset$ denotes the impossible event, that is, the event that contains no sample point. Given an event $E$, $E^c$ denotes the complement of $E$, that is, the event consisting of all points of $S$ that are not contained in $E$. Clearly, $S^c = \emptyset$ and $\emptyset^c = S$. Given two events $A$ and $B$, we say that $A$ is contained in $B$, written $A \subseteq B$, if all points in $A$ are also in $B$. In the language of probability, we say that "$B$ occurs whenever $A$ occurs". Clearly, for any event $E$, we have $\emptyset \subseteq E$ and $E \subseteq S$.

We say that $A$ and $B$ are equal, written $A = B$, if $A \subseteq B$ and $B \subseteq A$. We say that $A$ is strictly contained in $B$, written $A \subset B$, if $A \subseteq B$ but $A$ is not equal to $B$.

Given two events $A$ and $B$, the event $A \cup B$ (called the union of $A$ and $B$) corresponds to the occurrence of either $A$ or $B$; that is, it consists of all sample points that are either in $A$ or in $B$, or in both. Clearly,
$$A \cup B = B \cup A, \qquad A \subseteq (A \cup B), \qquad B \subseteq (A \cup B).$$
Given any event $E$, we also have
$$E \cup E^c = S, \qquad E \cup S = S, \qquad E \cup \emptyset = E. \qquad (3.1)$$

Given two events $A$ and $B$, the event $A \cap B$ (called the intersection of $A$ and $B$) corresponds to the occurrence of both $A$ and $B$; that is, it consists of all sample points that are in both $A$ and $B$. When $A \cap B = \emptyset$, we say that the events $A$ and $B$ are mutually exclusive, that is, they cannot occur at once. Clearly,
$$A \cap B = B \cap A, \qquad (A \cap B) \subseteq A, \qquad (A \cap B) \subseteq B.$$
Further,
$$(A \cap B) \subseteq (A \cup B).$$
Given any event $E$, we also have
$$E \cap E^c = \emptyset, \qquad E \cap S = E, \qquad E \cap \emptyset = \emptyset. \qquad (3.2)$$

In fact, the relationship between (3.1) and (3.2) is a special case of the following results, known as de Morgan's laws. Given two events $A$ and $B$,
$$(A \cap B)^c = A^c \cup B^c, \qquad (A \cup B)^c = A^c \cap B^c.$$
De Morgan's laws show that complementation, union and intersection are not independent operations.

Given two events $A$ and $B$, the event $E = A - B$ (called the difference of $A$ and $B$) corresponds to all sample points in $A$ that are not in $B$. Clearly, $A - B = A \cap B^c$. Notice that $A - B$ and $B - A$ are different events, that $(A - B) \cap (B - A) = \emptyset$ and that $(A \cap B) \cup (A - B) = A$. These relations are conveniently visualized with Venn diagrams.
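Because events are just sets, these relations can be checked directly with Python's built-in set type. A small sketch (not from the notes; the particular events A and B are arbitrary choices) verifying de Morgan's laws and the difference identities on the 3-coin sample space:

```python
# Verify de Morgan's laws and the difference identities of Section 3.2
# on the sample space of Example 3.1.
from itertools import product

S = {"".join(t) for t in product("HT", repeat=3)}
A = {e for e in S if e.count("H") >= 2}   # "at least 2 heads"
B = {e for e in S if e[0] == "H"}         # "H on the first toss"

def complement(E):
    return S - E

assert complement(A & B) == complement(A) | complement(B)  # (A n B)^c = A^c u B^c
assert complement(A | B) == complement(A) & complement(B)  # (A u B)^c = A^c n B^c
assert A - B == A & complement(B)                          # A - B = A n B^c
assert (A & B) | (A - B) == A                              # (A n B) u (A - B) = A
print("all identities hold")
```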

3.3 PROBABILITIES IN SIMPLE SAMPLE SPACES

Probabilities are just numbers assigned to events. These numbers have the same nature as lengths, areas and volumes in geometry. How are probability numbers assigned? In the experiment of tossing a fair coin, where $S = \{H, T\}$, we do not hesitate to assign probability 1/2 to each of the two elementary outcomes H and T. From the theoretical point of view this is merely a convention, which can however be justified on the basis of actually tossing a fair coin a large number of times. In this case, the probability 1/2 assigned to the event "H occurred" can be interpreted as the limiting relative frequency of heads in the experiment of tossing a fair coin $n$ times as $n \to \infty$. The view of probabilities as the limit of relative frequencies is called the frequentist interpretation of probabilities. This is not the only interpretation, however. Another important one is the subjectivist interpretation, where probabilities are essentially viewed as representing degrees of belief about the likelihood of an event.

A sample space consisting of a finite number of points, where each point is equally probable, that is, receives the same probability, is called simple.

Example 3.4 The sample space corresponding to the chance experiment of tossing a fair coin 3 times is a simple sample space where each sample point receives the same probability 1/8. □

Given a simple sample space $S$, the probability of an event $E \subseteq S$ is
$$\Pr(E) = \frac{\text{number of sample points in } E}{\text{total number of sample points}}.$$
Several important properties of probabilities follow immediately from this definition: (i) $0 \le \Pr(E) \le 1$; (ii) $\Pr(S) = 1$; (iii) $\Pr(\emptyset) = 0$. These three properties hold for general sample spaces as well. Other properties are easy to understand using Venn diagrams. If $A \subseteq B$, then
$$\Pr(A) \le \Pr(B).$$
If $E = A \cup B$, then
$$\Pr(E) = \text{sum of the probabilities of all sample points in } A \cup B = \Pr(A) + \Pr(B) - \Pr(A \cap B) \le \Pr(A) + \Pr(B).$$
Clearly, $\Pr(E) = \Pr(A) + \Pr(B)$ if and only if $\Pr(A \cap B) = 0$, that is, $A$ and $B$ are mutually exclusive events. For the complement $E^c$ of $E$, since $E \cup E^c = S$ and $E \cap E^c = \emptyset$, we have $\Pr(E) + \Pr(E^c) = \Pr(S) = 1$ and so
$$\Pr(E^c) = 1 - \Pr(E).$$

Example 3.5 Consider the simple sample space corresponding to the experiment of tossing a fair coin 3 times. The event "at least 2 heads" corresponds to the set of elementary outcomes
$$A = \{HHH, HHT, HTH, THH\}.$$
Therefore, its probability is $\Pr(A) = 4/8 = 1/2$. The event "at least 1 tail" corresponds to the set of elementary outcomes
$$B = \{HHT, HTH, HTT, THH, THT, TTH, TTT\}.$$
Because $B$ is the complement of the event "no tails", its probability is
$$\Pr(B) = 7/8 = 1 - \Pr(HHH).$$
The intersection of $A$ and $B$ is the event
$$A \cap B = \{HHT, HTH, THH\},$$
whose probability is equal to 3/8. The probability of the union of $A$ and $B$ is therefore equal to
$$\Pr(A) + \Pr(B) - \Pr(A \cap B) = \frac{1}{2} + \frac{7}{8} - \frac{3}{8} = 1,$$
which ought not be surprising since $A \cup B = S$ in this case. □
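In a simple sample space, computing probabilities amounts to counting. A short Python check of Example 3.5 (a sketch, not part of the notes), implementing Pr(E) as the ratio of the number of points in E to the total number of points:

```python
# Check Example 3.5: probabilities in the simple sample space of 3 tosses.
from fractions import Fraction
from itertools import product

S = {"".join(t) for t in product("HT", repeat=3)}
prob = lambda E: Fraction(len(E), len(S))  # Pr(E) = |E| / |S|

A = {e for e in S if e.count("H") >= 2}  # "at least 2 heads"
B = {e for e in S if "T" in e}           # "at least 1 tail"

print(prob(A), prob(B), prob(A & B))                  # 1/2 7/8 3/8
print(prob(A) + prob(B) - prob(A & B), prob(A | B))   # 1 1
```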

3.4 COUNTING RULES

Calculation of probabilities for simple sample spaces is facilitated by a systematic use of a few counting rules.

3.4.1 MULTIPLICATION RULE

The experiment of tossing a fair coin twice has 4 possible outcomes: HH, HT, TH and TT. This is an example of a chance experiment with the following characteristics:

1. The experiment is performed in 2 parts.
2. The first part has $n$ possible outcomes, say $x_1, \ldots, x_n$. Regardless of which of these outcomes occurred, the second part has $m$ possible outcomes, say $y_1, \ldots, y_m$.

Each point of the sample space $S$ is therefore a pair $e = (x_i, y_j)$, where $i = 1, \ldots, n$ and $j = 1, \ldots, m$, and $S$ consists of the $mn$ pairs
$$\begin{array}{cccc} (x_1, y_1) & (x_1, y_2) & \cdots & (x_1, y_m) \\ (x_2, y_1) & (x_2, y_2) & \cdots & (x_2, y_m) \\ \vdots & & & \vdots \\ (x_n, y_1) & (x_n, y_2) & \cdots & (x_n, y_m). \end{array}$$
The generalization to the case of an experiment with more than 2 parts is straightforward. Consider an experiment that is performed in $k$ parts ($k \ge 2$), where the $h$th part of the experiment has $n_h$ possible outcomes ($h = 1, \ldots, k$) and each of the outcomes in any part of the experiment can occur regardless of which specific outcome occurred in any of the other parts. Then each sample point in $S$ is a $k$-tuple $e = (u_1, \ldots, u_k)$, where $u_h$ is one of the $n_h$ possible outcomes in the $h$th part of the experiment. The total number of sample points in $S$ is therefore equal to
$$n_1 n_2 \cdots n_k.$$
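The multiplication rule is easy to sanity-check by enumeration. A minimal sketch (not from the notes; the part sizes 10, 5 and 3 are arbitrary and anticipate Example 3.6):

```python
# Multiplication rule: an experiment with parts of sizes n1, n2, n3
# has n1 * n2 * n3 sample points.
from itertools import product

parts = [range(10), range(5), range(3)]
points = list(product(*parts))   # all k-tuples (u1, u2, u3)
print(len(points), 10 * 5 * 3)   # 150 150
```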

Example 3.6 Suppose one can choose between 10 speaker types, 5 receivers and 3 CD players. The number of different stereo systems that can be put together this way is $10 \cdot 5 \cdot 3 = 150$. □

The next two subsections provide important examples of application of the multiplication rule.

3.4.2 SAMPLING WITH REPLACEMENT

Consider a chance experiment which consists of $k$ repetitions of the same basic experiment or trial. If each trial has the same number $n$ of possible outcomes, then the total number of sample points in $S$ is equal to $n^k$.

Example 3.7 Consider tossing a coin 4 times. The total number of outcomes is $2^4 = 16$. □

Example 3.8 Consider a box containing 10 balls numbered $1, 2, \ldots, 10$. Suppose that we repeat 5 times the basic experiment of selecting one ball at random, recording its number and then putting the ball back in the box. Since the number of possible outcomes in each trial is equal to 10, the total number of possible outcomes of the experiment is equal to $10^5 = 100{,}000$. This experiment is an example of sampling with replacement from a finite population. □

3.4.3 SAMPLING WITHOUT REPLACEMENT

Sampling without replacement corresponds to successive random draws, without replacement, of a single population unit. In the example of drawing balls from a box (Example 3.8), after a ball is selected, it is left out of the box.

Example 3.9 Consider a deck of 52 cards. If we select 3 cards in succession, then there are 52 possible outcomes at the first selection, 51 at the second, and 50 at the third. This is an example of sampling without replacement from a finite population. The total number of possible outcomes is therefore $52 \cdot 51 \cdot 50 = 132{,}600$. □

If $k$ elements have to be selected from a set of $n$ elements, then the total number of possible outcomes is
$$P_{n,k} = n(n-1)(n-2) \cdots (n-k+1),$$
called the number of permutations of $n$ elements taken $k$ at a time. If $k = n$, then the number of possible outcomes is the number of permutations of all $n$ elements,
$$P_{n,n} = n(n-1)(n-2) \cdots 2 \cdot 1,$$
called $n$ factorial and denoted by $n!$. By convention, $0! = 1$. Thus
$$P_{n,k} = \frac{n(n-1)(n-2) \cdots (n-k+1)(n-k) \cdots 2 \cdot 1}{(n-k)(n-k-1) \cdots 2 \cdot 1} = \frac{n!}{(n-k)!}.$$
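These counting rules map directly onto Python's math module. A small sketch (not from the notes) reproducing the counts in Examples 3.7 to 3.9:

```python
# Sampling with replacement (n**k) and without replacement
# (permutations, n!/(n-k)!).
import math

print(2 ** 4)            # Example 3.7: 16 outcomes of 4 coin tosses
print(10 ** 5)           # Example 3.8: 100000, with replacement
print(math.perm(52, 3))  # Example 3.9: 132600 = 52 * 51 * 50
print(math.factorial(52) // math.factorial(52 - 3))  # same count via n!/(n-k)!
```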

Example 3.10 Given a group of $k$ people ($2 \le k \le 365$), what is the probability that at least 2 people in the group have the same birthday? To simplify the problem, assume that birthdays are unrelated (there are no twins) and that each of the 365 days of the year is equally likely to be the birthday of any person. The sample space $S$ then consists of $365^k$ possible outcomes. The number of outcomes in $S$ for which all $k$ birthdays are different is $P_{365,k}$. Therefore, if $E$ denotes the event "all $k$ people have different birthdays", then
$$\Pr(E) = \frac{P_{365,k}}{365^k}.$$
Because the event "at least 2 people have the same birthday" is just the complement of $E$, we get
$$\Pr(E^c) = 1 - \frac{P_{365,k}}{365^k}.$$
We denote this probability by $p(k)$. The table below summarizes the value of $p(k)$ for different values of $k$:

k     p(k)
5     .027
10    .117
20    .411
40    .891
60    .994

Notice that, in a class of 100 people, the event that at least 2 people have the same birthday is almost certain. □
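The table of p(k) is easy to reproduce. A minimal sketch (not part of the notes; the helper name p is ours):

```python
# Birthday problem (Example 3.10): p(k) = 1 - P(365, k) / 365**k.
import math

def p(k: int) -> float:
    """Probability that at least 2 of k people share a birthday."""
    return 1 - math.perm(365, k) / 365 ** k

for k in (5, 10, 20, 40, 60):
    print(k, round(p(k), 3))  # 0.027, 0.117, 0.411, 0.891, 0.994
```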

3.4.4 COMBINATIONS

As a motivation, consider the following example.

Example 3.11 Consider combining 4 elements $a$, $b$, $c$ and $d$, taken 2 at a time. The total number of possible outcomes is equal to the number of permutations of 4 objects taken 2 at a time, namely
$$P_{4,2} = 4 \cdot 3 = 12.$$
If the order of the elements of each pair is irrelevant, the table below shows that 6 different combinations are obtained:

12 permutations:  a,b  a,c  a,d  b,a  b,c  b,d  c,a  c,b  c,d  d,a  d,b  d,c
6 combinations:   {a,b}  {a,c}  {a,d}  {b,c}  {b,d}  {c,d}

□

Let $C_{n,k}$ denote the number of different combinations of $n$ objects taken $k$ at a time. To determine $C_{n,k}$, notice that the list of $P_{n,k}$ permutations may be constructed as follows. First select a particular combination of $k$ objects. Then notice that this particular combination can produce $k!$ permutations. Hence
$$P_{n,k} = C_{n,k} \, k!,$$
from which we get
$$C_{n,k} = \frac{P_{n,k}}{k!} = \frac{n!}{(n-k)! \, k!}.$$
The number $C_{n,k}$ is also called a binomial coefficient and denoted
$$C_{n,k} = \binom{n}{k}.$$
Clearly,
$$\binom{n}{k} = \frac{n!}{(n-k)! \, k!} = \binom{n}{n-k}.$$

Example 3.12 In Example 3.11, $n = 4$, $k = 2$ and so $C_{4,2} = 12/2 = 6$. □

Example 3.13 Given a hand of 5 cards, randomly drawn from a deck of 52, the probability of a "straight flush" is
$$p = \Pr(\text{straight flush}) = \frac{\text{no. of different straight flushes}}{\text{no. of different hands}}.$$
The number of different hands is equal to
$$\binom{52}{5} = \frac{52!}{5! \, 47!} = 2{,}598{,}960.$$
Because there are 10 straight flushes for each suit, the total number of straight flushes is $10 \cdot 4 = 40$. Therefore, the desired probability is
$$p = \frac{40}{2{,}598{,}960} = .000015.$$
Not a high one! □
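A quick check of these combinatorial counts in Python (a sketch, not from the notes), listing the 6 combinations of Example 3.11 and recomputing the straight-flush probability of Example 3.13:

```python
# Combinations: C(n, k) = n! / ((n-k)! k!).
import math
from itertools import combinations

print(list(combinations("abcd", 2)))  # the 6 unordered pairs of Example 3.11
print(math.comb(4, 2))                # 6
print(math.comb(52, 5))               # 2598960 possible 5-card hands
print(40 / math.comb(52, 5))          # straight flush: about 1.5e-05
```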

When a set contains only elements of 2 distinct types, a binomial coefficient may be used to represent the number of different arrangements of all the elements in the set.

Example 3.14 Suppose that $k$ red balls and $n - k$ green balls are to be arranged in a row. Since the red balls occupy $k$ of the $n$ positions, the number of different arrangements of the $n$ balls corresponds to the number $C_{n,k}$ of combinations of $n$ objects taken $k$ at a time. □

Example 3.15 Given a hand of 5 cards, randomly drawn from a deck of 52, the probability of a "poker" (four of a kind) is
$$p = \Pr(\text{poker}) = \frac{\text{no. of different pokers}}{\text{no. of different hands}},$$
where the denominator is the same as in Example 3.13. To compute the numerator, notice that 13 ranks of poker are possible: A, K, Q, ..., 2, and that, once the rank of the four matching cards is fixed, the fifth card may be any of the 48 remaining cards. Therefore, the number of possible pokers in one hand of 5 cards is $13 \cdot 48 = 624$ and so
$$p = \frac{624}{2{,}598{,}960} = .00024,$$
which is higher than the probability of a straight flush. □
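A brute-force enumeration over all C(52,5) hands (a minimal sketch, not part of the notes; it takes several seconds to run) confirms the four-of-a-kind count:

```python
# Brute-force check of Example 3.15: count 5-card hands containing
# four cards of the same rank, over all C(52,5) = 2,598,960 hands.
from itertools import combinations
from math import comb

deck = [(rank, suit) for rank in range(13) for suit in range(4)]

pokers = 0
for hand in combinations(deck, 5):
    ranks = [rank for rank, _ in hand]
    if any(ranks.count(r) == 4 for r in set(ranks)):
        pokers += 1

print(pokers)                # 624 = 13 * 48
print(pokers / comb(52, 5))  # about 0.00024
```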

3.5 CONDITIONAL PROBABILITIES

Suppose that we have a sample space $S$ where probabilities have been assigned to all events. If we know that the event $B \subset S$ occurred, then it seems intuitively obvious that this ought to modify our assignment of probabilities to any other event $A \subset S$, because the only sample points in $A$ that are now possible are the ones that are also contained in $B$. This new probability assigned to $A$ is called the conditional probability of the event $A$ given that the event $B$ has occurred, or simply the conditional probability of $A$ given $B$, and denoted by $\Pr(A \mid B)$.

Example 3.16 Consider again the experiment of tossing a fair coin 3 times. Let $A$ = "at least one T" and $B$ = "H in the first trial". Clearly
$$\Pr(B) = 1/2, \qquad \Pr(A) = 7/8, \qquad \Pr(A \cap B) = 3/8.$$
If we know that $B$ occurred, then the relevant sample space becomes
$$S' = \{HHH, HHT, HTH, HTT\}.$$
Therefore
$$\Pr(A \mid B) = \frac{3}{4} = \frac{3/8}{1/2} = \frac{\Pr(A \cap B)}{\Pr(B)}.$$
Notice that $\Pr(A \mid B) < \Pr(A)$ in this case. □

Definition 3.1 If $A$ and $B$ are any two events, then the conditional probability of $A$ given $B$ is
$$\Pr(A \mid B) = \frac{\Pr(A \cap B)}{\Pr(B)}$$
if $\Pr(B) > 0$, and $\Pr(A \mid B) = 0$ otherwise. □

The conditional probability of $B$ given $A$ is similarly defined as
$$\Pr(B \mid A) = \frac{\Pr(A \cap B)}{\Pr(A)},$$
provided that $\Pr(A) > 0$.

The frequentist interpretation of conditional probabilities is as follows. If a chance experiment is repeated a large number of times, then the proportion of trials on which the event $B$ occurs is approximately equal to $\Pr(B)$, whereas the proportion of trials in which both $A$ and $B$ occur is approximately equal to $\Pr(A \cap B)$. Therefore, among those trials in which $B$ occurs, the proportion in which $A$ also occurs is approximately equal to $\Pr(A \cap B) / \Pr(B)$.

Definition 3.1 may be re-expressed as
$$\Pr(A \cap B) = \Pr(A \mid B) \Pr(B). \qquad (3.3)$$
This result, called the multiplication law, provides a convenient way of finding $\Pr(A \cap B)$ whenever $\Pr(A \mid B)$ and $\Pr(B)$ are easy to find.

Example 3.17 Consider a hand of 2 cards randomly drawn from a deck of 52. Let $A$ = "second card is a king" and $B$ = "first card is an ace". Then $\Pr(B) = 4/52$ and $\Pr(A \mid B) = 4/51$. Hence
$$\Pr(A \cap B) = \Pr(\text{ace and then king}) = \Pr(A \mid B) \Pr(B) = \frac{4}{51} \cdot \frac{4}{52} = .0060. \; \Box$$
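The frequentist reading of conditional probabilities suggests a simulation check. A minimal Monte Carlo sketch of Example 3.17 (not part of the notes; the rank encoding and the sample size are arbitrary choices):

```python
# Monte Carlo check of Example 3.17: Pr(first is ace, second is king)
# when 2 cards are drawn without replacement from a 52-card deck.
import random

deck = [(rank, suit) for rank in range(13) for suit in range(4)]
ACE, KING = 0, 12  # arbitrary rank labels

trials, hits = 200_000, 0
for _ in range(trials):
    first, second = random.sample(deck, 2)  # draw without replacement
    if first[0] == ACE and second[0] == KING:
        hits += 1

print(hits / trials)  # close to (4/52) * (4/51), i.e. about 0.0060
```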

We now consider a useful application of the multiplication law (3.3). Notice that
$$A = (A \cap B) \cup (A \cap B^c),$$
where $A \cap B$ and $A \cap B^c$ are disjoint events because $B$ and its complement $B^c$ are disjoint. Hence
$$\Pr(A) = \Pr(A \cap B) + \Pr(A \cap B^c),$$
where, by the multiplication law,
$$\Pr(A \cap B) = \Pr(A \mid B) \Pr(B)$$
and
$$\Pr(A \cap B^c) = \Pr(A \mid B^c) \Pr(B^c).$$
Therefore
$$\Pr(A) = \Pr(A \mid B) \Pr(B) + \Pr(A \mid B^c) \Pr(B^c), \qquad (3.4)$$
which is sometimes called the law of total probability.

Example 3.18 Consider a hand of 2 cards randomly drawn from a deck of 52. Let $A$ = "second card is a king" and $B$ = "first card is a king". We have $\Pr(B) = 4/52$, $\Pr(B^c) = 48/52$ and
$$\Pr(A \mid B) = 3/51, \qquad \Pr(A \mid B^c) = 4/51.$$
Hence, by the law of total probability,
$$\Pr(A) = \frac{3}{51} \cdot \frac{4}{52} + \frac{4}{51} \cdot \frac{48}{52} = \frac{4}{52}.$$
Thus $\Pr(A)$ and $\Pr(B)$ are the same. □

3.6 STATISTICAL INDEPENDENCE

Let $A$ and $B$ be two events with non-zero probability. If knowing that $B$ occurred gives no information about whether or not $A$ occurred, then the probability assigned to $A$ should not be modified by the knowledge that $B$ occurred, that is, $\Pr(A \mid B) = \Pr(A)$. Hence, by the multiplication law,
$$\Pr(A \cap B) = \Pr(A) \Pr(B).$$
We take this as our formal definition of statistical independence.

Definition 3.2 Two events $A$ and $B$ are said to be statistically independent if
$$\Pr(A \cap B) = \Pr(A) \Pr(B). \; \Box$$

Notice that this definition of independence is symmetric in $A$ and $B$, and also covers the case when $\Pr(A) = 0$ or $\Pr(B) = 0$. It is easy to show that if $A$ and $B$ are independent, then $A$ and $B^c$, as well as $A^c$ and $B^c$, are independent. It is also clear from Definition 3.2 that mutually exclusive events with positive probability cannot be independent. The concept of statistical independence is different from other concepts of independence (logical, mathematical, political, etc.). When there is no ambiguity, the term independence will be taken to mean statistical independence.
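The law of total probability in Example 3.18 can be verified exactly with fractions. A small sketch (not from the notes):

```python
# Exact check of Example 3.18 via the law of total probability (3.4):
# Pr(A) = Pr(A|B) Pr(B) + Pr(A|B^c) Pr(B^c).
from fractions import Fraction as F

pr_B = F(4, 52)           # first card is a king
pr_A_given_B = F(3, 51)   # second is a king, given the first was
pr_A_given_Bc = F(4, 51)  # second is a king, given the first was not

pr_A = pr_A_given_B * pr_B + pr_A_given_Bc * (1 - pr_B)
print(pr_A, pr_A == pr_B)  # 1/13 True
```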

Example 3.19 The sample space associated with the experiment of tossing a fair coin twice is a simple sample space consisting of $2 \cdot 2 = 4$ points. Define the events $A$ = "H in the first toss" and $B$ = "T in the second toss". Because $A \cap B = \{HT\}$, we have
$$\Pr(A \cap B) = \frac{1}{4} = \frac{1}{2} \cdot \frac{1}{2} = \Pr(A) \Pr(B).$$
This result seems fairly intuitive, because the occurrence of H in the first coin toss has no relation to, and no influence on, the occurrence of T in the second coin toss, and vice versa. □

It is natural to assume that events that are physically unrelated (such as successive coin tosses) are also statistically independent. However, physically related events may also satisfy the definition of statistical independence.

Example 3.20 Consider the chance experiment consisting of throwing a fair die. The sample space of this experiment is the simple sample space $\{1, 2, 3, 4, 5, 6\}$. Let $A$ = "an even number is obtained" and $B$ = "the number 1, 2, 3 or 4 is obtained". It is easy to verify that $\Pr(A) = 1/2$ and $\Pr(B) = 2/3$. Further,
$$\Pr(A \cap B) = \Pr(\text{2 or 4}) = 1/3 = \Pr(A) \Pr(B).$$
Hence, $A$ and $B$ are independent even though their occurrence depends on the same roll of a die. □
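Independence can be tested mechanically by comparing Pr(A ∩ B) with Pr(A) Pr(B). A sketch (not from the notes) for Example 3.20:

```python
# Independence check for Example 3.20 on the simple sample space of a die.
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
prob = lambda E: Fraction(len(E), len(S))  # Pr(E) = |E| / |S|

A = {2, 4, 6}     # "an even number is obtained"
B = {1, 2, 3, 4}  # "1, 2, 3 or 4 is obtained"

print(prob(A & B))                       # 1/3
print(prob(A) * prob(B))                 # 1/3
print(prob(A & B) == prob(A) * prob(B))  # True: A and B are independent
```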

3.7 BAYES LAW

Suppose that you want to determine whether a coin is fair (F) or unfair (U). You have no information on the coin, and so you are willing to believe that F and U are equally likely, that is,
$$\Pr(F) = \Pr(U) = 1/2.$$
If the coin is fair, then
$$\Pr(H \mid F) = 1/2.$$
Further suppose that you know that, if the coin is unfair, then H is more likely than T, say
$$\Pr(H \mid U) = .9.$$
Assume that tossing the coin once gives you H. What is now the probability that the coin is fair? This is called the posterior probability of F given H, and denoted by $\Pr(F \mid H)$. Intuitively, the occurrence of H (the more likely event if the coin is unfair) should modify your initial beliefs, leading you to view the event that the coin is fair as less likely than initially thought, whereas the occurrence of T should lead you to view the event that the coin is fair as more likely than initially thought.

One way of computing the posterior probabilities $\Pr(F \mid H)$ and $\Pr(F \mid T)$ is to draw the outcome tree for this problem:

F:  H   $\Pr(H \cap F) = .25$
    T   $\Pr(T \cap F) = .25$
U:  H   $\Pr(H \cap U) = .45$
    T   $\Pr(T \cap U) = .05$

It is then clear that the events U and F are mutually exclusive and that the event H is the union of the two disjoint events $H \cap F$ and $H \cap U$. Hence
$$\Pr(H) = \Pr(H \cap F) + \Pr(H \cap U) = .25 + .45 = .70.$$
Therefore
$$\Pr(F \mid H) = \frac{\Pr(H \cap F)}{\Pr(H)} = \frac{.25}{.70} = .357,$$
which is indeed less than the original assignment of probability to F, namely $\Pr(F) = 1/2$. By a similar argument we have
$$\Pr(F \mid T) = \frac{\Pr(T \cap F)}{\Pr(T)} = \frac{.25}{.30} = .833.$$

We can also compute the posterior probability $\Pr(F \mid H)$ without the need of a tree diagram, by using the fact that
$$\Pr(H \cap F) = \Pr(H \mid F) \Pr(F)$$
by the multiplication law, and
$$\Pr(H) = \Pr(H \mid F) \Pr(F) + \Pr(H \mid U) \Pr(U)$$
by the law of total probability. Hence,
$$\Pr(F \mid H) = \frac{\Pr(H \mid F) \Pr(F)}{\Pr(H \mid F) \Pr(F) + \Pr(H \mid U) \Pr(U)}. \qquad (3.5)$$

This formula is known as Bayes law. For $\Pr(F \mid T)$, Bayes law gives
$$\Pr(F \mid T) = \frac{\Pr(T \mid F) \Pr(F)}{\Pr(T \mid F) \Pr(F) + \Pr(T \mid U) \Pr(U)},$$
where $\Pr(T \mid F) = 1 - \Pr(H \mid F)$ and $\Pr(T \mid U) = 1 - \Pr(H \mid U)$. Notice that we can regard $\Pr(F)$ as our prior information about whether the coin is fair. Bayes law then gives us a way of updating this information in the light of the new information contained in the fact that H was obtained.
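The whole fair-versus-unfair calculation fits in a few lines. A minimal sketch of Bayes law (3.5) (not part of the notes; the helper name posterior_fair is ours):

```python
# Bayes law for the fair/unfair coin of Section 3.7:
# Pr(F | data) = Pr(data|F) Pr(F) / (Pr(data|F) Pr(F) + Pr(data|U) Pr(U)).
def posterior_fair(pr_data_given_F: float, pr_data_given_U: float,
                   prior_F: float = 0.5) -> float:
    numerator = pr_data_given_F * prior_F
    evidence = numerator + pr_data_given_U * (1 - prior_F)
    return numerator / evidence

print(round(posterior_fair(0.5, 0.9), 3))  # Pr(F | H) = 0.357
print(round(posterior_fair(0.5, 0.1), 3))  # Pr(F | T) = 0.833
```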