Bayesian Tutorial (Sheet Updated 20 March)


Practice Questions (for discussing in Class)
Week starting 21 March 2016

1. What is the probability that the total of two dice will be greater than 8, given that the first die is a 6?

2. A patient's probability of having liver disease increases if he is an alcoholic. Given that 10% of the patients entering the clinic have liver disease, 5% of the clinic's patients are alcoholic, and 7% of the patients diagnosed with liver disease are alcoholics, find the probability that an alcoholic patient is diagnosed with liver disease.

3. After tossing a fair coin three times:
   i. What is the probability of at least two tails?
   ii. What is the probability of exactly one tail?
   iii. Given that at least one tail is observed, what is the probability of observing at least two tails?

4. 13% of patients have lung cancer and smoke; 3% of patients smoke and do not have lung cancer; 5% of patients have lung cancer and do not smoke; 79% of patients neither have lung cancer nor smoke. Draw an appropriate probability table and find the probability that a patient, picked at random, has lung cancer, given that he smokes. Also find the probability that a patient is a smoker given that he has lung cancer. From this, derive your inference.

5. Consider that 0.9% of people have a genetic defect, 92% of the tests for the gene are true positives, and 9.8% of the tests are false positives. If a person gets a positive test result, what are the odds that they actually have the faulty gene?

6. The probability that it is Wednesday and that a student is absent is 0.04. What is the probability that a student is absent given that today is Wednesday?

7. You go to see the doctor about an ingrowing toe-nail. The doctor selects you at random to have a blood test for swine flu, which for the purposes of this exercise we will say is currently suspected to affect 1 in 10,000 people in Australia. The test is 99% accurate, in the sense that the probability of a false positive is 1%. The probability of a false negative is zero. You test positive. What is the new probability that you have swine flu?
   Now imagine that you went to a friend's wedding in Mexico recently, and (for the purposes of this exercise) it is known that 1 in 200 people who visited Mexico recently come back with swine flu. Given the same test result as above, what should your revised estimate be for the probability you have the disease?

Poker Tournament Question

Play this game in groups, in YOUR OWN TIME!

We are going to play a game of poker. Unlike usual games, we will discuss our cards and our bets as we play.

The rules of the game: Texas Hold'em

- The aim of the game is to win the chips that are bet on each hand.
- You do this by either:
  o Having the best hand
  o Or forcing all the other players to throw in their hands instead of calling your bet, i.e. they fold
- You will be dealt two cards that only you can see.
- Once everyone has their cards, the person to the left of the dealer makes a bet based on their cards.
- Everybody has the choice of:
  o Calling: betting the same amount
  o Folding: betting nothing and throwing in their cards
  o Raising: making the bet bigger, which means that the following players must bet the new higher amount if they want to call, and that the players who have already bet have to make up the difference to stay in
- When everybody has bet the same amount or folded, three cards are dealt face up on the table. These cards are shared by all players.
- There is another round of betting, as above, followed by one more shared card, another round of betting, and a final shared card.
- After a final round of betting, if more than one player is left in, the players show their hands, and the player who can make the best hand using the middle cards plus 1 or 2 of his own cards wins.

The order of the hands is as follows, with the best first:

1. Straight flush: all cards the same suit and in unbroken order, e.g. 5,6,7,8,9 of spades
2. Four of a kind: four cards of the same value, e.g. 5,5,5,5, one of each suit
3. Full house: three cards with one shared value plus two of another, e.g. K,K,K,5,5
4. Flush: all cards the same suit, e.g. five hearts
5. Straight: five cards in unbroken order, but of a mix of suits, e.g. 3 and 4 of spades, 5 of diamonds, 6 and 7 of hearts
6. Three of a kind: three cards of the same value
7. Two pairs: e.g. 5,5,K,K
8. One pair: e.g. 7,7
9. No hand: none of the above are made

When two players have a hand of the same rank, the one that contains the highest valued card wins, except in the case of a full house, where the one with the highest ranking three of a kind wins.

Round 1

Make sure you can answer these questions before you start the game!

Suppose there's just me playing poker, with no other players. Let's say that I will bet whenever I get two fancy cards, but never any other time (fancy cards: Ace, Jack, Queen, King).

1. What is the probability of me making a bet, given that you don't know what cards I have?

I need to play more hands than that, so I will bluff (bet on rubbish cards) exactly 20% of the time that I have rubbish cards.

2. Now what is the probability of me making a bet, given that you don't know what cards I have? Clue: it is the conditional probability of me betting given that I do not have two fancy cards, multiplied by the probability that I do not have two fancy cards, plus the answer to question 1.

Now let's work out something more useful. I have made a bet and you want to know whether or not I am bluffing.

3. Use Bayes' rule to work out the probability that I am bluffing. Then use the same rule to work out the probability that I have two fancy cards. Check that they sum to one to make sure you have it right.

4. Finally, if I also bet when I have any pair in my hand (5,5 for example), how does that affect the values calculated in 1, 2 and 3 above?

Once we've answered these questions, we will try them out in a few hands.

Round 2

Each player will say what cards they have and calculate the probability of being dealt those two cards. Hopefully a pattern will be quickly discovered! Players will be forced to bet a minimum amount at this point.

We will then deal the three cards to the middle: the so-called Flop. Now each player will calculate the probability of them making a good hand from the last two unknown cards. Those with a low probability can fold. The others can bet. Again, another card will be dealt and probabilities will be calculated. People can bet or fold. The winner gets the chips.

We will play a few more rounds, without people showing their cards until the end. Then we will look at the cards and calculate the probabilities that people were playing with.
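As a rough check on Round 1, questions 1 to 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a standard 52-card deck, 16 fancy cards (four each of Ace, Jack, Queen, King), and a bluff rate of exactly 20% on every non-fancy hand. The variable names are made up for this sketch, and question 4 is left as an exercise.

```python
from fractions import Fraction

# Assumptions for this sketch: standard deck, 16 "fancy" cards (A, J, Q, K in four suits).
FANCY_CARDS = 16
DECK_SIZE = 52
BLUFF_RATE = Fraction(1, 5)  # bet on 20% of hands that are not two fancy cards

# Question 1: both hole cards are fancy (dealt without replacement).
p_fancy = Fraction(FANCY_CARDS, DECK_SIZE) * Fraction(FANCY_CARDS - 1, DECK_SIZE - 1)

# Question 2: total probability of a bet = honest bets + bluffs.
p_not_fancy = 1 - p_fancy
p_bet = p_fancy + BLUFF_RATE * p_not_fancy

# Question 3: Bayes' rule, conditioning on the fact that a bet was made.
p_bluff_given_bet = BLUFF_RATE * p_not_fancy / p_bet
p_fancy_given_bet = p_fancy / p_bet

print("P(two fancy cards)  =", p_fancy, "=", round(float(p_fancy), 4))
print("P(bet)              =", p_bet, "=", round(float(p_bet), 4))
print("P(bluffing | bet)   =", p_bluff_given_bet, "=", round(float(p_bluff_given_bet), 4))
print("P(two fancy | bet)  =", p_fancy_given_bet, "=", round(float(p_fancy_given_bet), 4))
```

Exact fractions are used so that the check in question 3 (the two posteriors summing to one) holds exactly rather than up to rounding.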

Quick Review of Bayes Theorem and Bayesian Network

Conditional Probability & Bayes Theorem:

The conditional probability of an event B is the probability that the event will occur given the knowledge that an event A has already occurred. This probability is written P(B | A), notation for the probability of B given A.

P(B | A) = P(A | B) P(B) / P(A)

Deriving Bayes theorem using conditional probability:

P(A and B) = P(A) * P(B | A)
P(B and A) = P(B) * P(A | B)

Since P(A and B) = P(B and A), equating the two gives

P(A) P(B | A) = P(B) P(A | B)

This implies

P(B | A) = P(A | B) P(B) / P(A)

Bayesian networks

- A Bayesian Network (BN) is a way of describing the relationships between causes & effects
- BNs are used to support decision making and to find strategies to solve tasks under uncertainty
- BNs use Bayesian probability theory

How are they different from others?

- Uncertainty is handled in a mathematically rigorous yet simple and efficient way.
- Network representation of problems.
- Most other methods do not include statistical information to make inferences.
- Combination of statistics and Bayesian network is powerful: Bayesian nets are a network-based framework for representing and analysing models involving uncertainty.

Recall example:

- Nodes: Random Variables
- Arcs: Causal or influential relationships
- The collection of nodes & arcs is termed the graph or topology of the BN
- For each node, an associated Node Probability Table (NPT) gives the conditional probability of each possible outcome, given combinations of outcomes from parent nodes
- BNs present causal chains (i.e. cause-effect relationships between parent and child nodes)
- Given evidence of past events, run the BN to see what the most likely future outcomes will be; they are also robust to missing info!
- Bayesian classification can help to predict information we do not know using information we do know and the likelihood of certain patterns in the data occurring (cf. learning!)
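The formula above can be wrapped in a small helper. The sketch below is illustrative only: the function name bayes and its argument names are invented here, and the numbers plugged in are the liver-disease figures from practice question 2 (A = alcoholic, B = liver disease).

```python
from fractions import Fraction


def bayes(p_a_given_b, p_b, p_a):
    """Return P(B | A) = P(A | B) * P(B) / P(A)."""
    if p_a <= 0:
        raise ValueError("P(A) must be positive to condition on A")
    return p_a_given_b * p_b / p_a


# Practice question 2, with A = alcoholic and B = liver disease.
p_alcoholic = Fraction(5, 100)
p_liver = Fraction(10, 100)
p_alcoholic_given_liver = Fraction(7, 100)

p_liver_given_alcoholic = bayes(p_alcoholic_given_liver, p_liver, p_alcoholic)

# Both factorisations of the joint probability must agree (the step used in the derivation).
joint_via_liver = p_liver * p_alcoholic_given_liver
joint_via_alcoholic = p_alcoholic * p_liver_given_alcoholic

print("P(liver disease | alcoholic) =", float(p_liver_given_alcoholic))
print("Joint probabilities agree:", joint_via_liver == joint_via_alcoholic)
```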

Example Solutions

1. What is the probability that the total of two dice will be greater than 8, given that the first die is a 6?

Let A = first die is 6
Let B = total of two dice is greater than 8

We need to determine the conditional probability P(B | A), i.e. the probability of an event (B) given that another event (A) has occurred. This can be computed by considering only outcomes for which the first die is a 6, and then determining the proportion of these outcomes that total more than 8.

All the possible outcomes for two dice can be calculated as follows. There are 36 possible outcomes when a pair of dice is thrown. If one of the dice rolled is a 1, there are six possibilities for the other die. If one of the dice rolled is a 2, the same is still true. And the same is true if one of the dice is a 3, 4, 5, or 6. If this is still confusing, look at the following (abbreviated) list of outcomes:

[(1,1),(1,2),(1,3),(1,4),(1,5),(1,6); (2,1),(2,2),(2,3), ...; (3,1),(3,2),(3,3), ...; (4,1), ...; (5,1), ...; (6,1), ...]

The total number of outcomes is 6 × 6 = 36 (or 6^2).

Now, there are 6 outcomes for which the first die is a 6: (6,1),(6,2),(6,3),(6,4),(6,5),(6,6), and of these, there are four that total more than 8. The probability of a total greater than 8 given that the first die is 6, i.e. P(B | A), is therefore 4/6 = 2/3.

Alternatively, using the definition of conditional probability:

P(A) = 1/6
Favourable outcomes for A and B: (6,3), (6,4), (6,5), (6,6)
P(A and B) = 4/36
P(B | A) = P(A and B) / P(A) = 4/36 * 6/1 = 2/3

2. A patient's probability of having liver disease increases if he is an alcoholic. Given that 10% of the patients entering the clinic have liver disease, 5% of the clinic's patients are alcoholic, and 7% of the patients diagnosed with liver disease are alcoholics, find the probability that an alcoholic patient is diagnosed with liver disease.

Given:
Probability of patients with liver disease: P(A) = 0.10
Probability of patients who are alcoholic: P(B) = 0.05
7% of patients diagnosed with liver disease (A) are alcoholics (B), i.e. P(B | A) = 0.07

The probability that an alcoholic patient (B) is diagnosed with liver disease (A) is

P(A | B) = P(B | A) P(A) / P(B) = (0.07 × 0.10) / 0.05 = 0.14
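The counting argument in solution 1 can be checked by brute-force enumeration. The following is a minimal sketch, assuming two fair six-sided dice; it computes P(B | A) both by restricting to the outcomes where the first die is 6 and by the formula P(A and B) / P(A).

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) pairs.
outcomes = list(product(range(1, 7), repeat=2))

first_is_six = [o for o in outcomes if o[0] == 6]          # event A
six_and_over_eight = [o for o in first_is_six if sum(o) > 8]  # event A and B

# Method 1: restrict the sample space to A and count.
p_by_counting = Fraction(len(six_and_over_eight), len(first_is_six))

# Method 2: P(B | A) = P(A and B) / P(A).
p_a = Fraction(len(first_is_six), len(outcomes))
p_a_and_b = Fraction(len(six_and_over_eight), len(outcomes))
p_by_formula = p_a_and_b / p_a

print("Outcomes in A and B:", six_and_over_eight)
print("P(B | A) by counting:", p_by_counting)
print("P(B | A) by formula: ", p_by_formula)
```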

3. After tossing a fair coin three times:

i. What is the probability of at least two tails?

Let H represent an outcome of heads and T represent an outcome of tails. For three tosses of the coin, all the possible outcomes are:

H-H-H, T-H-H, H-T-H, H-H-T, T-H-T, T-T-H, H-T-T, T-T-T

(that is, 2^3 = 8 outcomes, since each toss leads to two outcomes). These eight possible outcomes are the sample space.

The outcomes that have at least two tails in them are T-H-T, T-T-H, H-T-T, and T-T-T. Therefore, four of the eight outcomes have two or more tails in them. This means that the probability of throwing at least two tails in three tosses is 4 out of 8, which reduces to 1/2, or 50 percent.

OR: P(HTT U THT U TTH U TTT) = 1/8 + 1/8 + 1/8 + 1/8 = 4/8 (since the events are mutually exclusive)

ii. What is the probability of exactly one tail?

P(HHT U HTH U THH) = 1/8 + 1/8 + 1/8 = 3/8

iii. Given that at least one tail is observed, what is the probability of observing at least two tails?

A1 = event that at least one tail is observed (T ≥ 1)
A2 = event of observing at least two tails (T ≥ 2)

Intuitive answer: It would be pretty quick to just list all possible outcomes. There are 8 ways you can flip 3 coins. Only one of them (HHH) is eliminated by the condition. How many of the remaining 7 results have at least two tails? Four, so the answer is 4/7.

Using the definition of conditional probability:

P(A1) = 7/8
P(A2) = 4/8
P(A2 | A1) = P(A1 and A2) / P(A1) = (4/8) / (7/8) = 4/7

Here P(A1 and A2) = P(A2) = 4/8, because every sequence with at least two tails also has at least one tail. (To find P(T ≥ 1 and T ≥ 2), we can just list the sequences that satisfy both conditions: 4 out of the 8 sequences, since the conditions eliminate the other four: H-H-H, T-H-H, H-T-H, H-H-T.)
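The three answers in solution 3 can be reproduced by enumerating the sample space. This is a small sketch assuming a fair coin, so all eight sequences are equally likely; the helper name prob is invented for the illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 2^3 sequences of H and T.
sample_space = list(product("HT", repeat=3))


def prob(event):
    """Probability of an event, given as a predicate on one outcome."""
    favourable = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(favourable, len(sample_space))


p_at_least_two_tails = prob(lambda o: o.count("T") >= 2)   # part i
p_exactly_one_tail = prob(lambda o: o.count("T") == 1)     # part ii
p_at_least_one_tail = prob(lambda o: o.count("T") >= 1)

# Part iii: P(A2 | A1) = P(A1 and A2) / P(A1).
p_both = prob(lambda o: o.count("T") >= 1 and o.count("T") >= 2)
p_two_given_one = p_both / p_at_least_one_tail

print("i.   P(at least two tails) =", p_at_least_two_tails)
print("ii.  P(exactly one tail)   =", p_exactly_one_tail)
print("iii. P(>= 2 tails | >= 1)  =", p_two_given_one)
```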

4. 13% of patients have lung cancer and smoke; 3% of patients smoke and do not have lung cancer; 5% of patients have lung cancer and do not smoke; 79% of patients neither have lung cancer nor smoke. Draw an appropriate probability table and find the probability that a patient, picked at random, has lung cancer, given that he smokes. Also find the probability that a patient is a smoker given that he has lung cancer. From this, derive your inference.

[Venn diagram: sample space S with A only = 0.05, A and B = 0.13, B only = 0.03, neither = 0.79]

Disease Status (Joint-probabilities Table!)

                  Lung Cancer (A)   No Lung Cancer (~A)   Total
Smoker (B)        0.13              0.03                  0.16
Nonsmoker (~B)    0.05              0.79                  0.84
Total             0.18              0.82                  1.00

A = lung cancer
B = smoker

P(A | B) = 0.13/0.16 = 0.8125, which is 81.25%
P(B | A) = 0.13/0.18 = 0.7222, which is 72.22%

The inference is that P(A | B) and P(B | A) are not equal.

Q5. Consider that 0.9% of people have a genetic defect, 92% of the tests for the gene are true positives, and 9.8% of the tests are false positives. If a person gets a positive test result, what are the odds that they actually have the faulty gene?

Let P(A) = probability of having the faulty gene = 0.009 (hence, P(~A) = 0.991)
B = positive test result
P(A | B) = probability of having the gene given a positive test result
P(B | A) = probability of a positive test result given that the person has the gene = 0.92
P(B | ~A) = probability of a positive test if the person does not have the gene = 0.098

P(B) = P(B and A) + P(B and ~A)
     = P(B | A) * P(A) + P(B | ~A) * P(~A)
     = 0.92 * 0.009 + 0.098 * 0.991
     = 0.105398

P(A | B) = P(B | A) * P(A) / P(B) = (0.92 * 0.009) / 0.105398 = 0.0786, which is a 7.86% probability of having the faulty gene.

Q6. The probability that it is Wednesday and that a student is absent is 0.04. What is the probability that a student is absent given that today is Wednesday?

P(Absent | Wednesday) = P(Wednesday and Absent) / P(Wednesday) = 0.04/0.2 = 0.2, which is 20% (assuming there are five working days in a student's week!)
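The table lookup in solution 4 and the total-probability step in Q5 can both be sketched in code. The block below is only an illustration: the dictionary layout and the helper name marginal are choices made for this sketch, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

# Solution 4: joint probabilities keyed by (disease status, smoking status).
joint = {
    ("cancer", "smoker"): Fraction(13, 100),
    ("no cancer", "smoker"): Fraction(3, 100),
    ("cancer", "nonsmoker"): Fraction(5, 100),
    ("no cancer", "nonsmoker"): Fraction(79, 100),
}


def marginal(index, value):
    """Sum the joint table over the other variable (a row or column total)."""
    return sum(p for key, p in joint.items() if key[index] == value)


p_cancer = marginal(0, "cancer")   # column total
p_smoker = marginal(1, "smoker")   # row total
p_cancer_given_smoker = joint[("cancer", "smoker")] / p_smoker
p_smoker_given_cancer = joint[("cancer", "smoker")] / p_cancer

# Q5: law of total probability for P(positive), then Bayes' rule.
p_gene = Fraction(9, 1000)
p_pos_given_gene = Fraction(92, 100)
p_pos_given_no_gene = Fraction(98, 1000)
p_positive = p_pos_given_gene * p_gene + p_pos_given_no_gene * (1 - p_gene)
p_gene_given_positive = p_pos_given_gene * p_gene / p_positive

print("P(cancer | smoker) =", float(p_cancer_given_smoker))
print("P(smoker | cancer) =", round(float(p_smoker_given_cancer), 4))
print("P(positive)        =", float(p_positive))
print("P(gene | positive) =", round(float(p_gene_given_positive), 4))
```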

Q7. You go to see the doctor about an ingrowing toe-nail. The doctor selects you at random to have a blood test for swine flu, which for the purposes of this exercise we will say is currently suspected to affect 1 in 10,000 people in Australia. The test is 99% accurate, in the sense that the probability of a false positive is 1%. The probability of a false negative is zero. You test positive. What is the new probability that you have swine flu?

Now imagine that you went to a friend's wedding in Mexico recently, and (for the purposes of this exercise) it is known that 1 in 200 people who visited Mexico recently come back with swine flu. Given the same test result as above, what should your revised estimate be for the probability you have the disease?

Let P(D) be the probability one has swine flu.
Let P(T) be the probability of a positive test.
We wish to know P(D | T).

Bayes theorem says:

P(D | T) = P(T | D) P(D) / P(T)

which in this case can be rewritten as:

P(D | T) = P(T | D) P(D) / [P(T | D) P(D) + P(T | ~D) P(~D)]

where P(~D) means the probability of not having swine flu.

We have:
P(D) = 0.0001 (the a priori probability one has swine flu)
P(~D) = 0.9999
P(T | D) = 1 (if one has swine flu the test is always positive)
P(T | ~D) = 0.01 (1% chance of a false positive, i.e. the test wrongly indicates the condition is present)

Plugging these numbers in we get:

P(D | T) = (1 × 0.0001) / (1 × 0.0001 + 0.01 × 0.9999) ≈ 0.01

That is, even though the test was positive, one's chance of having swine flu is only 1%.

However, if one went to Mexico recently then the starting P(D) is 0.005. In this case

P(D | T) = (1 × 0.005) / (1 × 0.005 + 0.01 × 0.995) ≈ 0.33

and you should be a lot more worried!

Aside: Recap definitions of True/False Positives/Negatives (Confusion Matrix)

There is quite a bit of terminological confusion in this area. Many people find it useful to come back to a confusion matrix to think about this. In a classification / screening test, you can have four different situations:

                    Condition: A       Condition: Not A
Test says A         True positive      False positive
Test says Not A     False negative     True negative

In this table, true positive, false negative, false positive and true negative are events (or their probabilities). What you have is therefore probably a true positive rate and a false negative rate. The distinction matters because it emphasizes that both numbers have a numerator and a denominator.

Where things get a bit confusing is that you can find several definitions of false positive rate and false negative rate, with different denominators. For example, Wikipedia provides the following definitions (they seem pretty standard):

True positive rate (or sensitivity): TPR = TP/(TP+FN)
False positive rate: FPR = FP/(FP+TN)
True negative rate (or specificity): TNR = TN/(FP+TN)

In all cases, the denominator is the column total. This also gives a cue to their interpretation: the true positive rate is the probability that the test says A when the real value is indeed A (i.e., it is a conditional probability, conditioned on A being true). This does not tell you how likely you are to be correct when calling A (i.e., the probability of a true positive, conditioned on the test result being A).

Assuming the false negative rate is defined in the same way, we then have FNR = 1 - TPR. We cannot, however, directly derive the false positive rate from either the true positive or false negative rates, because they provide no information on the specificity, i.e., how the test behaves when not A is the correct answer. There are, however, other definitions in the literature!

NOTE:
1) True +ve and false -ve rates make 100%
2) False +ve and true -ve rates make 100%
3) There is no relation between true positives and false positives.
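To make the column-total definitions concrete, the sketch below computes the rates from a confusion matrix and then feeds a true positive rate and false positive rate into the Bayes formula used in Q7. The counts (90, 50, 10, 950) are hypothetical numbers chosen purely for illustration, and the function names rates and posterior are invented here; only the swine flu figures come from the question.

```python
def rates(tp, fp, fn, tn):
    """Return (TPR, FNR, FPR, TNR); each denominator is a column total."""
    condition_a = tp + fn        # column total: condition really is A
    condition_not_a = fp + tn    # column total: condition really is not A
    return (tp / condition_a, fn / condition_a,
            fp / condition_not_a, tn / condition_not_a)


def posterior(prior, tpr, fpr):
    """P(A | test says A) from the prior P(A), the TPR and the FPR."""
    return tpr * prior / (tpr * prior + fpr * (1 - prior))


# Hypothetical counts, for illustration only.
tp, fp, fn, tn = 90, 50, 10, 950
tpr, fnr, fpr, tnr = rates(tp, fp, fn, tn)
print("TPR + FNR =", tpr + fnr)   # note 1: the two rates share a column
print("FPR + TNR =", fpr + tnr)   # note 2: likewise for the other column

# Conditioning on the test result instead (row total) is a different question.
print("P(A | test says A) from counts =", tp / (tp + fp))

# Q7: no false negatives (TPR = 1), 1% false positives (FPR = 0.01).
p_general = posterior(prior=0.0001, tpr=1.0, fpr=0.01)
p_mexico = posterior(prior=0.005, tpr=1.0, fpr=0.01)
print("Q7, general population:", round(p_general, 4))
print("Q7, visited Mexico:    ", round(p_mexico, 4))
```

The same test gives very different posteriors because only the prior changes, which is the point of the two-part question.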