# Discrete Structures for Computer Science

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Discrete Structures for Computer Science Adam J. Lee 6111 Sennott Square Lecture #20: Bayes Theorem November 5, 2013

2 How can we incorporate prior knowledge? Sometimes we want to know the probability of some event given that another event has occurred. Example: A fair coin is flipped three times. The first flip turns up tails. Given this information, what is the probability that an odd number of tails appear in the three flips? Solution: l Let F = the first flip of three comes up tails l Let E = tails comes up an odd number of times in three flips l Since F has happened, S is reduced to {THH, THT, TTH, TTT} l By Laplace s definition of probability, we know: l p(e) = E / S l = {THH, TTT} / {THH, THT, TTH, TTT} l = 2/4 l = 1/2

3 Conditional Probability Definition: Let E and F be events with p(f) > 0. The conditional probability of E given F, denoted p(e F), is defined as: Intuition: l Think of the event F as reducing the sample space that can be considered l The numerator looks at the likelihood of the outcomes in E that overlap those in F l The denominator accounts for the reduction in sample size indicated by our prior knowledge that F has occurred

4 Bayes Theorem Bayes Theorem allows us to relate the conditional and marginal probabilities of two random events. In English: Bayes Theorem will help us assess the probability that an event occurred given only partial evidence. Doesn t our formula for conditional probability do this already? We can t always use this formula directly

5 A Motivating Example Suppose that a certain opium test correctly identifies a person who uses opiates as testing positive 99% of the time, and will correctly identify a non-user as testing negative 99% of the time. If a company suspects that 0.5% of its employees are opium users, what is the probability that an employee that tests positive for this drug is actually a user? Question: Can we use our simple conditional probability formula? X is a user X tested positive

6 The 1,000 foot view In situations like those on the last slide, Bayes theorem can help! Essentially, Bayes theorem will allow us to calculate P(E F) assuming that we know (or can derive): l P(E) l P(F E) l P(F E) Returning to our earlier example: Probability that X is a user Test success rate Probability that X is an opium user given a positive test Test false positive rate l Let E = Person X is an opium user l Let F = Person X tested positive for opium It looks like Bayes Theorem could help in this case

7 New Notation Today, we will use the notation E C to denote the complementary event of E That is: E = E C

8 A Simple Example We have two boxes. The first contains two green balls and seven red balls. The second contains four green balls and three red balls. Bob selects a ball by first choosing a box at random. He then selects one of the balls from that box at random. If Bob has selected a red ball, what is the probability that he took it from the first box? 1 2

9 Picking the problem apart First, let s define a few events relevant to this problem: l Let E = Bob has chosen a red ball l By definition E C = Bob has chosen a green ball l Let F = Bob chose his ball from this first box l Therefore, F C = Bob chose his ball from the second box We want to find the probability that Bob chose from the first box, given that he picked a red ball. That is, we want p(f E). Goal: Given that p(f E) = p(f E)/p(E), use what we know to derive p(f E) and p(e).

10 What do we know? We have two boxes. The first contains two green balls and seven red balls. The second contains four green balls and three red balls. Bob selects a ball by first choosing a box at random. He then selects one of the balls from that box at random. If Bob has selected a red ball, what is the probability that he took it from the first box? Statement: Bob selects a ball by first choosing a box at random l Bob is equally likely to choose the first box, or the second box l p(f) = p(f C ) = 1/2 Statement: The first contains two green balls and seven red balls l The first box has nine balls, seven of which are red l p(e F) = 7/9 Statement: The second contains four green balls and three red balls l The second box contains seven balls, three of which are red l p(e F C ) = 3/7

11 Now, for a little algebra The end goal: Compute p(f E) = p(f E)/p(E) Note that p(e F) = p(e F)/p(F) l If we multiply by p(f), we get p(e F) = p(e F)p(F) l Further, we know that p(e F) = 7/9 and p(f) = 1/2 l So p(e F) = 7/9 1/2 = 7/18 Recall: p(f) = p(f C ) = 1/2 p(e F) = 7/9 p(e F C ) = 3/7 Similarly, p(e F C ) = p(e F C )p(f C ) = 3/7 1/2 = 3/14 Observation: E = (E F) (E F C ) l This means that p(e) = p(e F) + p(e F C ) l = 7/18 + 3/14 l = 49/ /126 l = 76/126 l = 38/63

12 Denouement The end goal: Compute p(f E) = p(f E)/p(E) So, p(f E) = (7/18) / (38/63) Recall: p(f) = p(f C ) = 1/2 p(e F) = 7/9 p(e F C ) = 3/7 p(e F) = 7/18 p(e) = 38/63 How did we get here? 1. Extract what we could from the problem definition itself 2. Rearrange terms to derive p(f E) and p(e) 3. Use our trusty definition of conditional probability to do the rest!

13 The reasoning that we used in the last problem essentially derives Bayes Theorem for us! Bayes Theorem: Suppose that E and F are events from some sample space S such that p(e) 0 and p(f) 0. Then: Proof: l The definition of conditional probability says that p(f E) = p(f E)/p(E) p(e F) = p(e F)/p(F) l This means that p(e F) = p(f E)p(E) p(e F) = p(e F)p(F) l So p(f E)p(E) = p(e F)p(F) l Therefore, p(f E) = p(e F)p(F)/p(E)

14 Proof (continued) Note: To finish, we must prove p(e) = p(e F)p(F) + p(e F C )p(f C ) l Observe that E = E S l = E (F F C ) l = (E F) (E F C ) l Note also that (E F) and (E F C ) are disjoint (i.e., no x can be in both F and F C ) l This means that p(e) = p(e F) + p(e F C ) l We already have shown that p(e F) = p(e F)p(F) l Further, since p(e F C ) = p(e F C )/p(f C ), we have that p(e F C ) = p(e F C )p(f C ) l So p(e) = p(e F) + p(e F C ) = p(e F)p(F) + p(e F C )p(f C ) Putting everything together, we get:

15 And why is this useful? In a nutshell, Bayes Thereom is useful if you want to find p(f E), but you don t know p(e F) or p(e).

16 Here s a general solution tactic Step 1: Identify the independent events that are being investigated. For example: l F = Bob chooses the first box, F C = Bob chooses the second box l E = Bob chooses a red ball, E C = Bob chooses a green ball Step 2: Record the probabilities identified in the problem statement. For example: l p(f) = p(f C ) = 1/2 l p(e F) = 7/9 l p(e F C ) = 3/7 Step 3: Plug into Bayes formula and solve

17 Example: Pants and Skirts Suppose there is a co-ed school having 60% boys and 40% girls as students. The girl students wear trousers or skirts in equal numbers; the boys all wear trousers. An observer sees a (random) student from a distance; all they can see is that this student is wearing trousers. What is the probability this student is a girl? Step 1: Set up events l E = X is wearing pants l E C = X is wearing a skirt l F = X is a girl l F C = X is a boy Step 2: Extract probabilities from problem definition l p(f) = 0.4 l p(f C ) = 0.6 l p(e F) = p(e C F) = 0.5 l p(e F C ) = 1

18 Pants and Skirts (continued) Step 3: Plug in to Bayes Theorem l p(f E) = ( )/( ) l = 0.2/0.8 l = 1/4 Recall: p(f) = 0.4 p(f C ) = 0.6 p(e F) = p(e C F) = 0.5 p(e F C ) = 1 Conclusion: There is a 25% chance that the person seen was a girl, given that they were wearing pants.

19 Drug screening, revisited Suppose that a certain opium test correctly identifies a person who uses opiates as testing positive 99% of the time, and will correctly identify a non-user as testing negative 99% of the time. If a company suspects that 0.5% of its employees are opium users, what is the probability that an employee that tests positive for this drug is actually a user? Step 1: Set up events l F = X is an opium user l l l F C = X is not an opium user E = X tests positive for opiates E C = X tests negative for opiates Step 2: Extract probabilities from problem definition l p(f) = l p(f C ) = l p(e F) = 0.99 l p(e F C ) = 0.01

20 Drug screening (continued) Step 3: Plug in to Bayes Theorem l p(f E) = ( )/( ) l = Recall: p(f) = p(f C ) = p(e F) = 0.99 p(e F C ) = 0.01 Conclusion: If an employee tests positive for opiate use, there is only a 33% chance that they are actually an opium user!

21 Group Work! Suppose that 1 person in 100,000 has a particular rare disease. A diagnostic test is correct 99% of the time when given to someone with the disease, and is correct 99.5% of the time when given to someone without the disease. Problem 1: Calculate the probability that someone who tests positive for the disease actually has it. Problem 2: Calculate the probability that someone who tests negative for the disease does not have the disease.

22 Application: Spam filtering Definition: Spam is unsolicited bulk I didn t ask for it, I probably don t want it Sent to lots of people In recent years, spam has become increasingly problematic. For example, in 2006: l Spam accounted for > 40% of all sent l This is over 12.4 billion spam messages! To combat this problem, people have developed spam filters based on Bayes theorem!

23 How does a Bayesian spam filter work? Essentially, these filters determine the probability that a message is spam, given that it contains certain keywords. Message is spam Message contains questionable keyword In the above equation: l p(e F) = Probability that our keyword occurs in spam messages l p(e F C ) = Probability that our keyword occurs in legitimate messages l p(f) = Probability that an arbitrary message is spam l p(f C ) = Probability that an arbitrary message is legitimate Question: How do we derive these parameters?

24 We can learn these parameters by examining historical traces Imagine that we have a corpus of messages We can ask a few intelligent questions to learn the parameters of our Bayesian filter: l How many of these messages do we consider spam? p(f) l In the spam messages, how often does our keyword appear? p(e F) l In the good messages, how often does our keyword appear? p(e F C ) Aside: This is what happens every time you click the mark as spam button in your client! Given this information, we can apply Bayes theorem!

25 Filtering spam using a single keyword Suppose that the keyword Rolex occurs in 250 of 2000 known spam messages, and in 5 of 1000 known good messages. Estimate the probability that an incoming message containing the word Rolex is spam, assuming that it is equally likely that an incoming message is spam or not spam. If our threshold for classifying a message as spam is 0.9, will we reject this message? Step 1: Define events l F = message is spam l F C = message is good l E = message contains the keyword Rolex l E C = message does not contain the keyword Rolex Step 2: Gather probabilities from the problem statement l p(f) = p(f C )= 0.5 l p(e F) = 250/2000 = l p(e F C ) = 5/1000 = 0.005

26 Spam Rolexes (continued) Step 3: Plug in to Bayes Theorem l p(f E) = ( )/( ) l = 0.125/( ) l Recall: p(f) = p(f C ) = 0.5 p(e F) = p(e F C ) = Conclusion: Since the probability that our message is spam given that it contains the string Rolex is approximately > 0.9, we will discard the message.

27 Problems with this simple filter How would you choose a single keyword/phrase to use? l All natural l Nigeria l Click here l Users get upset if false positives occur, i.e., if legitimate messages are incorrectly classified as spam l When was the last time you checked your spam folder? How can we fix this? l Choose keywords s.t. p(spam keyword) is very high or very low l Filter based on multiple keywords

28 Specifically, we want to develop a Bayesian filter that tells us p(f E 1 E 2 ) First, some assumptions 1. Events E 1 and E 2 are independent 2. The events E 1 S and E 2 S are independent 3. p(f) = p(f C ) = 0.5 By Bayes theorem Now, let s derive formula for this p(f E 1 E 2 ) Assumption 3 Assumptions 1 and 2

29 Spam filtering on two keywords Suppose that we train a Bayesian spam filter on a set of 2000 spam messages and 1000 messages that are not spam. The word stock appears in 400 spam messages and 60 good messages, and the word undervalued appears in 200 spam messages and 25 good messages. Estimate the probability that a message containing the words stock and undervalued is spam. Will we reject this message if our spam threshold is set at 0.9? Step 1: Set up events l F = message is spam, F C = message is good l l E 1 = message contains the word stock E 2 = message contains the word undervalued Step 2: Identify probabilities l P(E 1 F) = 400/2000 = 0.2 l p(e 1 F C ) = 60/1000 = 0.06 l p(e 2 F) = 200/2000 = 0.1 l p(e 2 F C ) = 25/1000 = 0.025

30 Two keywords (continued) p(f E 1 E 2 )= Step 3: Plug in to Bayes Theorem l p(f E 1 E 2 ) = ( )/( ) l = 0.02/( ) l p(e 1 F )p(e 2 F ) p(e 1 F )p(e 2 F )+p(e 1 F C )p(e 2 F C ) Recall: p(e 1 F) = 0.2 p(e 1 F C ) = 0.06 p(e 2 F) = 0.1 p(e 2 F C ) = Conclusion: Since the probability that our message is spam given that it contains the strings stock and undervalued is > 0.9, we will reject this message.

31 Final Thoughts n Conditional probability is very useful n Bayes theorem l Helps us assess conditional probabilities l Has a range of important applications n Next time: l Expected values and variance (Section 7.4)

### 1 Introductory Comments. 2 Bayesian Probability

Introductory Comments First, I would like to point out that I got this material from two sources: The first was a page from Paul Graham s website at www.paulgraham.com/ffb.html, and the second was a paper

### 6.3 Conditional Probability and Independence

222 CHAPTER 6. PROBABILITY 6.3 Conditional Probability and Independence Conditional Probability Two cubical dice each have a triangle painted on one side, a circle painted on two sides and a square painted

### Using Laws of Probability. Sloan Fellows/Management of Technology Summer 2003

Using Laws of Probability Sloan Fellows/Management of Technology Summer 2003 Uncertain events Outline The laws of probability Random variables (discrete and continuous) Probability distribution Histogram

### Worked examples Basic Concepts of Probability Theory

Worked examples Basic Concepts of Probability Theory Example 1 A regular tetrahedron is a body that has four faces and, if is tossed, the probability that it lands on any face is 1/4. Suppose that one

### Math 141. Lecture 2: More Probability! Albyn Jones 1. jones@reed.edu www.people.reed.edu/ jones/courses/141. 1 Library 304. Albyn Jones Math 141

Math 141 Lecture 2: More Probability! Albyn Jones 1 1 Library 304 jones@reed.edu www.people.reed.edu/ jones/courses/141 Outline Law of total probability Bayes Theorem the Multiplication Rule, again Recall

### Probability. A random sample is selected in such a way that every different sample of size n has an equal chance of selection.

1 3.1 Sample Spaces and Tree Diagrams Probability This section introduces terminology and some techniques which will eventually lead us to the basic concept of the probability of an event. The Rare Event

### Week 2: Conditional Probability and Bayes formula

Week 2: Conditional Probability and Bayes formula We ask the following question: suppose we know that a certain event B has occurred. How does this impact the probability of some other A. This question

### Chapter 6. 1. What is the probability that a card chosen from an ordinary deck of 52 cards is an ace? Ans: 4/52.

Chapter 6 1. What is the probability that a card chosen from an ordinary deck of 52 cards is an ace? 4/52. 2. What is the probability that a randomly selected integer chosen from the first 100 positive

### V. RANDOM VARIABLES, PROBABILITY DISTRIBUTIONS, EXPECTED VALUE

V. RANDOM VARIABLES, PROBABILITY DISTRIBUTIONS, EXPETED VALUE A game of chance featured at an amusement park is played as follows: You pay \$ to play. A penny and a nickel are flipped. You win \$ if either

### The Basics of Graphical Models

The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures

### Probability OPRE 6301

Probability OPRE 6301 Random Experiment... Recall that our eventual goal in this course is to go from the random sample to the population. The theory that allows for this transition is the theory of probability.

### Bayesian Tutorial (Sheet Updated 20 March)

Bayesian Tutorial (Sheet Updated 20 March) Practice Questions (for discussing in Class) Week starting 21 March 2016 1. What is the probability that the total of two dice will be greater than 8, given that

### Bayesian Updating with Discrete Priors Class 11, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom

1 Learning Goals Bayesian Updating with Discrete Priors Class 11, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom 1. Be able to apply Bayes theorem to compute probabilities. 2. Be able to identify

### Combinatorics: The Fine Art of Counting

Combinatorics: The Fine Art of Counting Week 7 Lecture Notes Discrete Probability Continued Note Binomial coefficients are written horizontally. The symbol ~ is used to mean approximately equal. The Bernoulli

### Chapter 4 Probability

The Big Picture of Statistics Chapter 4 Probability Section 4-2: Fundamentals Section 4-3: Addition Rule Sections 4-4, 4-5: Multiplication Rule Section 4-7: Counting (next time) 2 What is probability?

### Economics 1011a: Intermediate Microeconomics

Lecture 12: More Uncertainty Economics 1011a: Intermediate Microeconomics Lecture 12: More on Uncertainty Thursday, October 23, 2008 Last class we introduced choice under uncertainty. Today we will explore

### Lesson 1. Basics of Probability. Principles of Mathematics 12: Explained! www.math12.com 314

Lesson 1 Basics of Probability www.math12.com 314 Sample Spaces: Probability Lesson 1 Part I: Basic Elements of Probability Consider the following situation: A six sided die is rolled The sample space

### Basic Probability Theory II

RECAP Basic Probability heory II Dr. om Ilvento FREC 408 We said the approach to establishing probabilities for events is to Define the experiment List the sample points Assign probabilities to the sample

### If, under a given assumption, the of a particular observed is extremely. , we conclude that the is probably not

4.1 REVIEW AND PREVIEW RARE EVENT RULE FOR INFERENTIAL STATISTICS If, under a given assumption, the of a particular observed is extremely, we conclude that the is probably not. 4.2 BASIC CONCEPTS OF PROBABILITY

### 5 =5. Since 5 > 0 Since 4 7 < 0 Since 0 0

a p p e n d i x e ABSOLUTE VALUE ABSOLUTE VALUE E.1 definition. The absolute value or magnitude of a real number a is denoted by a and is defined by { a if a 0 a = a if a

### Chapter 4 & 5 practice set. The actual exam is not multiple choice nor does it contain like questions.

Chapter 4 & 5 practice set. The actual exam is not multiple choice nor does it contain like questions. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

### Probability: Terminology and Examples Class 2, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom

Probability: Terminology and Examples Class 2, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom 1 Learning Goals 1. Know the definitions of sample space, event and probability function. 2. Be able to

### Part III: Machine Learning. CS 188: Artificial Intelligence. Machine Learning This Set of Slides. Parameter Estimation. Estimation: Smoothing

CS 188: Artificial Intelligence Lecture 20: Dynamic Bayes Nets, Naïve Bayes Pieter Abbeel UC Berkeley Slides adapted from Dan Klein. Part III: Machine Learning Up until now: how to reason in a model and

Adjust Webmail Spam Settings An unsolicited bulk email message is known as "spam." Spam, which usually contains some sort of commercial advertising or proposition, is sent to a large number of recipients

### Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Note 11

CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Note Conditional Probability A pharmaceutical company is marketing a new test for a certain medical condition. According

### Problem sets for BUEC 333 Part 1: Probability and Statistics

Problem sets for BUEC 333 Part 1: Probability and Statistics I will indicate the relevant exercises for each week at the end of the Wednesday lecture. Numbered exercises are back-of-chapter exercises from

### Bayes and Naïve Bayes. cs534-machine Learning

Bayes and aïve Bayes cs534-machine Learning Bayes Classifier Generative model learns Prediction is made by and where This is often referred to as the Bayes Classifier, because of the use of the Bayes rule

### Spam Filtering using Naïve Bayesian Classification

Spam Filtering using Naïve Bayesian Classification Presented by: Samer Younes Outline What is spam anyway? Some statistics Why is Spam a Problem Major Techniques for Classifying Spam Transport Level Filtering

### I. WHAT IS PROBABILITY?

C HAPTER 3 PROBABILITY Random Experiments I. WHAT IS PROBABILITY? The weatherman on 0 o clock news program states that there is a 20% chance that it will snow tomorrow, a 65% chance that it will rain and

### 1 Introduction to Option Pricing

ESTM 60202: Financial Mathematics Alex Himonas 03 Lecture Notes 1 October 7, 2009 1 Introduction to Option Pricing We begin by defining the needed finance terms. Stock is a certificate of ownership of

### p-values and significance levels (false positive or false alarm rates)

p-values and significance levels (false positive or false alarm rates) Let's say 123 people in the class toss a coin. Call it "Coin A." There are 65 heads. Then they toss another coin. Call it "Coin B."

### Lecture 16: Expected value, variance, independence and Chebyshev inequality

Lecture 16: Expected value, variance, independence and Chebyshev inequality Expected value, variance, and Chebyshev inequality. If X is a random variable recall that the expected value of X, E[X] is the

### Question: What is the probability that a five-card poker hand contains a flush, that is, five cards of the same suit?

ECS20 Discrete Mathematics Quarter: Spring 2007 Instructor: John Steinberger Assistant: Sophie Engle (prepared by Sophie Engle) Homework 8 Hints Due Wednesday June 6 th 2007 Section 6.1 #16 What is the

### Pattern matching probabilities and paradoxes A new variation on Penney s coin game

Osaka Keidai Ronshu, Vol. 63 No. 4 November 2012 Pattern matching probabilities and paradoxes A new variation on Penney s coin game Yutaka Nishiyama Abstract This paper gives an outline of an interesting

### Chapter 4 - Practice Problems 1

Chapter 4 - Practice Problems SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Provide an appropriate response. ) Compare the relative frequency formula

### Math 55: Discrete Mathematics

Math 55: Discrete Mathematics UC Berkeley, Fall 2011 Homework # 7, due Wedneday, March 14 Happy Pi Day! (If any errors are spotted, please email them to morrison at math dot berkeley dot edu..5.10 A croissant

### How to Conduct a Hypothesis Test

How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some

### Recursive Estimation

Recursive Estimation Raffaello D Andrea Spring 04 Problem Set : Bayes Theorem and Bayesian Tracking Last updated: March 8, 05 Notes: Notation: Unlessotherwisenoted,x, y,andz denoterandomvariables, f x

### Email Spam Detection A Machine Learning Approach

Email Spam Detection A Machine Learning Approach Ge Song, Lauren Steimle ABSTRACT Machine learning is a branch of artificial intelligence concerned with the creation and study of systems that can learn

### REPEATED TRIALS. The probability of winning those k chosen times and losing the other times is then p k q n k.

REPEATED TRIALS Suppose you toss a fair coin one time. Let E be the event that the coin lands heads. We know from basic counting that p(e) = 1 since n(e) = 1 and 2 n(s) = 2. Now suppose we play a game

### CSE 473: Artificial Intelligence Autumn 2010

CSE 473: Artificial Intelligence Autumn 2010 Machine Learning: Naive Bayes and Perceptron Luke Zettlemoyer Many slides over the course adapted from Dan Klein. 1 Outline Learning: Naive Bayes and Perceptron

### Regression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur. Lecture - 2 Simple Linear Regression

Regression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur Lecture - 2 Simple Linear Regression Hi, this is my second lecture in module one and on simple

### MAS113 Introduction to Probability and Statistics

MAS113 Introduction to Probability and Statistics 1 Introduction 1.1 Studying probability theory There are (at least) two ways to think about the study of probability theory: 1. Probability theory is a

### 2DI36 Statistics. 2DI36 Part II (Chapter 7 of MR)

2DI36 Statistics 2DI36 Part II (Chapter 7 of MR) What Have we Done so Far? Last time we introduced the concept of a dataset and seen how we can represent it in various ways But, how did this dataset came

### Random variables, probability distributions, binomial random variable

Week 4 lecture notes. WEEK 4 page 1 Random variables, probability distributions, binomial random variable Eample 1 : Consider the eperiment of flipping a fair coin three times. The number of tails that

### Introduction to Bayesian Classification (A Practical Discussion) Todd Holloway Lecture for B551 Nov. 27, 2007

Introduction to Bayesian Classification (A Practical Discussion) Todd Holloway Lecture for B551 Nov. 27, 2007 Naïve Bayes Components ML vs. MAP Benefits Feature Preparation Filtering Decay Extended Examples

### FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL STATIsTICs 4 IV. RANDOm VECTORs 1. JOINTLY DIsTRIBUTED RANDOm VARIABLEs If are two rom variables defined on the same sample space we define the joint

### Lecture 2: Introduction to belief (Bayesian) networks

Lecture 2: Introduction to belief (Bayesian) networks Conditional independence What is a belief network? Independence maps (I-maps) January 7, 2008 1 COMP-526 Lecture 2 Recall from last time: Conditional

### Breaking The Code. Ryan Lowe. Ryan Lowe is currently a Ball State senior with a double major in Computer Science and Mathematics and

Breaking The Code Ryan Lowe Ryan Lowe is currently a Ball State senior with a double major in Computer Science and Mathematics and a minor in Applied Physics. As a sophomore, he took an independent study

### Chapter 3: DISCRETE RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS. Part 3: Discrete Uniform Distribution Binomial Distribution

Chapter 3: DISCRETE RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS Part 3: Discrete Uniform Distribution Binomial Distribution Sections 3-5, 3-6 Special discrete random variable distributions we will cover

### Savita Teli 1, Santoshkumar Biradar 2

Effective Spam Detection Method for Email Savita Teli 1, Santoshkumar Biradar 2 1 (Student, Dept of Computer Engg, Dr. D. Y. Patil College of Engg, Ambi, University of Pune, M.S, India) 2 (Asst. Proff,

### Solutions: Problems for Chapter 3. Solutions: Problems for Chapter 3

Problem A: You are dealt five cards from a standard deck. Are you more likely to be dealt two pairs or three of a kind? experiment: choose 5 cards at random from a standard deck Ω = {5-combinations of

### RANDOM VARIABLES MATH CIRCLE (ADVANCED) 3/3/2013. 3 k) ( 52 3 )

RANDOM VARIABLES MATH CIRCLE (ADVANCED) //0 0) a) Suppose you flip a fair coin times. i) What is the probability you get 0 heads? ii) head? iii) heads? iv) heads? For = 0,,,, P ( Heads) = ( ) b) Suppose

### Machine Learning Math Essentials

Machine Learning Math Essentials Jeff Howbert Introduction to Machine Learning Winter 2012 1 Areas of math essential to machine learning Machine learning is part of both statistics and computer science

### Confidence intervals, t tests, P values

Confidence intervals, t tests, P values Joe Felsenstein Department of Genome Sciences and Department of Biology Confidence intervals, t tests, P values p.1/31 Normality Everybody believes in the normal

### MINITAB ASSISTANT WHITE PAPER

MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

### Setting up Microsoft Outlook to reject unsolicited email (UCE or Spam )

Reference : USER 191 Issue date : January 2004 Updated : January 2008 Classification : Staff Authors : Matt Vernon, Richard Rogers Setting up Microsoft Outlook to reject unsolicited email (UCE or Spam

### Outline. Probability. Random Variables. Joint and Marginal Distributions Conditional Distribution Product Rule, Chain Rule, Bayes Rule.

Probability 1 Probability Random Variables Outline Joint and Marginal Distributions Conditional Distribution Product Rule, Chain Rule, Bayes Rule Inference Independence 2 Inference in Ghostbusters A ghost

### Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.

Introduction to Hypothesis Testing CHAPTER 8 LEARNING OBJECTIVES After reading this chapter, you should be able to: 1 Identify the four steps of hypothesis testing. 2 Define null hypothesis, alternative

### A natural introduction to probability theory. Ronald Meester

A natural introduction to probability theory Ronald Meester ii Contents Preface v 1 Experiments 1 1.1 Definitions and examples........................ 1 1.2 Counting and combinatorics......................

### Math 58. Rumbos Fall 2008 1. Solutions to Review Problems for Exam 2

Math 58. Rumbos Fall 2008 1 Solutions to Review Problems for Exam 2 1. For each of the following scenarios, determine whether the binomial distribution is the appropriate distribution for the random variable

### eprism Email Security Appliance 6.0 Intercept Anti-Spam Quick Start Guide

eprism Email Security Appliance 6.0 Intercept Anti-Spam Quick Start Guide This guide is designed to help the administrator configure the eprism Intercept Anti-Spam engine to provide a strong spam protection

### Name: Date: Use the following to answer questions 2-4:

Name: Date: 1. A phenomenon is observed many, many times under identical conditions. The proportion of times a particular event A occurs is recorded. What does this proportion represent? A) The probability

### 36 Odds, Expected Value, and Conditional Probability

36 Odds, Expected Value, and Conditional Probability What s the difference between probabilities and odds? To answer this question, let s consider a game that involves rolling a die. If one gets the face

### Economics 1011a: Intermediate Microeconomics

Lecture 11: Choice Under Uncertainty Economics 1011a: Intermediate Microeconomics Lecture 11: Choice Under Uncertainty Tuesday, October 21, 2008 Last class we wrapped up consumption over time. Today we

### α = u v. In other words, Orthogonal Projection

Orthogonal Projection Given any nonzero vector v, it is possible to decompose an arbitrary vector u into a component that points in the direction of v and one that points in a direction orthogonal to v

### PROBABILITY. The theory of probabilities is simply the Science of logic quantitatively treated. C.S. PEIRCE

PROBABILITY 53 Chapter 3 PROBABILITY The theory of probabilities is simply the Science of logic quantitatively treated. C.S. PEIRCE 3. Introduction In earlier Classes, we have studied the probability as

### Webmail Friends & Exceptions Guide

Webmail Friends & Exceptions Guide Add email addresses to the Exceptions List and the Friends List in your Webmail account to ensure you receive email messages from family, friends, and other important

### Question 2 Naïve Bayes (16 points)

Question 2 Naïve Bayes (16 points) About 2/3 of your email is spam so you downloaded an open source spam filter based on word occurrences that uses the Naive Bayes classifier. Assume you collected the

Name: University of Chicago Graduate School of Business Business 41000: Business Statistics Special Notes: 1. This is a closed-book exam. You may use an 8 11 piece of paper for the formulas. 2. Throughout

### Communications Express E-mail Filters

Communications Express E-mail Filters Last modified on 10/05/2006 NOTE ON SPAM: Our spam filtering system (PureMessage) rates the likeliness of messages from off campus being spam. For those messages with

### Machine Learning. CS 188: Artificial Intelligence Naïve Bayes. Example: Digit Recognition. Other Classification Tasks

CS 188: Artificial Intelligence Naïve Bayes Machine Learning Up until now: how use a model to make optimal decisions Machine learning: how to acquire a model from data / experience Learning parameters

### Ch. 13.3: More about Probability

Ch. 13.3: More about Probability Complementary Probabilities Given any event, E, of some sample space, U, of a random experiment, we can always talk about the complement, E, of that event: this is the

### If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C?

Problem 3 If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C? Suggested Questions to ask students about Problem 3 The key to this question

### Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #15 Special Distributions-VI Today, I am going to introduce

### Groundbreaking Technology Redefines Spam Prevention. Analysis of a New High-Accuracy Method for Catching Spam

Groundbreaking Technology Redefines Spam Prevention Analysis of a New High-Accuracy Method for Catching Spam October 2007 Introduction Today, numerous companies offer anti-spam solutions. Most techniques

### Lecture 9: Bayesian hypothesis testing

Lecture 9: Bayesian hypothesis testing 5 November 27 In this lecture we ll learn about Bayesian hypothesis testing. 1 Introduction to Bayesian hypothesis testing Before we go into the details of Bayesian

### A Second Course in Mathematics Concepts for Elementary Teachers: Theory, Problems, and Solutions

A Second Course in Mathematics Concepts for Elementary Teachers: Theory, Problems, and Solutions Marcel B. Finan Arkansas Tech University c All Rights Reserved First Draft February 8, 2006 1 Contents 25

### Likelihood: Frequentist vs Bayesian Reasoning

"PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B University of California, Berkeley Spring 2009 N Hallinan Likelihood: Frequentist vs Bayesian Reasoning Stochastic odels and

### Chapter 4 Lecture Notes

Chapter 4 Lecture Notes Random Variables October 27, 2015 1 Section 4.1 Random Variables A random variable is typically a real-valued function defined on the sample space of some experiment. For instance,

### E3: PROBABILITY AND STATISTICS lecture notes

E3: PROBABILITY AND STATISTICS lecture notes 2 Contents 1 PROBABILITY THEORY 7 1.1 Experiments and random events............................ 7 1.2 Certain event. Impossible event............................

### Session 8 Probability

Key Terms for This Session Session 8 Probability Previously Introduced frequency New in This Session binomial experiment binomial probability model experimental probability mathematical probability outcome

### Inner product. Definition of inner product

Math 20F Linear Algebra Lecture 25 1 Inner product Review: Definition of inner product. Slide 1 Norm and distance. Orthogonal vectors. Orthogonal complement. Orthogonal basis. Definition of inner product

### Summary of Probability

Summary of Probability Mathematical Physics I Rules of Probability The probability of an event is called P(A), which is a positive number less than or equal to 1. The total probability for all possible

### Lesson 1: Experimental and Theoretical Probability

Lesson 1: Experimental and Theoretical Probability Probability is the study of randomness. For instance, weather is random. In probability, the goal is to determine the chances of certain events happening.

### PROOFPOINT - EMAIL SPAM FILTER

416 Morrill Hall of Agriculture Hall Michigan State University 517-355-3776 http://support.anr.msu.edu support@anr.msu.edu PROOFPOINT - EMAIL SPAM FILTER Contents PROOFPOINT - EMAIL SPAM FILTER... 1 INTRODUCTION...

### Math 313 Lecture #10 2.2: The Inverse of a Matrix

Math 1 Lecture #10 2.2: The Inverse of a Matrix Matrix algebra provides tools for creating many useful formulas just like real number algebra does. For example, a real number a is invertible if there is

### " Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

### Conditional Probability, Independence and Bayes Theorem Class 3, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom

Conditional Probability, Independence and Bayes Theorem Class 3, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom 1 Learning Goals 1. Know the definitions of conditional probability and independence

### MATHEMATICS FOR ENGINEERS STATISTICS TUTORIAL 4 PROBABILITY DISTRIBUTIONS

MATHEMATICS FOR ENGINEERS STATISTICS TUTORIAL 4 PROBABILITY DISTRIBUTIONS CONTENTS Sample Space Accumulative Probability Probability Distributions Binomial Distribution Normal Distribution Poisson Distribution

### Example. A casino offers the following bets (the fairest bets in the casino!) 1 You get \$0 (i.e., you can walk away)

: Three bets Math 45 Introduction to Probability Lecture 5 Kenneth Harris aharri@umich.edu Department of Mathematics University of Michigan February, 009. A casino offers the following bets (the fairest

### calculating probabilities

4 calculating probabilities Taking Chances What s the probability he s remembered I m allergic to non-precious metals? Life is full of uncertainty. Sometimes it can be impossible to say what will happen

### + Section 6.2 and 6.3

Section 6.2 and 6.3 Learning Objectives After this section, you should be able to DEFINE and APPLY basic rules of probability CONSTRUCT Venn diagrams and DETERMINE probabilities DETERMINE probabilities

### Lecture 13. Understanding Probability and Long-Term Expectations

Lecture 13 Understanding Probability and Long-Term Expectations Thinking Challenge What s the probability of getting a head on the toss of a single fair coin? Use a scale from 0 (no way) to 1 (sure thing).

### Expected Value. 24 February 2014. Expected Value 24 February 2014 1/19

Expected Value 24 February 2014 Expected Value 24 February 2014 1/19 This week we discuss the notion of expected value and how it applies to probability situations, including the various New Mexico Lottery

### MAT 1000. Mathematics in Today's World

MAT 1000 Mathematics in Today's World We talked about Cryptography Last Time We will talk about probability. Today There are four rules that govern probabilities. One good way to analyze simple probabilities

### Intercept Anti-Spam Quick Start Guide

Intercept Anti-Spam Quick Start Guide Software Version: 6.5.2 Date: 5/24/07 PREFACE...3 PRODUCT DOCUMENTATION...3 CONVENTIONS...3 CONTACTING TECHNICAL SUPPORT...4 COPYRIGHT INFORMATION...4 OVERVIEW...5

### Tweaking Naïve Bayes classifier for intelligent spam detection

682 Tweaking Naïve Bayes classifier for intelligent spam detection Ankita Raturi 1 and Sunil Pranit Lal 2 1 University of California, Irvine, CA 92697, USA. araturi@uci.edu 2 School of Computing, Information