# Conditional probability

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 191 Conditional probability Once the agent has obtained some evidence concerning the previously unknown random variables making up the domain, we have to switch to using conditional (posterior) probabilities P(a b) is the probability of proposition a, given that all we know is b P(cavity toothache) = 0.8 P(cavity) = P(cavity ) We can express conditional probabilities in terms of unconditional probabilities: P( a b) P( a b) P( b) whenever P(b) > The previous equation can also be written as the product rule: P(a b) = P(a b) P(b) We can, of course, have the rule the other way around P(a b) = P(b a) P(a) Conditional distributions: P(X Y) P(X = x i Y = y j ) i, j By the product rule P(X,Y) = P(X Y) P(Y) (entry-by-entry, not a matrix multiplication) 1

2 Probability Axioms The axiomatization of probability theory by Kolmogorov (1933) based on three simple axioms 1. For any proposition a the probability is in between 0 and 1: 0 P(a) 1 2. Necessarily true (i.e., valid) propositions have probability 1 and necessarily false (i.e., unsatisfiable) propositions have probability 0: P(true) = 1 P(false) = 0 3. The probability of a disjunction is given by the inclusion-exclusion principle P(a b) = P(a) + P(b) - P(a b) 194 We can derive a variety of useful facts from the basic axioms; e.g.: P(a a) = P(a) + P( a) - P(a a) P(true) = P(a) + P( a) - P(false) 1 = P(a) + P( a) P( a) = 1 - P(a) The fact of the third line can be extended for a discrete variable D with the domain d 1,,d n : i=1,,n P(D = d i ) = 1 For a continuous variable X the summation is replaced by an integral: P( X x) dx 1 2

3 195 The probability distribution on a single variable must sum to 1 It is also true that any joint probability distribution on any set of variables must sum to 1 Recall that any proposition a is equivalent to the disjunction of all the atomic events in which a holds Call this set of events e(a) Atomic events are mutually exclusive, so the probability of any conjunction of atomic events is zero, by axiom 2 Hence, from axiom 3 P(a) = ei e(a) P(e i ) Given a full joint distribution that specifies the probabilities of all atomic events, this equation provides a simple method for computing the probability of any proposition Inference Using Full Joint Distribution cavity toothache catch catch toothache catch catch cavity E.g., there are six atomic events for cavity toothache: = 0.28 Extracting the distribution over a variable (or some subset of variables), marginal probability, is attained by adding the entries in the corresponding rows or columns E.g., P(cavity) = = 0.2 We can write the following general marginalization (summing out) rule for any sets of variables Y and Z: P(Y) = z Z P(Y, z) 3

4 toothache toothache 197 catch catch catch catch cavity cavity Computing a conditional probability P(cavity toothache) = P(cavity toothache)/p(toothache) = ( )/( ) = 0.12/0.2 = 0.6 Respectively P( cavity toothache) = ( )/0.2 = 0.4 The two probabilities sum up to one, as they should 198 1/P(toothache) = 1/0.2 = 5 is a normalization constant ensuring that the distribution P(Cavity toothache) adds up to 1 Let denote the normalization constant P(Cavity toothache) = P(Cavity, toothache) = (P(Cavity, toothache, catch) + P(Cavity, toothache, catch)) = ([0.108, 0.016] + [0.012, 0.064]) = [0.12, 0.08] = [0.6, 0.4] In other words, we can calculate the conditional probability distribution without knowing P(toothache) using normalization 4

5 199 More generally: we need to find out the distribution of the query variable X (Cavity), evidence variables E (Toothache) have observed values e, and the remaining unobserved variables are Y (Catch) Evaluation of a query: P(X e) = P(X, e) = y P(X, e, y), where the summation is over all possible ys; i.e., all possible combinations of values of the unobserved variables Y 200 P(X, e, y) is simply a subset of the joint probability distribution of variables X, E, and Y X, E, and Y together constitute the complete set of variables for the domain Given the full joint distribution to work with, the equation in the previous slide can answer probabilistic queries for discrete variables It does not scale well For a domain described by n Boolean variables, it requires an input table of size O(2 n ) and takes O(2 n ) time to process the table In realistic problems the approach is completely impractical 5

6 Independence If we expand the previous example with a fourth random variable Weather, which has four possible values, we have to copy the table of joint probabilities four times to have 32 entries together Dental problems have no influence on the weather, hence: P(Weather = cloudy toothache, catch, cavity) = P(Weather = cloudy) By this observation and product rule P(toothache, catch, cavity, Weather = cloudy) = P(Weather = cloudy) P(toothache, catch, cavity) 202 A similar equation holds for the other values of the variable Weather, and hence P(Toothache, Catch, Cavity, Weather) = P(Toothache, Catch, Cavity) P(Weather) The required joint distribution tables have 8 and 4 elements Propositions a and b are independent if P(a b) = P(a) P(b a) = P(b) P(a b) = P(a) P(b) Respectively variables X and Y are independent of each other if P(X Y) = P(X) P(Y X) = P(Y) P(X, Y) = P(X)P(Y) Independent coin flips: P(C 1,, C n ) can be represented as the product of n single-variable distributions P(C i ) 6

7 Bayes Rule and Its Use By the product rule P(a b) = P(a b) P(b) and the commutativity of conjunction P(a b) = P(b a) P(a) Equating the two right-hand sides and dividing by P(a), we get the Bayes rule P(b a) = P(a b) P(b) / P(a) The more general case of multivalued variables X and Y conditionalized on some background evidence e P(Y X, e) = P(X Y, e) P(Y e) / P(X e) Using normalization Bayes rule can be written as P(Y X) = P(X Y) P(Y) 204 Half of meningitis patients have a stiff neck P(s m) = 0.5 The prior probability of meningitis is 1 / : P(m) = 1/ Every 20th patient complains about a stiff neck P(s) = 1/20 What is the probability that a patient complaining about a stiff neck has meningitis? P(m s) = P(s m) P(m) / P(s) = 20 / ( ) =

8 205 Perhaps the doctor knows that a stiff neck implies meningitis in 1 out of cases The doctor, hence, has quantitative information in the diagnostic direction from symptoms to causes, and no need to use Bayes rule Unfortunately, diagnostic knowledge is often more fragile than causal knowledge If there is a sudden epidemic of meningitis, the unconditional probability of meningitis P(m) will go up The conditional probability P(s m), however, stays the same The doctor who derived diagnostic probability P(m s) directly from statistical observation of patients before the epidemic will have no idea how to update the value The doctor who computes P(m s) from the other three values will see P(m s) go up proportionally with P(m) 206 All modern probabilistic inference systems are based on the use of Bayes rule On the surface the relatively simple rule does not seem very useful However, as the previous example illustrates, Bayes rule gives a chance to apply existing knowledge We can avoid assessing the probability of the evidence P(s) by instead computing a posterior probability for each value of the query variable m and m and then normalizing the result P(M s) = [ P(s m) P(m), P(s m) P( m) ] Thus, we need to estimate P(s m) instead of P(s) Sometimes easier, sometimes harder 8

9 207 When a probabilistic query has more than one piece of evidence the approach based on full joint probability will not scale up P(Cavity toothache catch) Neither will applying Bayes rule scale up in general P(toothache catch Cavity) P(Cavity) We would need variables to be independent, but variable Toothache and Catch obviously are not: if the probe catches in the tooth, it probably has a cavity and that probably cases a toothache Each is directly caused by the cavity, but neither has a direct effect on the other catch and toothache are conditionally independent given Cavity 208 Conditional independence: P(toothache catch Cavity) = P(toothache Cavity) P(catch Cavity) Plugging this into Bayes rule yields P(Cavity toothache catch) = P(Cavity) P(toothache Cavity) P(catch Cavity) Now we only need three separate distributions The general definition of conditional independence of variables X and Y, given a third variable Z is P(X, Y Z) = P(X Z) P(Y Z) Equivalently, P(X Y, Z) = P(X Z) and P(Y X, Z) = P(Y Z) 9

10 209 If all effects are conditionally independent given a single cause, the exponential size of knowledge representation is cut to linear A probability distribution is called a naïve Bayes model if all effects E 1,, E n are conditionally independent, given a single cause C The full joint probability distribution can be written as P(C, E 1,, E n ) = P(C) i P(E i C) It is often used as a simplifying assumption even in cases where the effect variables are not conditionally independent given the cause variable In practice, naïve Bayes systems can work surprisingly well, even when the independence assumption is not true 10

### 13.3 Inference Using Full Joint Distribution

191 The probability distribution on a single variable must sum to 1 It is also true that any joint probability distribution on any set of variables must sum to 1 Recall that any proposition a is equivalent

### Outline. Uncertainty. Uncertainty. Making decisions under uncertainty. Probability. Methods for handling uncertainty

Outline Uncertainty Chapter 13 Uncertainty Probability Syntax and Semantics Inference Independence and Bayes' Rule Uncertainty Let action A t = leave for airport t minutes before flight Will A t get me

### Artificial Intelligence

Artificial Intelligence ICS461 Fall 2010 Nancy E. Reed nreed@hawaii.edu 1 Lecture #13 Uncertainty Outline Uncertainty Probability Syntax and Semantics Inference Independence and Bayes' Rule Uncertainty

### Basics of Probability

Basics of Probability 1 Sample spaces, events and probabilities Begin with a set Ω the sample space e.g., 6 possible rolls of a die. ω Ω is a sample point/possible world/atomic event A probability space

### / Informatica. Intelligent Systems. Agents dealing with uncertainty. Example: driving to the airport. Why logic fails

Intelligent Systems Prof. dr. Paul De Bra Technische Universiteit Eindhoven debra@win.tue.nl Agents dealing with uncertainty Uncertainty Probability Syntax and Semantics Inference Independence and Bayes'

### Logic, Probability and Learning

Logic, Probability and Learning Luc De Raedt luc.deraedt@cs.kuleuven.be Overview Logic Learning Probabilistic Learning Probabilistic Logic Learning Closely following : Russell and Norvig, AI: a modern

### CS 331: Artificial Intelligence Fundamentals of Probability II. Joint Probability Distribution. Marginalization. Marginalization

Full Joint Probability Distributions CS 331: Artificial Intelligence Fundamentals of Probability II Toothache Cavity Catch Toothache, Cavity, Catch false false false 0.576 false false true 0.144 false

### Uncertainty. Chapter 13. Chapter 13 1

Uncertainty Chapter 13 Chapter 13 1 Attribution Modified from Stuart Russell s slides (Berkeley) Chapter 13 2 Outline Uncertainty Probability Syntax and Semantics Inference Independence and Bayes Rule

### Marginal Independence and Conditional Independence

Marginal Independence and Conditional Independence Computer Science cpsc322, Lecture 26 (Textbook Chpt 6.1-2) March, 19, 2010 Recap with Example Marginalization Conditional Probability Chain Rule Lecture

### Outline. Probability. Random Variables. Joint and Marginal Distributions Conditional Distribution Product Rule, Chain Rule, Bayes Rule.

Probability 1 Probability Random Variables Outline Joint and Marginal Distributions Conditional Distribution Product Rule, Chain Rule, Bayes Rule Inference Independence 2 Inference in Ghostbusters A ghost

### 22c:145 Artificial Intelligence

22c:145 Artificial Intelligence Fall 2005 Uncertainty Cesare Tinelli The University of Iowa Copyright 2001-05 Cesare Tinelli and Hantao Zhang. a a These notes are copyrighted material and may not be used

### Lecture 2: Introduction to belief (Bayesian) networks

Lecture 2: Introduction to belief (Bayesian) networks Conditional independence What is a belief network? Independence maps (I-maps) January 7, 2008 1 COMP-526 Lecture 2 Recall from last time: Conditional

### CS 771 Artificial Intelligence. Reasoning under uncertainty

CS 771 Artificial Intelligence Reasoning under uncertainty Today Representing uncertainty is useful in knowledge bases Probability provides a coherent framework for uncertainty Review basic concepts in

### Data Modeling & Analysis Techniques. Probability & Statistics. Manfred Huber 2011 1

Data Modeling & Analysis Techniques Probability & Statistics Manfred Huber 2011 1 Probability and Statistics Probability and statistics are often used interchangeably but are different, related fields

### Probability, Conditional Independence

Probability, Conditional Independence June 19, 2012 Probability, Conditional Independence Probability Sample space Ω of events Each event ω Ω has an associated measure Probability of the event P(ω) Axioms

### A crash course in probability and Naïve Bayes classification

Probability theory A crash course in probability and Naïve Bayes classification Chapter 9 Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s

### CS 188: Artificial Intelligence. Probability recap

CS 188: Artificial Intelligence Bayes Nets Representation and Independence Pieter Abbeel UC Berkeley Many slides over this course adapted from Dan Klein, Stuart Russell, Andrew Moore Conditional probability

### Introduction to Probability (1)

Introduction to Probability (1) John Kelleher & Brian Mac Namee 1 Introduction Summarizing Uncertainty 2 Where Do Probabilities Come From? 3 Basic Probability Notation Basics 4 The Axioms of Probabilities

### Artificial Intelligence Mar 27, Bayesian Networks 1 P (T D)P (D) + P (T D)P ( D) =

Artificial Intelligence 15-381 Mar 27, 2007 Bayesian Networks 1 Recap of last lecture Probability: precise representation of uncertainty Probability theory: optimal updating of knowledge based on new information

### Probability Review. ICPSR Applied Bayesian Modeling

Probability Review ICPSR Applied Bayesian Modeling Random Variables Flip a coin. Will it be heads or tails? The outcome of a single event is random, or unpredictable What if we flip a coin 10 times? How

### Artificial Intelligence. Conditional probability. Inference by enumeration. Independence. Lesson 11 (From Russell & Norvig)

Artificial Intelligence Conditional probability Conditional or posterior probabilities e.g., cavity toothache) = 0.8 i.e., given that toothache is all I know tation for conditional distributions: Cavity

### E3: PROBABILITY AND STATISTICS lecture notes

E3: PROBABILITY AND STATISTICS lecture notes 2 Contents 1 PROBABILITY THEORY 7 1.1 Experiments and random events............................ 7 1.2 Certain event. Impossible event............................

### Probability & Statistics Primer Gregory J. Hakim University of Washington 2 January 2009 v2.0

Probability & Statistics Primer Gregory J. Hakim University of Washington 2 January 2009 v2.0 This primer provides an overview of basic concepts and definitions in probability and statistics. We shall

### Bayesian Networks. Mausam (Slides by UW-AI faculty)

Bayesian Networks Mausam (Slides by UW-AI faculty) Bayes Nets In general, joint distribution P over set of variables (X 1 x... x X n ) requires exponential space for representation & inference BNs provide

### The Basics of Graphical Models

The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures

### + Section 6.2 and 6.3

Section 6.2 and 6.3 Learning Objectives After this section, you should be able to DEFINE and APPLY basic rules of probability CONSTRUCT Venn diagrams and DETERMINE probabilities DETERMINE probabilities

### Bayesian Tutorial (Sheet Updated 20 March)

Bayesian Tutorial (Sheet Updated 20 March) Practice Questions (for discussing in Class) Week starting 21 March 2016 1. What is the probability that the total of two dice will be greater than 8, given that

### CHAPTER 2 Estimating Probabilities

CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a

### Querying Joint Probability Distributions

Querying Joint Probability Distributions Sargur Srihari srihari@cedar.buffalo.edu 1 Queries of Interest Probabilistic Graphical Models (BNs and MNs) represent joint probability distributions over multiple

### Bayesian Networks Chapter 14. Mausam (Slides by UW-AI faculty & David Page)

Bayesian Networks Chapter 14 Mausam (Slides by UW-AI faculty & David Page) Bayes Nets In general, joint distribution P over set of variables (X 1 x... x X n ) requires exponential space for representation

### Statistics in Geophysics: Introduction and Probability Theory

Statistics in Geophysics: Introduction and Steffen Unkel Department of Statistics Ludwig-Maximilians-University Munich, Germany Winter Term 2013/14 1/32 What is Statistics? Introduction Statistics is the

### Probability Theory. Elementary rules of probability Sum rule. Product rule. p. 23

Probability Theory Uncertainty is key concept in machine learning. Probability provides consistent framework for the quantification and manipulation of uncertainty. Probability of an event is the fraction

### Proof: A logical argument establishing the truth of the theorem given the truth of the axioms and any previously proven theorems.

Math 232 - Discrete Math 2.1 Direct Proofs and Counterexamples Notes Axiom: Proposition that is assumed to be true. Proof: A logical argument establishing the truth of the theorem given the truth of the

### Ching-Han Hsu, BMES, National Tsing Hua University c 2014 by Ching-Han Hsu, Ph.D., BMIR Lab

Lecture 2 Probability BMIR Lecture Series in Probability and Statistics Ching-Han Hsu, BMES, National Tsing Hua University c 2014 by Ching-Han Hsu, Ph.D., BMIR Lab 2.1 1 Sample Spaces and Events Random

### A Few Basics of Probability

A Few Basics of Probability Philosophy 57 Spring, 2004 1 Introduction This handout distinguishes between inductive and deductive logic, and then introduces probability, a concept essential to the study

### Intelligent Systems: Reasoning and Recognition. Uncertainty and Plausible Reasoning

Intelligent Systems: Reasoning and Recognition James L. Crowley ENSIMAG 2 / MoSIG M1 Second Semester 2015/2016 Lesson 12 25 march 2015 Uncertainty and Plausible Reasoning MYCIN (continued)...2 Backward

### Probability OPRE 6301

Probability OPRE 6301 Random Experiment... Recall that our eventual goal in this course is to go from the random sample to the population. The theory that allows for this transition is the theory of probability.

### Learning from Data: Naive Bayes

Semester 1 http://www.anc.ed.ac.uk/ amos/lfd/ Naive Bayes Typical example: Bayesian Spam Filter. Naive means naive. Bayesian methods can be much more sophisticated. Basic assumption: conditional independence.

### Welcome to Stochastic Processes 1. Welcome to Aalborg University No. 1 of 31

Welcome to Stochastic Processes 1 Welcome to Aalborg University No. 1 of 31 Welcome to Aalborg University No. 2 of 31 Course Plan Part 1: Probability concepts, random variables and random processes Lecturer:

### Section 7.1 First-Order Predicate Calculus predicate Existential Quantifier Universal Quantifier.

Section 7.1 First-Order Predicate Calculus Predicate calculus studies the internal structure of sentences where subjects are applied to predicates existentially or universally. A predicate describes a

### 15-381: Artificial Intelligence. Probabilistic Reasoning and Inference

5-38: Artificial Intelligence robabilistic Reasoning and Inference Advantages of probabilistic reasoning Appropriate for complex, uncertain, environments - Will it rain tomorrow? Applies naturally to many

### Probability for Estimation (review)

Probability for Estimation (review) In general, we want to develop an estimator for systems of the form: x = f x, u + η(t); y = h x + ω(t); ggggg y, ffff x We will primarily focus on discrete time linear

10-601 Machine Learning http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html Course data All up-to-date info is on the course web page: http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html

### Joint Probability Distributions and Random Samples. Week 5, 2011 Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage

5 Joint Probability Distributions and Random Samples Week 5, 2011 Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage Two Discrete Random Variables The probability mass function (pmf) of a single

### Probabilities and Random Variables

Probabilities and Random Variables This is an elementary overview of the basic concepts of probability theory. 1 The Probability Space The purpose of probability theory is to model random experiments so

### People have thought about, and defined, probability in different ways. important to note the consequences of the definition:

PROBABILITY AND LIKELIHOOD, A BRIEF INTRODUCTION IN SUPPORT OF A COURSE ON MOLECULAR EVOLUTION (BIOL 3046) Probability The subject of PROBABILITY is a branch of mathematics dedicated to building models

### 6.3 Conditional Probability and Independence

222 CHAPTER 6. PROBABILITY 6.3 Conditional Probability and Independence Conditional Probability Two cubical dice each have a triangle painted on one side, a circle painted on two sides and a square painted

### n logical not (negation) n logical or (disjunction) n logical and (conjunction) n logical exclusive or n logical implication (conditional)

Discrete Math Review Discrete Math Review (Rosen, Chapter 1.1 1.6) TOPICS Propositional Logic Logical Operators Truth Tables Implication Logical Equivalence Inference Rules What you should know about propositional

### Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Note 11

CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Note Conditional Probability A pharmaceutical company is marketing a new test for a certain medical condition. According

### Unit 19: Probability Models

Unit 19: Probability Models Summary of Video Probability is the language of uncertainty. Using statistics, we can better predict the outcomes of random phenomena over the long term from the very complex,

### These axioms must hold for all vectors ū, v, and w in V and all scalars c and d.

DEFINITION: A vector space is a nonempty set V of objects, called vectors, on which are defined two operations, called addition and multiplication by scalars (real numbers), subject to the following axioms

### Incorporating Evidence in Bayesian networks with the Select Operator

Incorporating Evidence in Bayesian networks with the Select Operator C.J. Butz and F. Fang Department of Computer Science, University of Regina Regina, Saskatchewan, Canada SAS 0A2 {butz, fang11fa}@cs.uregina.ca

### Section 6.1 Joint Distribution Functions

Section 6.1 Joint Distribution Functions We often care about more than one random variable at a time. DEFINITION: For any two random variables X and Y the joint cumulative probability distribution function

### Applications of Methods of Proof

CHAPTER 4 Applications of Methods of Proof 1. Set Operations 1.1. Set Operations. The set-theoretic operations, intersection, union, and complementation, defined in Chapter 1.1 Introduction to Sets are

### Math/Stats 425 Introduction to Probability. 1. Uncertainty and the axioms of probability

Math/Stats 425 Introduction to Probability 1. Uncertainty and the axioms of probability Processes in the real world are random if outcomes cannot be predicted with certainty. Example: coin tossing, stock

### Bayesian Updating with Discrete Priors Class 11, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom

1 Learning Goals Bayesian Updating with Discrete Priors Class 11, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom 1. Be able to apply Bayes theorem to compute probabilities. 2. Be able to identify

### MATH 10: Elementary Statistics and Probability Chapter 3: Probability Topics

MATH 10: Elementary Statistics and Probability Chapter 3: Probability Topics Tony Pourmohamad Department of Mathematics De Anza College Spring 2015 Objectives By the end of this set of slides, you should

### Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 17 Shannon-Fano-Elias Coding and Introduction to Arithmetic Coding

### Review. Bayesianism and Reliability. Today s Class

Review Bayesianism and Reliability Models and Simulations in Philosophy April 14th, 2014 Last Class: Difference between individual and social epistemology Why simulations are particularly useful for social

### CS 441 Discrete Mathematics for CS Lecture 2. Propositional logic. CS 441 Discrete mathematics for CS. Course administration

CS 441 Discrete Mathematics for CS Lecture 2 Propositional logic Milos Hauskrecht milos@cs.pitt.edu 5329 Sennott Square Course administration Homework 1 First homework assignment is out today will be posted

### The rule for computing conditional property can be interpreted different. In Question 2, P B

Question 4: What is the product rule for probability? The rule for computing conditional property can be interpreted different. In Question 2, P A and B we defined the conditional probability PA B. If

### Chapter 3: The basic concepts of probability

Chapter 3: The basic concepts of probability Experiment: a measurement process that produces quantifiable results (e.g. throwing two dice, dealing cards, at poker, measuring heights of people, recording

### Fundamentals of Mathematics Lecture 6: Propositional Logic

Fundamentals of Mathematics Lecture 6: Propositional Logic Guan-Shieng Huang National Chi Nan University, Taiwan Spring, 2008 1 / 39 Connectives Propositional Connectives I 1 Negation: (not A) A A T F

### Probability - Part I. Definition : A random experiment is an experiment or a process for which the outcome cannot be predicted with certainty.

Probability - Part I Definition : A random experiment is an experiment or a process for which the outcome cannot be predicted with certainty. Definition : The sample space (denoted S) of a random experiment

### STA 256: Statistics and Probability I

Al Nosedal. University of Toronto. Fall 2014 1 2 3 4 5 My momma always said: Life was like a box of chocolates. You never know what you re gonna get. Forrest Gump. Experiment, outcome, sample space, and

### EE 302 Division 1. Homework 5 Solutions.

EE 32 Division. Homework 5 Solutions. Problem. A fair four-sided die (with faces labeled,, 2, 3) is thrown once to determine how many times a fair coin is to be flipped: if N is the number that results

### Probability for AI students who have seen some probability before but didn t understand any of it

Probability for AI students who have seen some probability before but didn t understand any of it I am inflicting these proofs on you for two reasons: 1. These kind of manipulations will need to be second

### Attribution. Modified from Stuart Russell s slides (Berkeley) Parts of the slides are inspired by Dan Klein s lecture material for CS 188 (Berkeley)

Machine Learning 1 Attribution Modified from Stuart Russell s slides (Berkeley) Parts of the slides are inspired by Dan Klein s lecture material for CS 188 (Berkeley) 2 Outline Inductive learning Decision

### Probability. Vocabulary

MAT 142 College Mathematics Probability Module #PM Terri L. Miller & Elizabeth E. K. Jones revised January 5, 2011 Vocabulary In order to discuss probability we will need a fair bit of vocabulary. Probability

### Bayesian Networks. Read R&N Ch. 14.1-14.2. Next lecture: Read R&N 18.1-18.4

Bayesian Networks Read R&N Ch. 14.1-14.2 Next lecture: Read R&N 18.1-18.4 You will be expected to know Basic concepts and vocabulary of Bayesian networks. Nodes represent random variables. Directed arcs

### Bayesian Analysis for the Social Sciences

Bayesian Analysis for the Social Sciences Simon Jackman Stanford University http://jackman.stanford.edu/bass November 9, 2012 Simon Jackman (Stanford) Bayesian Analysis for the Social Sciences November

### A Tutorial on Probability Theory

Paola Sebastiani Department of Mathematics and Statistics University of Massachusetts at Amherst Corresponding Author: Paola Sebastiani. Department of Mathematics and Statistics, University of Massachusetts,

### Linear Threshold Units

Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear

### The Foundations: Logic and Proofs. Chapter 1, Part III: Proofs

The Foundations: Logic and Proofs Chapter 1, Part III: Proofs Rules of Inference Section 1.6 Section Summary Valid Arguments Inference Rules for Propositional Logic Using Rules of Inference to Build Arguments

### Chapter Chapter Goals. Assessing Probability. Important Terms. Events. Sample Space. Chapter 4 Basic Probability

Chapter 4 4- Chapter Goals Chapter 4 Basic Probability fter completing this chapter, you should be able to: Explain basic probability concepts and definitions Use contingency tables to view a sample space

### A set is a Many that allows itself to be thought of as a One. (Georg Cantor)

Chapter 4 Set Theory A set is a Many that allows itself to be thought of as a One. (Georg Cantor) In the previous chapters, we have often encountered sets, for example, prime numbers form a set, domains

### Probability II (MATH 2647)

Probability II (MATH 2647) Lecturer Dr. O. Hryniv email Ostap.Hryniv@durham.ac.uk office CM309 http://maths.dur.ac.uk/stats/courses/probmc2h/probability2h.html or via DUO This term we shall consider: Review

### Factoring. Factoring 1

Factoring Factoring 1 Factoring Security of RSA algorithm depends on (presumed) difficulty of factoring o Given N = pq, find p or q and RSA is broken o Rabin cipher also based on factoring Factoring like

### Sets and functions. {x R : x > 0}.

Sets and functions 1 Sets The language of sets and functions pervades mathematics, and most of the important operations in mathematics turn out to be functions or to be expressible in terms of functions.

### 3. Mathematical Induction

3. MATHEMATICAL INDUCTION 83 3. Mathematical Induction 3.1. First Principle of Mathematical Induction. Let P (n) be a predicate with domain of discourse (over) the natural numbers N = {0, 1,,...}. If (1)

### 4.6. section. Minors. Solution To find the minor for 2, delete the first row and first column of the matrix:

In this section 4.6 4.6 Cramer s Rule for Systems in Three Variables (4-39) 237 CRAMER S RULE FOR SYSTEMS IN THREE VARIABLES The solution of linear systems involving three variables using determinants

### Statistical Inference

Statistical Inference Idea: Estimate parameters of the population distribution using data. How: Use the sampling distribution of sample statistics and methods based on what would happen if we used this

### MATH 55: HOMEWORK #2 SOLUTIONS

MATH 55: HOMEWORK # SOLUTIONS ERIC PETERSON * 1. SECTION 1.5: NESTED QUANTIFIERS 1.1. Problem 1.5.8. Determine the truth value of each of these statements if the domain of each variable consists of all

### L10: Probability, statistics, and estimation theory

L10: Probability, statistics, and estimation theory Review of probability theory Bayes theorem Statistics and the Normal distribution Least Squares Error estimation Maximum Likelihood estimation Bayesian

### def: An axiom is a statement that is assumed to be true, or in the case of a mathematical system, is used to specify the system.

Section 1.5 Methods of Proof 1.5.1 1.5 METHODS OF PROOF Some forms of argument ( valid ) never lead from correct statements to an incorrect. Some other forms of argument ( fallacies ) can lead from true

### MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS Systems of Equations and Matrices Representation of a linear system The general system of m equations in n unknowns can be written a x + a 2 x 2 + + a n x n b a

### CONTINGENCY (CROSS- TABULATION) TABLES

CONTINGENCY (CROSS- TABULATION) TABLES Presents counts of two or more variables A 1 A 2 Total B 1 a b a+b B 2 c d c+d Total a+c b+d n = a+b+c+d 1 Joint, Marginal, and Conditional Probability We study methods

### The Classical and Relative Frequency Definitions of Probability

The Classical and Relative Frequency Definitions of Probability Jacco Thijssen January 2008 Let (S, A) be a sample space and a collection of events, i.e. subsets of S. As mentioned before, a probability

### max cx s.t. Ax c where the matrix A, cost vector c and right hand side b are given and x is a vector of variables. For this example we have x

Linear Programming Linear programming refers to problems stated as maximization or minimization of a linear function subject to constraints that are linear equalities and inequalities. Although the study

### Probability and statistics; Rehearsal for pattern recognition

Probability and statistics; Rehearsal for pattern recognition Václav Hlaváč Czech Technical University in Prague Faculty of Electrical Engineering, Department of Cybernetics Center for Machine Perception

### Chapter ML:IV. IV. Statistical Learning. Probability Basics Bayes Classification Maximum a-posteriori Hypotheses

Chapter ML:IV IV. Statistical Learning Probability Basics Bayes Classification Maximum a-posteriori Hypotheses ML:IV-1 Statistical Learning STEIN 2005-2015 Area Overview Mathematics Statistics...... Stochastics

### Probability Models.S1 Introduction to Probability

Probability Models.S1 Introduction to Probability Operations Research Models and Methods Paul A. Jensen and Jonathan F. Bard The stochastic chapters of this book involve random variability. Decisions are

### Section 1. Statements and Truth Tables. Definition 1.1: A mathematical statement is a declarative sentence that is true or false, but not both.

M3210 Supplemental Notes: Basic Logic Concepts In this course we will examine statements about mathematical concepts and relationships between these concepts (definitions, theorems). We will also consider

### Lecture 1 Introduction Properties of Probability Methods of Enumeration Asrat Temesgen Stockholm University

Lecture 1 Introduction Properties of Probability Methods of Enumeration Asrat Temesgen Stockholm University 1 Chapter 1 Probability 1.1 Basic Concepts In the study of statistics, we consider experiments

### Finite and discrete probability distributions

8 Finite and discrete probability distributions To understand the algorithmic aspects of number theory and algebra, and applications such as cryptography, a firm grasp of the basics of probability theory

### Business Statistics 41000: Probability 1

Business Statistics 41000: Probability 1 Drew D. Creal University of Chicago, Booth School of Business Week 3: January 24 and 25, 2014 1 Class information Drew D. Creal Email: dcreal@chicagobooth.edu Office:

### 1.4/1.5 Predicates and Quantifiers

1.4/1.5 Predicates and Quantifiers Can we express the following statement in propositional logic? Every computer has a CPU No Outline: Predicates Quantifiers Domain of Predicate Translation: Natural Languauge

### Conditional Probability, Hypothesis Testing, and the Monty Hall Problem

Conditional Probability, Hypothesis Testing, and the Monty Hall Problem Ernie Croot September 17, 2008 On more than one occasion I have heard the comment Probability does not exist in the real world, and