Intelligent Systems
Prof. dr. Paul De Bra
Technische Universiteit Eindhoven

Agents dealing with uncertainty
- Uncertainty
- Probability
- Syntax and Semantics
- Inference
- Independence and Bayes' Rule

Example: driving to the airport
Let A_t = leave for the airport t minutes before the flight. Will A_t get me there on time?
Problems:
1. partial observability (road state, other drivers' plans, etc.)
2. noisy sensors (traffic reports)
3. uncertainty in action outcomes (flat tire, engine failure, etc.)
4. immense complexity of modeling and predicting traffic
Hence a purely logical approach either
1. risks falsehood: "A_60 will get me there on time", or
2. leads to conclusions that are too weak for decision making: "A_60 will get me there on time if there's no accident on the highway, it doesn't rain, my tires remain intact, the parking lot isn't full, etc."
(A_180 will get me there on time even if I have to abandon my car along the way and walk, but I risk a very long wait before my flight.)

Why logic fails
Example: dental diagnosis
∀p Symptom(p, Toothache) ⇒ Disease(p, Cavity)
  This rule is wrong: a cavity is not the only cause of a toothache.
∀p Symptom(p, Toothache) ⇒ Disease(p, Cavity) ∨ Disease(p, GumDisease) ∨ Disease(p, Abscess) ∨ ...
  Unfortunately the causes of toothache are almost unlimited.
∀p Disease(p, Cavity) ⇒ Symptom(p, Toothache)
  This is also not true (after a root canal you do not feel a cavity).
Three reasons why (first-order) logic fails:
- Laziness: too much work to write all the rules that apply
- Theoretical ignorance: medical science is largely incomplete
- Practical ignorance: you cannot run all the tests needed to be certain

Reasoning with uncertainty is not fuzzy logic
We deal with sentences that are either true or false (in the real world): the patient either has a cavity or has no cavity.
We may use belief or probability: we are 80% certain that the patient has a cavity; it is not "80% true" that the patient has a cavity.
In fuzzy logic a sentence can be partially true: truth is a value between 0 and 1, with
  T(A ∧ B) = min(T(A), T(B)), T(A ∨ B) = max(T(A), T(B)), T(¬A) = 1 − T(A)
We deal with uncertainty, not with fuzzy logic.

Methods for handling uncertainty
- Default or nonmonotonic logic:
  Assume my car does not have a flat tire.
  Assume A_60 works unless contradicted by evidence.
  Issues: Which assumptions are reasonable? How to handle contradiction?
- Rules with "fudge factors":
  A_60 → get there on time (0.3)
  Sprinkler → WetGrass (0.99)
  WetGrass → Rain (0.7)
  Issues: problems with combination, e.g., does Sprinkler cause Rain?
- Probability:
  Model the agent's degree of belief.
  Given the available evidence, A_45 will get me there on time with probability ...
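To make the degree-of-belief idea concrete, here is a minimal Monte Carlo sketch of how such probabilities could be estimated; the travel-time model (a Gaussian drive time plus a rare long incident) is entirely hypothetical, chosen only to illustrate why "A_60 works" is neither simply true nor simply false.

    import random

    def p_on_time(leave_minutes, trials=100_000):
        """Estimate P(A_t gets me there on time) under a made-up travel model."""
        on_time = 0
        for _ in range(trials):
            drive = random.gauss(40, 10)    # assumption: usual drive ~ N(40, 10) minutes
            if random.random() < 0.05:      # assumption: 5% chance of a major incident
                drive += 60                 # flat tire, accident on the highway, ...
            if drive <= leave_minutes:
                on_time += 1
        return on_time / trials

    for t in (45, 60, 90, 270):
        print(f"P(A_{t} on time) is roughly {p_on_time(t):.2f}")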

Nonmonotonic logics
Circumscription (closed world assumption) uses predicates that are false for every object unless known to be true (for abnormal objects):
  Bird(x) ∧ ¬Abnormal(x) ⇒ Flies(x)
This is model preference logic: entailment when true in all preferred models of the KB.
Default logic uses default rules:
  Bird(x) : Flies(x) / Flies(x)
If Bird(x) is true, and Flies(x) is consistent with the KB, then Flies(x) is concluded by default.
In general: P : J1, ..., Jn / C, where P is the prerequisite, C is the conclusion, and the Ji are justifications.
We are going to use probability instead of nonmonotonic logics.

Probability
Probabilistic assertions summarize the effects of
- laziness: failure to enumerate exceptions, qualifications, etc.
- ignorance: lack of relevant facts, initial conditions, etc.
Subjective probability: probabilities relate propositions to the agent's own state of knowledge,
e.g., P(A_45 | no reported accidents) = 0.05
These are not assertions about the world.
Probabilities of propositions change with new evidence:
e.g., P(A_45 | no reported accidents, 5 a.m.) = ...

Making decisions under uncertainty
Suppose I believe the following:
  P(A_45 gets me there on time) = 0.05
  P(A_60 gets me there on time) = 0.70
  P(A_90 gets me there on time) = 0.95
  P(A_270 gets me there on time) = ...
Which action to choose? That depends on my preferences for missing the flight vs. time spent waiting, etc.
Utility theory is used to represent and infer preferences.
Decision theory = probability theory + utility theory

Syntax
Basic element: the random variable.
Similar to propositional logic: possible worlds are defined by an assignment of values to random variables.
Three kinds: Boolean, discrete and continuous random variables,
e.g., Cavity (true/false), Weather (sunny, rainy, cloudy, snow), time, speed, size.
Domain values must be exhaustive and mutually exclusive.
An elementary proposition is constructed by assigning a value to a random variable, e.g., Weather = sunny, Cavity = false.
Complex propositions are formed from elementary propositions and standard logical connectives, e.g., Weather = sunny ∨ Cavity = false.

Probability of continuous variables
With continuous variables the probability of any specific value is always 0.
We use a probability density function instead: assume X is distributed uniformly between 5 and 15. Then
  P(X = 10.0) = U[5,15](10.0) = 0.1
This does not mean the chance that X will be 10.0 is 0.1. It means:
  lim_{dx→0} P(10.0 − dx ≤ X ≤ 10.0 + dx) / (2 dx) = 0.1

Syntax: atomic events
Atomic event: a complete specification of the state of the world about which the agent is uncertain.
E.g., if the world consists of only two Boolean variables Cavity and Toothache, there are 4 distinct atomic events:
  Cavity = false ∧ Toothache = false
  Cavity = false ∧ Toothache = true
  Cavity = true ∧ Toothache = false
  Cavity = true ∧ Toothache = true
Atomic events are mutually exclusive and exhaustive.
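A quick sketch of the atomic-event idea from the last slide: with two Boolean variables, itertools.product enumerates the 2^2 = 4 mutually exclusive, exhaustive worlds, and any probability assignment over them must sum to 1 (the numbers below are placeholders, not values from the slides).

    from itertools import product

    variables = ("Cavity", "Toothache")
    worlds = list(product((False, True), repeat=len(variables)))
    print(len(worlds))    # -> 4 distinct atomic events

    # A joint distribution assigns one number per atomic event; because the
    # events are exhaustive and mutually exclusive, the numbers sum to 1.
    joint = dict(zip(worlds, (0.72, 0.08, 0.12, 0.08)))    # placeholder values
    assert abs(sum(joint.values()) - 1.0) < 1e-9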

Axioms of probability
For any propositions A, B:
1. 0 ≤ P(A) ≤ 1
2. P(true) = 1 and P(false) = 0
3. P(A ∨ B) = P(A) + P(B) − P(A ∧ B)
Exercise: show that P(¬A) = 1 − P(A) using the above axioms. This exercise can easily be extended to the discrete case.
Exercise: why is it impossible that P(A) = 0.4, P(B) = 0.3, P(A ∧ B) = 0.0 and P(A ∨ B) = ...?

Prior probability
Prior or unconditional probabilities of propositions,
e.g., P(Cavity = true) = 0.25 and P(Weather = sunny) = 0.72,
correspond to belief prior to the arrival of any (new) evidence.
A probability distribution gives values for all possible assignments:
  P(Weather) = <0.72, 0.1, 0.08, 0.1> (normalized, i.e., sums to 1)
A joint probability distribution for a set of random variables gives the probability of every atomic event on those random variables.
P(Weather, Cavity) = a 4 × 2 matrix of values:

                      Weather = sunny   rainy   cloudy   snow
    Cavity = true         ...            ...     ...      ...
    Cavity = false        ...            ...     ...      ...

Every question about a domain can be answered by the joint distribution.

Conditional probability
Conditional or posterior probabilities,
e.g., P(cavity | toothache) = 0.8, i.e., given that toothache is all I know.
(Notation for conditional distributions: P(Cavity | Toothache) = a 2-element vector of 2-element vectors.)
If we know more, e.g., cavity is also given, then we have P(cavity | toothache, cavity) = 1.
New evidence may be irrelevant, allowing simplification, e.g.,
  P(cavity | toothache, sunny) = P(cavity | toothache) = 0.8
This kind of inference, sanctioned by domain knowledge, is crucial.

Conditional probability
Definition of conditional probability:
  P(a | b) = P(a ∧ b) / P(b)   if P(b) > 0
The product rule gives an alternative formulation:
  P(a ∧ b) = P(a | b) P(b) = P(b | a) P(a)
A general version holds for whole distributions, e.g.,
  P(Weather, Cavity) = P(Weather | Cavity) P(Cavity)
(view this as a set of 4 × 2 equations, not matrix multiplication).
The chain rule is derived by successive application of the product rule:
  P(X_1, ..., X_n) = P(X_1, ..., X_{n-1}) P(X_n | X_1, ..., X_{n-1})
                   = P(X_1, ..., X_{n-2}) P(X_{n-1} | X_1, ..., X_{n-2}) P(X_n | X_1, ..., X_{n-1})
                   = ...
                   = ∏_{i=1}^{n} P(X_i | X_1, ..., X_{i-1})

Inference by enumeration
Start with the full joint probability distribution:

                  toothache              ¬toothache
                  catch     ¬catch       catch     ¬catch
    cavity        0.108     0.012        0.072     0.008
    ¬cavity       0.016     0.064        0.144     0.576

For any proposition φ, sum the atomic events where it is true:
  P(φ) = Σ_{ω : ω ⊨ φ} P(ω)
  P(toothache ∨ cavity) = 0.108 + 0.012 + 0.072 + 0.008 + 0.016 + 0.064 = 0.28
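The enumeration rule P(φ) = Σ_{ω ⊨ φ} P(ω) can be run directly on the dentist joint distribution; a minimal sketch using the table values above:

    # Full joint distribution over (cavity, toothache, catch)
    joint = {
        (True,  True,  True):  0.108, (True,  True,  False): 0.012,
        (True,  False, True):  0.072, (True,  False, False): 0.008,
        (False, True,  True):  0.016, (False, True,  False): 0.064,
        (False, False, True):  0.144, (False, False, False): 0.576,
    }

    def prob(holds):
        """P(phi): sum the atomic events in which the proposition holds."""
        return sum(p for world, p in joint.items() if holds(*world))

    print(round(prob(lambda cavity, toothache, catch: toothache or cavity), 3))  # 0.28
    print(round(prob(lambda cavity, toothache, catch: toothache), 3))            # 0.2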

Start with the joint probability distribution. For any proposition φ, sum the atomic events where it is true: P(φ) = Σ_{ω : ω ⊨ φ} P(ω)
  P(toothache) = 0.108 + 0.012 + 0.016 + 0.064 = 0.2
This is also called the marginal probability of toothache.

We can also compute conditional probabilities:
  P(¬cavity | toothache) = P(¬cavity ∧ toothache) / P(toothache)
                         = (0.016 + 0.064) / 0.2 = 0.4

Normalization
The denominator can be viewed as a normalization constant α:
  P(Cavity | toothache) = α P(Cavity, toothache)
    = α [P(Cavity, toothache, catch) + P(Cavity, toothache, ¬catch)]
    = α [<0.108, 0.016> + <0.012, 0.064>]
    = α <0.12, 0.08> = <0.6, 0.4>
General idea: compute the distribution on the query variable by fixing the evidence variables and summing over the hidden variables.

Inference by enumeration, contd.
Typically, we are interested in the posterior joint distribution of the query variable(s) X given specific values e for the evidence variables E.
Let the remaining unobserved (hidden) variables be Y. The required summation of joint entries is then done by summing out the hidden variables:
  P(X | E = e) = α P(X, E = e) = α Σ_y P(X, E = e, Y = y)
The terms in the summation are joint entries because X, E and Y together exhaust the set of random variables.
Obvious problems:
1. worst-case time complexity O(d^n), where d is the largest arity
2. space complexity O(d^n) to store the joint distribution
3. how to find the numbers for the O(d^n) entries?

Independence
A and B are independent iff
  P(A | B) = P(A) or P(B | A) = P(B) or P(A ∧ B) = P(A) P(B)
E.g., P(Toothache, Catch, Cavity, Weather) = P(Toothache, Catch, Cavity) P(Weather):
32 entries reduced to 12; for n independent biased coins, O(2^n) → O(n).
Absolute independence is powerful but rare.
Dentistry is a large field with hundreds of variables, none of which are independent. What to do?

Conditional independence
P(Toothache, Cavity, Catch) has 2^3 − 1 = 7 independent entries.
If I have a cavity, the probability that the probe catches in it doesn't depend on whether I have a toothache:
  (1) P(catch | toothache, cavity) = P(catch | cavity)
The same independence holds if I haven't got a cavity:
  (2) P(catch | toothache, ¬cavity) = P(catch | ¬cavity)
Catch is conditionally independent of Toothache given Cavity:
  P(Catch | Toothache, Cavity) = P(Catch | Cavity)
Equivalent statements:
  P(Toothache | Catch, Cavity) = P(Toothache | Cavity)
  P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity)
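Both the normalization step and the conditional-independence claims can be checked mechanically against the same joint table; a minimal sketch (repeating the joint from the previous snippet so it runs standalone):

    # Full joint distribution over (cavity, toothache, catch), as before
    joint = {
        (True,  True,  True):  0.108, (True,  True,  False): 0.012,
        (True,  False, True):  0.072, (True,  False, False): 0.008,
        (False, True,  True):  0.016, (False, True,  False): 0.064,
        (False, False, True):  0.144, (False, False, False): 0.576,
    }

    def prob(holds):
        return sum(p for world, p in joint.items() if holds(*world))

    def cond(a, b):
        """P(a | b) = P(a and b) / P(b)."""
        return prob(lambda *w: a(*w) and b(*w)) / prob(b)

    # Normalization slide: P(cavity | toothache) = 0.6
    print(round(cond(lambda c, t, k: c, lambda c, t, k: t), 3))        # 0.6
    # Conditional independence (1): both sides equal 0.9
    print(round(cond(lambda c, t, k: k, lambda c, t, k: t and c), 3))  # 0.9
    print(round(cond(lambda c, t, k: k, lambda c, t, k: c), 3))        # 0.9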

Conditional independence contd.
Write out the full joint distribution using the chain rule:
  P(Toothache, Catch, Cavity)
    = P(Toothache | Catch, Cavity) P(Catch, Cavity)
    = P(Toothache | Catch, Cavity) P(Catch | Cavity) P(Cavity)
    = P(Toothache | Cavity) P(Catch | Cavity) P(Cavity)
i.e., 2 + 2 + 1 = 5 independent numbers.
In most cases, the use of conditional independence reduces the size of the representation of the joint distribution from exponential in n to linear in n.
Conditional independence is our most basic and robust form of knowledge about uncertain environments.

Bayes' Rule
Product rule: P(a ∧ b) = P(a | b) P(b) = P(b | a) P(a)
Bayes' rule:
  P(a | b) = P(b | a) P(a) / P(b)
or in distribution form:
  P(Y | X) = P(X | Y) P(Y) / P(X) = α P(X | Y) P(Y)
Useful for assessing diagnostic probability from causal probability:
  P(Cause | Effect) = P(Effect | Cause) P(Cause) / P(Effect)
E.g., let m be meningitis and s be a stiff neck:
  P(m | s) = P(s | m) P(m) / P(s) = 0.8 × 0.0001 / 0.1 = 0.0008
Note: the posterior probability of meningitis is still very small!

Bayes' Rule and conditional independence
  P(Cavity | toothache ∧ catch)
    = α P(toothache ∧ catch | Cavity) P(Cavity)
    = α P(toothache | Cavity) P(catch | Cavity) P(Cavity)
This is an example of a naïve Bayes model:
  P(Cause, Effect_1, ..., Effect_n) = P(Cause) ∏_i P(Effect_i | Cause)
The total number of parameters is linear in n.

The Wumpus World Revisited
Propositional logic cannot solve all cases.
A breeze in [1,2] and [2,1]: which move to make? (No move is safe.)
Calculate the best move, knowing that each square except [1,1] contains a pit with probability 0.2.
Can you guess P(p1,3), P(p2,2) and P(p3,1)?
[grid figure: visited squares [1,1], [1,2], [2,1]; breeze observed in [1,2] and [2,1]; candidate pit squares [1,3], [2,2], [3,1]]

The Wumpus World Revisited (cont.)
We do not need the full joint distribution, because all the other squares do not matter.
Let us query [1,3]. Let fringe be the other squares adjacent to visited squares.
known is ¬p1,1 ∧ ¬p1,2 ∧ ¬p2,1
evidence b is ¬b1,1 ∧ b1,2 ∧ b2,1
The probability of pits in a pair of fringe squares is 0.2 × 0.2 = 0.04;
the probability of a pit in one square of a pair and not in the other is 0.2 × 0.8 = 0.16.
  P(p1,3 | known, b) = α <0.2 (0.04 + 0.16 + 0.16), 0.8 (0.04 + 0.16)> ≈ <0.31, 0.69>
Likewise P(p2,2 | known, b) ≈ <0.86, 0.14>,
so moving to [2,2] is a very bad decision. (A short enumeration sketch of this calculation follows after the next slide.)

Example: Burglary Alarm
Two types of event set off the alarm. Two people might call when the alarm goes off.
[Bayesian network figure: Burglary → Alarm ← Earthquake, with Alarm → JohnCalls and Alarm → MaryCalls]
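The Wumpus numbers above come from enumerating pit assignments to the three unknown squares and keeping only those consistent with the observed breezes; a minimal sketch that reproduces <0.31, 0.69>:

    from itertools import product

    PIT = 0.2    # prior pit probability for each square except [1,1]

    def prior(*pits):
        """Prior of a joint pit assignment (pits are independent)."""
        p = 1.0
        for has_pit in pits:
            p *= PIT if has_pit else 1 - PIT
        return p

    # Breeze in [1,2] needs a pit in [1,3] or [2,2]; breeze in [2,1] needs
    # a pit in [2,2] or [3,1]. Sum the consistent worlds for each value of p1,3.
    unnorm = {True: 0.0, False: 0.0}
    for p13, p22, p31 in product((True, False), repeat=3):
        if (p13 or p22) and (p22 or p31):          # evidence b holds
            unnorm[p13] += prior(p13, p22, p31)

    alpha = 1 / sum(unnorm.values())
    print({k: round(alpha * v, 2) for k, v in unnorm.items()})
    # -> {True: 0.31, False: 0.69}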

Order is important in a Bayesian Network
The network becomes very strange if we start from the other end.
[figure: the same domain modeled by adding the calls first and the causes last, which requires more links]

Summary
- Probability is a rigorous formalism for uncertain knowledge.
- The joint probability distribution specifies the probability of every atomic event.
- Queries can be answered by summing over atomic events.
- For nontrivial domains, we must find a way to reduce the size of the joint distribution.
- Independence and conditional independence provide the tools.
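To connect the burglary-alarm network to the summary point that queries can be answered by summing over atomic events, here is a sketch that encodes the network's factored joint and computes P(Burglary | johncalls, marycalls) by enumeration. The conditional probability tables are the standard Russell & Norvig values, assumed here because the slide's figure is not reproduced in this transcription.

    from itertools import product

    # Assumed CPTs (standard Russell & Norvig numbers)
    P_B = 0.001                        # P(Burglary)
    P_E = 0.002                        # P(Earthquake)
    P_A = {(True, True): 0.95, (True, False): 0.94,
           (False, True): 0.29, (False, False): 0.001}    # P(Alarm | B, E)
    P_J = {True: 0.90, False: 0.05}    # P(JohnCalls | Alarm)
    P_M = {True: 0.70, False: 0.01}    # P(MaryCalls | Alarm)

    def joint(b, e, a, j, m):
        """Factored joint: P(b) P(e) P(a|b,e) P(j|a) P(m|a)."""
        p = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
        p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
        p *= P_J[a] if j else 1 - P_J[a]
        p *= P_M[a] if m else 1 - P_M[a]
        return p

    # P(Burglary | johncalls, marycalls): sum out Earthquake and Alarm, normalize
    unnorm = {b: sum(joint(b, e, a, True, True)
                     for e, a in product((True, False), repeat=2))
              for b in (True, False)}
    alpha = 1 / sum(unnorm.values())
    print({b: round(alpha * p, 3) for b, p in unnorm.items()})
    # -> {True: 0.284, False: 0.716}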
