Probability, Conditional Independence


June 19, 2012

Probability
- Sample space Ω of events
- Each event ω ∈ Ω has an associated measure, the probability of the event P(ω)
- Axioms of probability:
  - ∀ω, P(ω) ∈ [0, 1]
  - P(Ω) = 1
  - P(ω₁ ∪ ω₂) = P(ω₁) + P(ω₂) − P(ω₁ ∩ ω₂)
- Note: we are being informal

Random Variables
- Random variables are mappings of events to real numbers: X : Ω → ℝ
- Any event ω maps to X(ω)
- Example: tossing a coin has two possible outcomes, denoted {H, T} or {0, 1}
  - A fair coin has uniform probabilities: P(X = 0) = 1/2, P(X = 1) = 1/2
- Random variables (r.v.s) can be
  - Discrete, e.g., Bernoulli
  - Continuous, e.g., Gaussian
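A minimal simulation sketch (not from the slides): drawing from the fair-coin r.v. X and checking that the empirical frequencies approach P(X = 0) = P(X = 1) = 1/2.

import random

samples = [random.randint(0, 1) for _ in range(100_000)]
print(sum(x == 0 for x in samples) / len(samples))  # ~0.5
print(sum(x == 1 for x in samples) / len(samples))  # ~0.5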

Distribution, Density
- For a continuous r.v.:
  - Distribution function F(x) = P(X ≤ x)
  - Corresponding density function f(x) = dF(x)/dx
  - Note that F(x) = ∫_{−∞}^{x} f(t) dt
- For a discrete r.v.:
  - Probability mass function f(x) = P(X = x) = P(x); we will call this the probability of a discrete event
  - Distribution function F(x) = P(X ≤ x)
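A quick numerical illustration of the relationship f(x) = dF(x)/dx, sketched with the standard Gaussian (an assumed choice; the slides name the Gaussian only as an example of a continuous r.v.):

import math

def F(x):  # standard normal distribution function, via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def f(x):  # standard normal density
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

x, h = 0.7, 1e-6
print((F(x + h) - F(x - h)) / (2 * h))  # central difference of F at x
print(f(x))                             # agrees closely with the line above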

Joint Distributions, Marginals
- For two continuous r.v.s X₁, X₂:
  - Joint distribution F(x₁, x₂) = P(X₁ ≤ x₁, X₂ ≤ x₂)
  - Joint density function f(x₁, x₂) can be defined as before
  - Marginal probability density f(x₁) = ∫_{−∞}^{∞} f(x₁, x₂) dx₂
- For two discrete r.v.s X₁, X₂:
  - Joint probability f(x₁, x₂) = P(X₁ = x₁, X₂ = x₂) = P(x₁, x₂)
  - Marginal probability P(X₁ = x₁) = Σ_{x₂} P(X₁ = x₁, X₂ = x₂)
- Can be extended to joint distributions over several r.v.s
- Many hard problems involve computing marginals
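Computing a discrete marginal is just the sum above; a small sketch, using two fair dice as an assumed joint table:

joint = {(x1, x2): 1 / 36 for x1 in range(1, 7) for x2 in range(1, 7)}

marginal = {}
for (x1, x2), p in joint.items():
    marginal[x1] = marginal.get(x1, 0.0) + p   # sum out x2

print(marginal)   # P(X1 = x1) = 1/6 for each face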

Expectation
- The expected value of an r.v. X:
  - For a continuous r.v., E[X] = ∫ x p(x) dx
  - For a discrete r.v., E[X] = Σᵢ xᵢ pᵢ
- Expectation is a linear operator: E[aX + bY + c] = a E[X] + b E[Y] + c
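A quick empirical check of linearity, E[aX + bY + c] = a E[X] + b E[Y] + c, with two assumed fair dice:

import random

a, b, c = 2.0, -3.0, 5.0
xs = [random.randint(1, 6) for _ in range(200_000)]
ys = [random.randint(1, 6) for _ in range(200_000)]

lhs = sum(a * x + b * y + c for x, y in zip(xs, ys)) / len(xs)
rhs = a * sum(xs) / len(xs) + b * sum(ys) / len(ys) + c
print(lhs, rhs)   # both ~ 2(3.5) - 3(3.5) + 5 = 1.5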

Independence
- Joint probability P(X₁ = x₁, X₂ = x₂)
  - X₁, X₂ are different dice
  - X₁ denotes whether the grass is wet, X₂ denotes whether the sprinkler was on
- Two r.v.s are independent if P(X₁ = x₁, X₂ = x₂) = P(X₁ = x₁) P(X₂ = x₂)
  - Two different dice are independent
  - If the sprinkler was on, then the grass will be wet: dependent
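The definition can be tested directly on a joint table. A sketch below checks the factorization for both examples; the wet-grass/sprinkler numbers are made-up illustrations, not slide values.

def is_independent(joint, tol=1e-9):
    # joint: dict mapping (x1, x2) -> probability
    p1, p2 = {}, {}
    for (x1, x2), p in joint.items():
        p1[x1] = p1.get(x1, 0.0) + p
        p2[x2] = p2.get(x2, 0.0) + p
    return all(abs(p - p1[x1] * p2[x2]) < tol
               for (x1, x2), p in joint.items())

dice = {(i, j): 1 / 36 for i in range(1, 7) for j in range(1, 7)}
grass = {("wet", "on"): 0.40, ("wet", "off"): 0.10,
         ("dry", "on"): 0.05, ("dry", "off"): 0.45}

print(is_independent(dice))    # True: the dice factorize
print(is_independent(grass))   # False: sprinkler and wet grass are dependent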

Conditional Probability, Bayes Rule
- (2×2 joint probability table over Sprinkler On/Off and Grass Wet/Dry; entries not shown)
- Inference problems:
  - Given grass wet, what is P(sprinkler on | grass wet)?
  - Given a symptom, what is P(disease | symptom)?
- For any r.v.s X, Y, the conditional probability P(x | y) = P(x, y) / P(y)
- Since P(x, y) = P(y | x) P(x), we have P(y | x) = P(x | y) P(y) / P(x)
- Expressing the posterior in terms of the conditional: Bayes Rule
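A worked sketch of the first inference problem, P(sprinkler on | grass wet) = P(on, wet) / P(wet); the joint entries are assumptions standing in for the table's lost numbers.

joint = {("wet", "on"): 0.40, ("wet", "off"): 0.10,
         ("dry", "on"): 0.05, ("dry", "off"): 0.45}

p_wet = joint[("wet", "on")] + joint[("wet", "off")]
print(joint[("wet", "on")] / p_wet)   # P(sprinkler on | grass wet) = 0.8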

19 Product Rule & Independence Product Rule: For X 1, X 2, P(X 1, X 2 ) = P(X 1 )P(X 2 X 1 ) For X 1, X 2, X 3, P(X 1, X 2, X 3 ) = P(X 1 )P(X 2 X 1 )P(X 3 X 1, X 2 ) In general, the chain rule n P(X 1,, X n ) = P(X i X 1,..., X i 1 ) i=1 Joint distribution of n Boolean variables Specification requires 2 n 1 parameters Recall Independence: For X 1, X 2, P(X 1, X 2 ) = P(X 1 )P(X 2 ) In general n P(X 1,, X n ) = P(X i ) i=1 Independence reduces specification to n parameters Probability, Conditional Independence
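The chain rule holds for any joint distribution; a sketch that verifies it on a randomly generated joint over three binary variables:

import itertools, random

vals = list(itertools.product([0, 1], repeat=3))
w = [random.random() for _ in vals]
P = {v: wi / sum(w) for v, wi in zip(vals, w)}   # a random joint

def marg(prefix):
    # P of a partial assignment to the first len(prefix) variables
    return sum(p for v, p in P.items() if v[:len(prefix)] == prefix)

for x1, x2, x3 in vals:
    chain = (marg((x1,)) *
             marg((x1, x2)) / marg((x1,)) *        # P(x2 | x1)
             P[(x1, x2, x3)] / marg((x1, x2)))     # P(x3 | x1, x2)
    assert abs(chain - P[(x1, x2, x3)]) < 1e-12
print("chain rule verified on all 8 assignments")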

Independence
- Consider 4 variables: Toothache, Catch, Cavity, Weather
- (diagram: the network over Cavity, Toothache, Catch, Weather decomposes into {Cavity, Toothache, Catch} and {Weather})
- Independence implies P(Toothache, Catch, Cavity, Weather) = P(Toothache, Catch, Cavity) P(Weather)
- In terms of the joint distribution: 32 parameters reduced to 12
- For Boolean variables, 2ⁿ − 1 reduces to n
- Absolute independence is powerful but rare

Conditional Independence
- X and Y are conditionally independent given Z if P(X, Y | Z) = P(X | Z) P(Y | Z)
- Example: P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity)
- Conditional independence simplifies joint distributions: P(X, Y, Z) = P(Z) P(X | Z) P(Y | Z)
- Often reduces the representation from exponential to linear in n
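A sketch that builds a joint as P(Z) P(X | Z) P(Y | Z) and confirms the conditional independence; all table values are illustrative assumptions.

pZ = {0: 0.3, 1: 0.7}
pXgZ = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}   # pXgZ[z][x] = P(x | z)
pYgZ = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.5, 1: 0.5}}   # pYgZ[z][y] = P(y | z)

joint = {(x, y, z): pZ[z] * pXgZ[z][x] * pYgZ[z][y]
         for x in (0, 1) for y in (0, 1) for z in (0, 1)}

for z in (0, 1):
    pz = sum(p for (_, _, zz), p in joint.items() if zz == z)
    for x in (0, 1):
        for y in (0, 1):
            pxy = joint[(x, y, z)] / pz                        # P(x, y | z)
            px = sum(joint[(x, yy, z)] for yy in (0, 1)) / pz  # P(x | z)
            py = sum(joint[(xx, y, z)] for xx in (0, 1)) / pz  # P(y | z)
            assert abs(pxy - px * py) < 1e-12
print("P(X, Y | Z) = P(X | Z) P(Y | Z) holds for every z")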

Naive Bayes Model
- If X₁, …, Xₙ are independent given Y:
  P(Y, X₁, …, Xₙ) = P(Y) ∏ᵢ₌₁ⁿ P(Xᵢ | Y)
- Example: P(Cavity, Toothache, Catch) = P(Cavity) P(Toothache | Cavity) P(Catch | Cavity)
- More generally:
  P(Cause, Effect₁, …, Effectₙ) = P(Cause) ∏ᵢ₌₁ⁿ P(Effectᵢ | Cause)
- (diagram: Cause with children Effect₁, …, Effectₙ; here, Cavity with children Toothache and Catch)
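A small posterior computation under the naive Bayes factorization above; the CPT numbers are illustrative assumptions, not slide values.

p_cavity = {True: 0.2, False: 0.8}
p_toothache = {True: 0.6, False: 0.1}   # P(toothache = true | cavity)
p_catch = {True: 0.9, False: 0.2}       # P(catch = true | cavity)

def posterior(toothache, catch):
    # P(Cavity = true | toothache, catch), via the naive Bayes joint
    score = {}
    for cav in (True, False):
        pt = p_toothache[cav] if toothache else 1 - p_toothache[cav]
        pc = p_catch[cav] if catch else 1 - p_catch[cav]
        score[cav] = p_cavity[cav] * pt * pc
    return score[True] / (score[True] + score[False])

print(posterior(toothache=True, catch=True))   # ~0.87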

Bayesian networks
- A simple, graphical notation for conditional independence assertions, and hence for compact specification of full joint distributions
- Syntax:
  - A set of nodes, one per variable
  - A directed, acyclic graph (a link implies direct influence)
  - A conditional distribution for each node given its parents
- Conditional distributions:
  - For each Xᵢ, P(Xᵢ | Parents(Xᵢ))
  - In the form of a conditional probability table (CPT): the distribution over Xᵢ for each combination of parent values

Example
- Topology of the network encodes conditional independence assertions
- (diagram: Weather stands alone; Cavity has children Toothache and Catch)
- Weather is independent of the other variables
- Toothache and Catch are conditionally independent given Cavity

Example
- "I'm at work, neighbor John calls to say my alarm is ringing, but neighbor Mary doesn't call. Sometimes it's set off by minor earthquakes. Is there a burglar?"
- Variables: Burglar, Earthquake, Alarm, JohnCalls, MaryCalls
- Network topology reflects causal knowledge:
  - A burglar can set the alarm off
  - An earthquake can set the alarm off
  - The alarm can cause Mary to call
  - The alarm can cause John to call

Example (Contd.)

  Burglary: P(B) = .001          Earthquake: P(E) = .002

  Alarm, P(A | B, E):       JohnCalls, P(J | A):    MaryCalls, P(M | A):
    B  E | P(A|B,E)           A | P(J|A)              A | P(M|A)
    T  T |  .95                T |  .90                T |  .70
    T  F |  .94                F |  .05                F |  .01
    F  T |  .29
    F  F |  .001

Compactness
- (diagram: the burglary network, B and E → A → J and M)
- A CPT for a Boolean Xᵢ with k Boolean parents has 2ᵏ rows, one for each combination of parent values
- Each row requires one number (the probability of the other value is one minus it)
- If each variable has no more than k parents, the complete network requires O(n · 2ᵏ) numbers
  - Grows linearly with n
  - The full joint distribution requires O(2ⁿ)
- Example: burglary network
  - Full joint distribution requires 2⁵ − 1 = 31 numbers
  - Bayes net requires 1 + 1 + 4 + 2 + 2 = 10 numbers

Global semantics
- The full joint distribution can be written as a product of local conditionals:
  P(X₁, …, Xₙ) = ∏ᵢ₌₁ⁿ P(Xᵢ | Parents(Xᵢ))
- Example: P(j, m, a, ¬b, ¬e) = P(¬b) P(¬e) P(a | ¬b, ¬e) P(j | a) P(m | a)
- Example: P(j, ¬m, a, b, ¬e) = P(b) P(¬e) P(a | b, ¬e) P(j | a) P(¬m | a)
- Can we compute P(b | j, m)?
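A sketch of the first example as a product of CPT entries, using the burglary network numbers assumed in the table above:

P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(a = true | b, e)
P_J = {True: 0.90, False: 0.05}                      # P(j = true | a)
P_M = {True: 0.70, False: 0.01}                      # P(m = true | a)

p = (1 - P_B) * (1 - P_E) * P_A[(False, False)] * P_J[True] * P_M[True]
print(p)   # P(j, m, a, ¬b, ¬e) ≈ 0.00063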

Local semantics
- Each node is conditionally independent of its nondescendants given its parents
- (diagram: node X with parents U₁, …, Uₘ, children Y₁, …, Yₙ, and the children's other parents Z₁ⱼ, …, Zₙⱼ)

Markov blanket
- Each node is conditionally independent of all others given its Markov blanket: its parents, children, and children's parents
- (same diagram: the Markov blanket of X is U₁, …, Uₘ, Y₁, …, Yₙ, and Z₁ⱼ, …, Zₙⱼ)

Conditional Independence in BNs
- Which BNs support x₁ ⊥ x₂ | x₃? (four candidate graphs (a)-(d), not shown)
- For (a), x₁ and x₂ are dependent given x₃, since x₃ is a collider
- For (b)-(d), x₁ ⊥ x₂ | x₃

Conditional Independence (Contd.)
- Which BNs support x ⊥ y | z? (candidate graphs (a)-(d), not shown)
- For (a)-(b), z is not a collider, so x ⊥ y | z
- For (c), z is a collider, so x and y are conditionally dependent given z
- For (d), w is a collider and z is a descendant of w, so x and y are conditionally dependent given z

d-connection, d-separation
- Definition (d-connection): Let X, Y, Z be disjoint sets of vertices in a directed graph G. X and Y are d-connected by Z iff there exists an undirected path U between some x ∈ X and y ∈ Y such that:
  - for every collider C on U, either C or a descendant of C is in Z, and
  - no non-collider on U is in Z
- Otherwise, X and Y are d-separated by Z
- If Z d-separates X and Y, then X ⊥ Y | Z for all distributions represented by the graph
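A minimal d-separation checker sketched straight from this definition (naive path enumeration; fine for toy graphs, not meant to scale):

def descendants(edges, node):
    # all vertices reachable from node along directed edges
    children = {}
    for u, v in edges:
        children.setdefault(u, set()).add(v)
    seen, stack = set(), [node]
    while stack:
        for c in children.get(stack.pop(), ()):
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen

def undirected_paths(edges, x, y):
    # all simple paths from x to y, ignoring edge direction
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    def extend(path):
        if path[-1] == y:
            yield path
            return
        for n in nbrs.get(path[-1], ()):
            if n not in path:
                yield from extend(path + [n])
    yield from extend([x])

def d_connected(edges, x, y, z):
    # True iff some undirected path from x to y is active given the set z
    arcs = set(edges)
    for path in undirected_paths(edges, x, y):
        active = True
        for a, c, b in zip(path, path[1:], path[2:]):
            collider = (a, c) in arcs and (b, c) in arcs
            if collider and c not in z and not (descendants(edges, c) & z):
                active = False
                break
            if not collider and c in z:
                active = False
                break
        if active:
            return True
    return False

# Burglary network: B -> A <- E, A -> J, A -> M
edges = [("B", "A"), ("E", "A"), ("A", "J"), ("A", "M")]
print(d_connected(edges, "B", "E", set()))   # False: A is an unobserved collider
print(d_connected(edges, "B", "E", {"A"}))   # True: observing the collider connects B, E
print(d_connected(edges, "J", "M", {"A"}))   # False: A blocks the only path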

Conditional Independence (Contd.)
- Examples (two graphs (a) and (b), not shown):
- For (a), a ⊥ e | b; but a and e are dependent given {b, d}
- For (b), a and e are dependent given b; c and e are unconditionally dependent

Conditional Independence: More Examples
- (two graphs (a) and (b), not shown)
- For (a): Is a ⊥ c | e? Is a ⊥ e | b? Is a ⊥ e | c?
- For (b): Is a ⊥ e | d? Is a ⊥ e | c? Is a ⊥ c | b?

Example: Car diagnosis
- Initial evidence: car won't start
- Testable variables (green); "broken, so fix it" variables (orange)
- Hidden variables (gray) ensure sparse structure and reduce parameters
- (diagram: network over battery age, alternator broken, fanbelt broken, battery dead, no charging, battery meter, battery flat, no oil, no gas, fuel line blocked, starter broken, lights, oil light, gas gauge, car won't start, dipstick)

Example: Car insurance
- (diagram: network over Age, GoodStudent, RiskAversion, SeniorTrain, SocioEcon, Mileage, VehicleYear, ExtraCar, DrivingSkill, MakeModel, DrivingHist, Antilock, DrivQuality, Airbag, CarValue, HomeBase, AntiTheft, Ruggedness, Accident, OwnDamage, Theft, Cushioning, OtherCost, OwnCost, MedicalCost, LiabilityCost, PropertyCost)

Inference
- (the burglary network and CPTs shown earlier)
- How can we compute P(b | j, m)?
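The question has a direct (if exponential) answer: sum the global-semantics product over the hidden variables E and A, then normalize. A sketch, again using the Russell & Norvig CPT values assumed above:

P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(a = true | b, e)
P_J = {True: 0.90, False: 0.05}                      # P(j = true | a)
P_M = {True: 0.70, False: 0.01}                      # P(m = true | a)

def joint(b, e, a, j, m):
    # P(b, e, a, j, m) as the product of local conditionals
    p = P_B[b] * P_E[e]
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p *= P_J[a] if j else 1 - P_J[a]
    p *= P_M[a] if m else 1 - P_M[a]
    return p

def posterior_b(j, m):
    # P(B = true | J = j, M = m), enumerating the hidden variables E, A
    num = {b: sum(joint(b, e, a, j, m)
                  for e in (True, False) for a in (True, False))
           for b in (True, False)}
    return num[True] / (num[True] + num[False])

print(posterior_b(j=True, m=True))   # ~0.284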
