# Bayes Nets. CS 188: Artificial Intelligence. Inference. Inference by Enumeration. Inference by Enumeration? Example: Enumeration.

Save this PDF as:

Size: px
Start display at page:

Download "Bayes Nets. CS 188: Artificial Intelligence. Inference. Inference by Enumeration. Inference by Enumeration? Example: Enumeration."

## Transcription

1 CS 188: Artificial Intelligence Probabilistic Inference: Enumeration, Variable Elimination, Sampling Pieter Abbeel UC Berkeley Many slides over this course adapted from Dan Klein, Stuart Russell, Andrew Moore Representation Bayes Nets Conditional Independences Probabilistic Inference Enumeration (exact, exponential complexity) Variable elimination (exact, worst-case exponential complexity, often better) Probabilistic inference is NP-complete Sampling (approximate) earning Bayes Nets from Data 2 Inference Inference by Enumeration Inference: calculating some useful quantity from a joint probability distribution Examples: Posterior probability: B A E Given unlimited time, inference in BNs is easy Recipe: State the marginal probabilities you need Figure out A the atomic probabilities you need Calculate and combine them Example: B E Most likely explanation: J M A 4 J M 5 Example: Enumeration Inference by Enumeration? In this simple method, we only need the BN to synthesize the joint entries 6 7 1

2 Variable Elimination Factor Zoo I Why is inference by enumeration so slow? You join up the whole joint distribution before you sum out the hidden variables Idea: interleave joining and marginalizing! Called Variable Elimination Still NP-hard, but usually much faster than inference by enumeration We ll need some new notation to define VE 8 Joint distribution: P(X,Y) Entries P(x,y) for all x, y Sums to 1 Selected joint: P(x,Y) A slice of the joint distribution Entries P(x,y) for fixed x, all y Sums to P(x) Number of capitals = dimensionality of the table W P hot sun 0.4 hot rain 0.1 cold sun 0.2 cold rain 0.3 W P cold sun 0.2 cold rain Factor Zoo II Factor Zoo III Family of conditionals: P(X Y) Multiple conditionals Entries P(x y) for all x, y Sums to Y Single conditional: P(Y x) Entries P(y x) for fixed x, all y Sums to 1 W P hot sun 0.8 hot rain 0.2 cold sun 0.4 cold rain 0.6 W P cold sun 0.4 cold rain 0.6 Specified family: P(y X) Entries P(y x) for fixed y, but for all x Sums to who knows! W P hot rain 0.2 cold rain 0.6 In general, when we write P(Y 1 Y N X 1 X M ) It is a factor, a multi-dimensional array Its values are all P(y 1 y N x 1 x M ) Any assigned X or Y is a dimension missing (selected) from the array Example: raffic Domain Variable Elimination Outline Random Variables R: Raining : raffic : ate for class! R rack objects called factors Initial factors are local CPs (one per node) Any known values are selected E.g. if we know, the initial factors are 12 VE: Alternately join factors and eliminate variables 13 2

3 Operation 1: Join Factors Example: Multiple Joins R First basic operation: joining factors Combining factors: Just like a database join Get all factors over the joining variable Build a new factor over the union of the variables involved Example: Join on R Computation for each entry: pointwise products R, 14 R Join R R, 16 Example: Multiple Joins Operation 2: Eliminate R, Join R,, +r +t +l r +t - l r - t +l r - t - l r +t +l r +t - l r - t +l r - t - l Second basic operation: marginalization ake a factor and sum out a variable Shrinks a factor to a smaller one A projection operation Example: +t t Multiple Elimination P() : Marginalizing Early! R,,, Join R Sum out R +r +t +l r +t - l r - t +l r - t - l r +t +l r +t - l r - t +l r - t - l Sum out R +t +l t - l t +l t - l Sum out +l l R R, +t t

4 Marginalizing Early (aka VE*) +t t 0.83 Join, Sum out +t +l t - l t +l t - l l l * VE is variable elimination If evidence, start with factors that select that evidence No evidence uses these initial factors: Computing Evidence, the initial factors become: We eliminate all vars other than query + evidence 22 Result will be a selected joint of query and evidence E.g. for P( +r), we d end up with: o get our answer, just normalize this! hat s it! +r +l r - l Evidence II Normalize +l l 0.74 General Variable Elimination Query: Start with initial factors: ocal CPs (but instantiated by evidence) While there are still hidden variables (not Q or evidence): Pick a hidden variable H Join all factors mentioning H Eliminate (sum out) H Join all remaining factors and normalize Example Example Choose E Choose A Finish with B Normalize

5 Same Example in Equations Another (bit more abstractly worked out) Variable Elimination Example marginal can be obtained from joint by summing out use Bayes net joint distribution expression use x*(y+z) = xy + xz joining on a, and then summing out gives f_1 x*(y+z) = xy + xz joining on e, and then summing out gives f_2 28 All we are doing is exploiting xy + xz = x(y+z) to improve computational efficiency! Computational complexity critically depends on the largest factor being generated in this process. Size of factor = number of entries in table. In example above (assuming binary) all factors generated are of size as they all only have one variable (Z, Z, and X3 respectively). 29 Variable Elimination Ordering For the query P(X n y 1,,y n ) work through the following two different orderings as done in previous slide: Z, X 1,, X n-1 and X 1,, X n-1, Z. What is the size of the maximum factor generated for each of the orderings? Answer: 2 n versus 2 (assuming binary) In general: the ordering can greatly affect efficiency. Computational and Space Complexity of Variable Elimination he computational and space complexity of variable elimination is determined by the largest factor he elimination ordering can greatly affect the size of the largest factor. E.g., previous slide s example 2 n vs. 2 Does there always exist an ordering that only results in small factors? No! Worst Case Complexity? Consider the 3-SA clause: which can be encoded by the following Bayes net: If we can answer P(z) equal to zero or not, we answered whether the 3-SA problem has a solution. Subtlety: why the cascaded version of the AND rather than feeding all OR clauses into a single AND? Answer: a single AND would have an exponentially large CP, whereas with representation above the Bayes net has small CPs only. Hence inference in Bayes nets is NP-hard. No known efficient probabilistic inference in general. 32 Polytrees A polytree is a directed graph with no undirected cycles For poly-trees you can always find an ordering that is efficient ry it!! Cut-set conditioning for Bayes net inference Choose set of variables such that if removed only a polytree remains 33 hink about how the specifics would work out! 5

6 Representation Bayes Nets Conditional Independences Probabilistic Inference Enumeration (exact, exponential complexity) Variable elimination (exact, worst-case exponential complexity, often better) Probabilistic inference is NP-complete Sampling (approximate) Sampling Simulation has a name: sampling (e.g., predicting the weather, basketball games, ) Basic idea: Draw N samples from a sampling distribution S Compute an approximate posterior probability Show this converges to the true probability P Why sample? earning: get samples from a distribution you don t know Inference: getting a sample is faster than computing the right answer (e.g. with variable elimination) earning Bayes Nets from Data Sampling How do you sample? Simplest way is to use a random number generator to get a continuous value uniformly distributed between 0 and 1 (e.g. random() in Python) Assign each value in the domain of your random variable a sub-interval of [0,1] with a size equal to its probability he sub-intervals cannot overlap 38 Sampling Example Each value in the domain of W has a subinterval of [0,1] with a size equal to its probability W P(W) Sun 0.6 Rain 0.1 Fog 0.3 Meteor 0.0 u is a uniform random value in [0, 1] if 0.0 u<0.6,w = sun if 0.6 u<0.7,w = rain if 0.7 u<1.0,w = fog e.g. if random() returns u = 0.83, then our sample is w = fog 39 Sampling in Bayes Nets Prior Sampling Prior Sampling Rejection Sampling ikelihood Weighting Gibbs Sampling +c +s s c +s s 0.5 Sprinkler +c c 0.5 Cloudy Rain +c +r r c +r r s +r +w w r +w w s +r +w w r +w w 0.99 WetGrass Samples: -c, +s, -r, +w 41 6

7 Prior Sampling o generate one sample from a Bayes net with n variables. Assume variables are named such that ordering X 1, X 2,, X n is consistent with the DAG. Prior Sampling his process generates samples with probability: i.e. the BN s joint probability For i=1, 2,, n Sample x i from P(X i Parents(X i )) End For Return (x 1, x 2,, x n ) et the number of samples of an event be hen 42 I.e., the sampling procedure is consistent 43 Example Rejection Sampling We ll get a bunch of samples from the BN: +c, +s, +r, +w Cloudy C -c, +s, +r, -w Sprinkler S Rain R -c, -s, -r, +w WetGrass W If we want to know P(W) We have counts <+w:4, -w:1> Normalize to get P(W) = <+w:0.8, -w:0.2> his will get closer to the true distribution with more samples Can estimate anything else, too What about P(C +w)? P(C +r, +w)? P(C -r, -w)? Fast: can use fewer samples if less time (what s the drawback?) et s say we want P(C) No point keeping all samples around Just tally counts of C as we go et s say we want P(C +s) Same thing: tally C outcomes, but ignore (reject) samples which don t have S=+s his is called rejection sampling It is also consistent for conditional probabilities (i.e., correct in the limit) Sprinkler S Cloudy C WetGrass W Rain R +c, +s, +r, +w -c, +s, +r, -w -c, -s, -r, +w Rejection Sampling ikelihood Weighting For i=1, 2,, n Sample x i from P(X i Parents(X i )) If x i not consistent with the evidence in the query, exit this for-loop and no sample is generated End For Return (x 1, x 2,, x n ) 46 Problem with rejection sampling: If evidence is unlikely, you reject a lot of samples You don t exploit your evidence as you sample Consider P(B +a) Burglary Alarm -b, -a -b, -a -b, -a -b, -a +b, +a Idea: fix evidence variables and sample the rest -b +a -b, +a Burglary Alarm -b, +a -b, +a Problem: sample distribution not consistent! +b, +a Solution: weight by probability of evidence given parents 48 7

8 ikelihood Weighting ikelihood Weighting +c +s s c +s s 0.5 Sprinkler +s +r +w w r +w w s +r +w w r +w w c c 0.5 Cloudy WetGrass Rain Samples: +c +r r c +r r 0.8 +c, +s, +r, +w 49 Set w = 1.0 For i=1, 2,, n If X i is an evidence variable Set X i = observation x i for X i Set w = w * P(x i Parents(X i )) Else Sample x i from P(X i Parents(X i )) End For Return (x 1, x 2,, x n ), w 50 ikelihood Weighting ikelihood Weighting Sampling distribution if z sampled and e fixed evidence Cloudy C Now, samples have weights S W ogether, weighted sampling distribution is consistent R 51 ikelihood weighting is good We have taken evidence into account as we generate the sample E.g. here, W s value will get picked based on the evidence values of S, R More of our samples will reflect the state of the world suggested by the evidence ikelihood weighting doesn t solve all our problems Evidence influences the choice of downstream variables, but not upstream ones (C isn t more likely to get a value matching the evidence) We would like to consider evidence when we sample every variable à Gibbs sampling S Cloudy C W Rain R 52 Gibbs Sampling Gibbs Sampling Procedure: keep track of a full instantiation x 1, x 2,, x n. Start with an arbitrary instantation consistent with the evidence. Sample one variable at a time, conditioned on all the rest, but keep evidence fixed. Keep repeating this for a long time. Property: in the limit of repeating this infinitely many times the resulting sample is coming from the correct distribution What s the point: both upstream and downstream variables condition on evidence. In contrast: likelihood weighting only conditions on upstream evidence, and hence weights obtained in likelihood weighting can sometimes be very small. Sum of weights over all samples is indicative of how many effective samples were obtained, so want high weight. 53 Say we want to sample P(S R = +r) Step 1: Initialize Set evidence (R = +r) Set all other variables (S, C, W) to random values (e.g. by prior sampling or just uniformly sampling; say S = s, W = +w, C = -c) Steps 2+: Repeat the following for some number of iterations Choose a non-evidence variable (S, W, or C in this case) Sample this variable conditioned on nothing else changing he first time through, if we pick S, we sample from P(S R = +r, W = +w, C =-c) he new sample can only be different in a single variable 54 8

9 Gibbs Sampling Example Want to sample from P(R +s,-c,-w) Shorthand for P(R S=+s,C=-c,W=-w) P (R + s, c, w) = = = = P (R, +s, c, w) P (+s, c, w) P (R, +s, c, w) r P (R = r, +s, c, w) P ( c)p (+s c)p (R c)p ( w + s, R) r P ( c)p (+s c)p (R = r c)p ( w + s, R = r) P (R c)p ( w + s, R) r P (R = r c)p ( w + s, R = r) Many things cancel out -- just a join on R 56 Further Reading* Gibbs sampling is a special case of more general methods called Markov chain Monte Carlo (MCMC) methods Metropolis-Hastings is one of the more famous MCMC methods (in fact, Gibbs sampling is a special case of Metropolis- Hastings) You may read about Monte Carlo methods they re just sampling 57 9

### CS 188: Artificial Intelligence. Probability recap

CS 188: Artificial Intelligence Bayes Nets Representation and Independence Pieter Abbeel UC Berkeley Many slides over this course adapted from Dan Klein, Stuart Russell, Andrew Moore Conditional probability

### Logic, Probability and Learning

Logic, Probability and Learning Luc De Raedt luc.deraedt@cs.kuleuven.be Overview Logic Learning Probabilistic Learning Probabilistic Logic Learning Closely following : Russell and Norvig, AI: a modern

### Artificial Intelligence Mar 27, Bayesian Networks 1 P (T D)P (D) + P (T D)P ( D) =

Artificial Intelligence 15-381 Mar 27, 2007 Bayesian Networks 1 Recap of last lecture Probability: precise representation of uncertainty Probability theory: optimal updating of knowledge based on new information

### CS 331: Artificial Intelligence Fundamentals of Probability II. Joint Probability Distribution. Marginalization. Marginalization

Full Joint Probability Distributions CS 331: Artificial Intelligence Fundamentals of Probability II Toothache Cavity Catch Toothache, Cavity, Catch false false false 0.576 false false true 0.144 false

### 13.3 Inference Using Full Joint Distribution

191 The probability distribution on a single variable must sum to 1 It is also true that any joint probability distribution on any set of variables must sum to 1 Recall that any proposition a is equivalent

### Bayesian Networks Chapter 14. Mausam (Slides by UW-AI faculty & David Page)

Bayesian Networks Chapter 14 Mausam (Slides by UW-AI faculty & David Page) Bayes Nets In general, joint distribution P over set of variables (X 1 x... x X n ) requires exponential space for representation

### Part III: Machine Learning. CS 188: Artificial Intelligence. Machine Learning This Set of Slides. Parameter Estimation. Estimation: Smoothing

CS 188: Artificial Intelligence Lecture 20: Dynamic Bayes Nets, Naïve Bayes Pieter Abbeel UC Berkeley Slides adapted from Dan Klein. Part III: Machine Learning Up until now: how to reason in a model and

### Bayesian Networks. Mausam (Slides by UW-AI faculty)

Bayesian Networks Mausam (Slides by UW-AI faculty) Bayes Nets In general, joint distribution P over set of variables (X 1 x... x X n ) requires exponential space for representation & inference BNs provide

### Introduction to Markov Chain Monte Carlo

Introduction to Markov Chain Monte Carlo Monte Carlo: sample from a distribution to estimate the distribution to compute max, mean Markov Chain Monte Carlo: sampling using local information Generic problem

### Lecture 2: Introduction to belief (Bayesian) networks

Lecture 2: Introduction to belief (Bayesian) networks Conditional independence What is a belief network? Independence maps (I-maps) January 7, 2008 1 COMP-526 Lecture 2 Recall from last time: Conditional

### Machine Learning. CS 188: Artificial Intelligence Naïve Bayes. Example: Digit Recognition. Other Classification Tasks

CS 188: Artificial Intelligence Naïve Bayes Machine Learning Up until now: how use a model to make optimal decisions Machine learning: how to acquire a model from data / experience Learning parameters

### The Basics of Graphical Models

The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures

### STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

### Lecture 6: The Bayesian Approach

Lecture 6: The Bayesian Approach What Did We Do Up to Now? We are given a model Log-linear model, Markov network, Bayesian network, etc. This model induces a distribution P(X) Learning: estimate a set

### Artificial Intelligence. Conditional probability. Inference by enumeration. Independence. Lesson 11 (From Russell & Norvig)

Artificial Intelligence Conditional probability Conditional or posterior probabilities e.g., cavity toothache) = 0.8 i.e., given that toothache is all I know tation for conditional distributions: Cavity

### Model-based Synthesis. Tony O Hagan

Model-based Synthesis Tony O Hagan Stochastic models Synthesising evidence through a statistical model 2 Evidence Synthesis (Session 3), Helsinki, 28/10/11 Graphical modelling The kinds of models that

### Outline. Probability. Random Variables. Joint and Marginal Distributions Conditional Distribution Product Rule, Chain Rule, Bayes Rule.

Probability 1 Probability Random Variables Outline Joint and Marginal Distributions Conditional Distribution Product Rule, Chain Rule, Bayes Rule Inference Independence 2 Inference in Ghostbusters A ghost

### Bayesian Methods. 1 The Joint Posterior Distribution

Bayesian Methods Every variable in a linear model is a random variable derived from a distribution function. A fixed factor becomes a random variable with possibly a uniform distribution going from a lower

### A crash course in probability and Naïve Bayes classification

Probability theory A crash course in probability and Naïve Bayes classification Chapter 9 Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s

### Lecture 7: Continuous Random Variables

Lecture 7: Continuous Random Variables 21 September 2005 1 Our First Continuous Random Variable The back of the lecture hall is roughly 10 meters across. Suppose it were exactly 10 meters, and consider

### Bayesian Networks. Read R&N Ch. 14.1-14.2. Next lecture: Read R&N 18.1-18.4

Bayesian Networks Read R&N Ch. 14.1-14.2 Next lecture: Read R&N 18.1-18.4 You will be expected to know Basic concepts and vocabulary of Bayesian networks. Nodes represent random variables. Directed arcs

### SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis

SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October 17, 2015 Outline

### Basics of Probability

Basics of Probability 1 Sample spaces, events and probabilities Begin with a set Ω the sample space e.g., 6 possible rolls of a die. ω Ω is a sample point/possible world/atomic event A probability space

### Probability, Conditional Independence

Probability, Conditional Independence June 19, 2012 Probability, Conditional Independence Probability Sample space Ω of events Each event ω Ω has an associated measure Probability of the event P(ω) Axioms

### Algorithms. Theresa Migler-VonDollen CMPS 5P

Algorithms Theresa Migler-VonDollen CMPS 5P 1 / 32 Algorithms Write a Python function that accepts a list of numbers and a number, x. If x is in the list, the function returns the position in the list

### Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 14 Non-Binary Huffman Code and Other Codes In the last class we

### Lecture 11: Graphical Models for Inference

Lecture 11: Graphical Models for Inference So far we have seen two graphical models that are used for inference - the Bayesian network and the Join tree. These two both represent the same joint probability

### Decision Trees and Networks

Lecture 21: Uncertainty 6 Today s Lecture Victor R. Lesser CMPSCI 683 Fall 2010 Decision Trees and Networks Decision Trees A decision tree is an explicit representation of all the possible scenarios from

10-601 Machine Learning http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html Course data All up-to-date info is on the course web page: http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html

### Decision Making under Uncertainty

6.825 Techniques in Artificial Intelligence Decision Making under Uncertainty How to make one decision in the face of uncertainty Lecture 19 1 In the next two lectures, we ll look at the question of how

### Learning from Data: Naive Bayes

Semester 1 http://www.anc.ed.ac.uk/ amos/lfd/ Naive Bayes Typical example: Bayesian Spam Filter. Naive means naive. Bayesian methods can be much more sophisticated. Basic assumption: conditional independence.

### Final Exam, Spring 2007

10-701 Final Exam, Spring 2007 1. Personal info: Name: Andrew account: E-mail address: 2. There should be 16 numbered pages in this exam (including this cover sheet). 3. You can use any material you brought:

### Discrete Mathematics, Chapter 3: Algorithms

Discrete Mathematics, Chapter 3: Algorithms Richard Mayr University of Edinburgh, UK Richard Mayr (University of Edinburgh, UK) Discrete Mathematics. Chapter 3 1 / 28 Outline 1 Properties of Algorithms

### 15-381: Artificial Intelligence. Probabilistic Reasoning and Inference

5-38: Artificial Intelligence robabilistic Reasoning and Inference Advantages of probabilistic reasoning Appropriate for complex, uncertain, environments - Will it rain tomorrow? Applies naturally to many

### Marginal Independence and Conditional Independence

Marginal Independence and Conditional Independence Computer Science cpsc322, Lecture 26 (Textbook Chpt 6.1-2) March, 19, 2010 Recap with Example Marginalization Conditional Probability Chain Rule Lecture

### Course: Model, Learning, and Inference: Lecture 5

Course: Model, Learning, and Inference: Lecture 5 Alan Yuille Department of Statistics, UCLA Los Angeles, CA 90095 yuille@stat.ucla.edu Abstract Probability distributions on structured representation.

### Learning. CS461 Artificial Intelligence Pinar Duygulu. Bilkent University, Spring 2007. Slides are mostly adapted from AIMA and MIT Open Courseware

1 Learning CS 461 Artificial Intelligence Pinar Duygulu Bilkent University, Slides are mostly adapted from AIMA and MIT Open Courseware 2 Learning What is learning? 3 Induction David Hume Bertrand Russell

### Counting the number of Sudoku s by importance sampling simulation

Counting the number of Sudoku s by importance sampling simulation Ad Ridder Vrije University, Amsterdam, Netherlands May 31, 2013 Abstract Stochastic simulation can be applied to estimate the number of

### Intelligent Systems: Reasoning and Recognition. Uncertainty and Plausible Reasoning

Intelligent Systems: Reasoning and Recognition James L. Crowley ENSIMAG 2 / MoSIG M1 Second Semester 2015/2016 Lesson 12 25 march 2015 Uncertainty and Plausible Reasoning MYCIN (continued)...2 Backward

### Examples and Proofs of Inference in Junction Trees

Examples and Proofs of Inference in Junction Trees Peter Lucas LIAC, Leiden University February 4, 2016 1 Representation and notation Let P(V) be a joint probability distribution, where V stands for a

### Cell Phone based Activity Detection using Markov Logic Network

Cell Phone based Activity Detection using Markov Logic Network Somdeb Sarkhel sxs104721@utdallas.edu 1 Introduction Mobile devices are becoming increasingly sophisticated and the latest generation of smart

### Searching and Sorting Algorithms

Searching and Sorting Algorithms CS117, Fall 2004 Supplementary Lecture Notes Written by Amy Csizmar Dalal 1 Introduction How do you find someone s phone number in the phone book? How do you find your

### Bayesian Statistics: Indian Buffet Process

Bayesian Statistics: Indian Buffet Process Ilker Yildirim Department of Brain and Cognitive Sciences University of Rochester Rochester, NY 14627 August 2012 Reference: Most of the material in this note

### Compact Representations and Approximations for Compuation in Games

Compact Representations and Approximations for Compuation in Games Kevin Swersky April 23, 2008 Abstract Compact representations have recently been developed as a way of both encoding the strategic interactions

### Lab 8: Introduction to WinBUGS

40.656 Lab 8 008 Lab 8: Introduction to WinBUGS Goals:. Introduce the concepts of Bayesian data analysis.. Learn the basic syntax of WinBUGS. 3. Learn the basics of using WinBUGS in a simple example. Next

### The equation for the 3-input XOR gate is derived as follows

The equation for the 3-input XOR gate is derived as follows The last four product terms in the above derivation are the four 1-minterms in the 3-input XOR truth table. For 3 or more inputs, the XOR gate

### EC 6310: Advanced Econometric Theory

EC 6310: Advanced Econometric Theory July 2008 Slides for Lecture on Bayesian Computation in the Nonlinear Regression Model Gary Koop, University of Strathclyde 1 Summary Readings: Chapter 5 of textbook.

### Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 17 Shannon-Fano-Elias Coding and Introduction to Arithmetic Coding

### Combining information from different survey samples - a case study with data collected by world wide web and telephone

Combining information from different survey samples - a case study with data collected by world wide web and telephone Magne Aldrin Norwegian Computing Center P.O. Box 114 Blindern N-0314 Oslo Norway E-mail:

### CS5314 Randomized Algorithms. Lecture 16: Balls, Bins, Random Graphs (Random Graphs, Hamiltonian Cycles)

CS5314 Randomized Algorithms Lecture 16: Balls, Bins, Random Graphs (Random Graphs, Hamiltonian Cycles) 1 Objectives Introduce Random Graph Model used to define a probability space for all graphs with

### Life of A Knowledge Base (KB)

Life of A Knowledge Base (KB) A knowledge base system is a special kind of database management system to for knowledge base management. KB extraction: knowledge extraction using statistical models in NLP/ML

### CS 771 Artificial Intelligence. Reasoning under uncertainty

CS 771 Artificial Intelligence Reasoning under uncertainty Today Representing uncertainty is useful in knowledge bases Probability provides a coherent framework for uncertainty Review basic concepts in

### Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com

Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University caizhua@gmail.com 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian

### Parallelization Strategies for Multicore Data Analysis

Parallelization Strategies for Multicore Data Analysis Wei-Chen Chen 1 Russell Zaretzki 2 1 University of Tennessee, Dept of EEB 2 University of Tennessee, Dept. Statistics, Operations, and Management

### APP INVENTOR. Test Review

APP INVENTOR Test Review Main Concepts App Inventor Lists Creating Random Numbers Variables Searching and Sorting Data Linear Search Binary Search Selection Sort Quick Sort Abstraction Modulus Division

### Querying Joint Probability Distributions

Querying Joint Probability Distributions Sargur Srihari srihari@cedar.buffalo.edu 1 Queries of Interest Probabilistic Graphical Models (BNs and MNs) represent joint probability distributions over multiple

### Basics of Counting. The product rule. Product rule example. 22C:19, Chapter 6 Hantao Zhang. Sample question. Total is 18 * 325 = 5850

Basics of Counting 22C:19, Chapter 6 Hantao Zhang 1 The product rule Also called the multiplication rule If there are n 1 ways to do task 1, and n 2 ways to do task 2 Then there are n 1 n 2 ways to do

### Exponential Random Graph Models for Social Network Analysis. Danny Wyatt 590AI March 6, 2009

Exponential Random Graph Models for Social Network Analysis Danny Wyatt 590AI March 6, 2009 Traditional Social Network Analysis Covered by Eytan Traditional SNA uses descriptive statistics Path lengths

### Chapter 5: Sequential Circuits (LATCHES)

Chapter 5: Sequential Circuits (LATCHES) Latches We focuses on sequential circuits, where we add memory to the hardware that we ve already seen Our schedule will be very similar to before: We first show

### Using Laws of Probability. Sloan Fellows/Management of Technology Summer 2003

Using Laws of Probability Sloan Fellows/Management of Technology Summer 2003 Uncertain events Outline The laws of probability Random variables (discrete and continuous) Probability distribution Histogram

### Electronic Design Automation Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

Electronic Design Automation Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture - 10 Synthesis: Part 3 I have talked about two-level

### Metropolis Light Transport. Samuel Donow, Mike Flynn, David Yan CS371 Fall 2014, Morgan McGuire

Metropolis Light Transport Samuel Donow, Mike Flynn, David Yan CS371 Fall 2014, Morgan McGuire Overview of Presentation 1. Description of necessary tools (Path Space, Monte Carlo Integration, Rendering

### Towards running complex models on big data

Towards running complex models on big data Working with all the genomes in the world without changing the model (too much) Daniel Lawson Heilbronn Institute, University of Bristol 2013 1 / 17 Motivation

### Why the Normal Distribution?

Why the Normal Distribution? Raul Rojas Freie Universität Berlin Februar 2010 Abstract This short note explains in simple terms why the normal distribution is so ubiquitous in pattern recognition applications.

### 573 Intelligent Systems

Sri Lanka Institute of Information Technology 3ed July, 2005 Course Outline Course Information Course Outline Reading Material Course Setup Intelligent agents. Agent types, agent environments and characteristics.

### Advanced Computer Graphics. Rendering Equation. Matthias Teschner. Computer Science Department University of Freiburg

Advanced Computer Graphics Rendering Equation Matthias Teschner Computer Science Department University of Freiburg Outline rendering equation Monte Carlo integration sampling of random variables University

### Dirichlet Processes A gentle tutorial

Dirichlet Processes A gentle tutorial SELECT Lab Meeting October 14, 2008 Khalid El-Arini Motivation We are given a data set, and are told that it was generated from a mixture of Gaussian distributions.

### Theory of Computation Prof. Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology, Madras

Theory of Computation Prof. Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology, Madras Lecture No. # 31 Recursive Sets, Recursively Innumerable Sets, Encoding

### An Application of Inverse Reinforcement Learning to Medical Records of Diabetes Treatment

An Application of Inverse Reinforcement Learning to Medical Records of Diabetes Treatment Hideki Asoh 1, Masanori Shiro 1 Shotaro Akaho 1, Toshihiro Kamishima 1, Koiti Hasida 1, Eiji Aramaki 2, and Takahide

### CSCI567 Machine Learning (Fall 2014)

CSCI567 Machine Learning (Fall 2014) Drs. Sha & Liu {feisha,yanliu.cs}@usc.edu September 22, 2014 Drs. Sha & Liu ({feisha,yanliu.cs}@usc.edu) CSCI567 Machine Learning (Fall 2014) September 22, 2014 1 /

### (x + a) n = x n + a Z n [x]. Proof. If n is prime then the map

22. A quick primality test Prime numbers are one of the most basic objects in mathematics and one of the most basic questions is to decide which numbers are prime (a clearly related problem is to find

### Math 2602 Finite and Linear Math Fall 14. Homework 9: Core solutions

Math 2602 Finite and Linear Math Fall 14 Homework 9: Core solutions Section 8.2 on page 264 problems 13b, 27a-27b. Section 8.3 on page 275 problems 1b, 8, 10a-10b, 14. Section 8.4 on page 279 problems

### Chapter 11 Number Theory

Chapter 11 Number Theory Number theory is one of the oldest branches of mathematics. For many years people who studied number theory delighted in its pure nature because there were few practical applications

### Calculating Logarithms By Hand

Calculating Logarithms By Hand W. Blaine Dowler June 14, 010 Abstract This details methods by which we can calculate logarithms by hand. 1 Definition and Basic Properties A logarithm can be defined as

### Artificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence

Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network? - Perceptron learners - Multi-layer networks What is a Support

### Appendix A: Sampling Methods

Appendix A: Sampling Methods What is Sampling? Sampling is used in an @RISK simulation to generate possible values from probability distribution functions. These sets of possible values are then used to

### x if x 0, x if x < 0.

Chapter 3 Sequences In this chapter, we discuss sequences. We say what it means for a sequence to converge, and define the limit of a convergent sequence. We begin with some preliminary results about the

### Conditional and Looping Construct

Chapter 3 Conditional and Looping Construct After studying this lesson, students will be able to: Understand the concept and usage of selection and iteration statements. Know various types of loops available

### Hypothesis Testing for Beginners

Hypothesis Testing for Beginners Michele Piffer LSE August, 2011 Michele Piffer (LSE) Hypothesis Testing for Beginners August, 2011 1 / 53 One year ago a friend asked me to put down some easy-to-read notes

### Parameter estimation for nonlinear models: Numerical approaches to solving the inverse problem. Lecture 12 04/08/2008. Sven Zenker

Parameter estimation for nonlinear models: Numerical approaches to solving the inverse problem Lecture 12 04/08/2008 Sven Zenker Assignment no. 8 Correct setup of likelihood function One fixed set of observation

### Joint Probability Distributions and Random Samples (Devore Chapter Five)

Joint Probability Distributions and Random Samples (Devore Chapter Five) 1016-345-01 Probability and Statistics for Engineers Winter 2010-2011 Contents 1 Joint Probability Distributions 1 1.1 Two Discrete

### Business Statistics 41000: Probability 1

Business Statistics 41000: Probability 1 Drew D. Creal University of Chicago, Booth School of Business Week 3: January 24 and 25, 2014 1 Class information Drew D. Creal Email: dcreal@chicagobooth.edu Office:

### An Introduction to Using WinBUGS for Cost-Effectiveness Analyses in Health Economics

Slide 1 An Introduction to Using WinBUGS for Cost-Effectiveness Analyses in Health Economics Dr. Christian Asseburg Centre for Health Economics Part 1 Slide 2 Talk overview Foundations of Bayesian statistics

### MATH 55: HOMEWORK #2 SOLUTIONS

MATH 55: HOMEWORK # SOLUTIONS ERIC PETERSON * 1. SECTION 1.5: NESTED QUANTIFIERS 1.1. Problem 1.5.8. Determine the truth value of each of these statements if the domain of each variable consists of all

### In economics, the amount of a good x demanded is a function of a person s wealth and the price of that good. In other words,

LABOR NOTES, PART TWO: REVIEW OF MATH 2.1 Univariate calculus Given two sets X andy, a function is a rule that associates each member of X with exactly one member ofy. Intuitively, y is a function of x

### CSC 180 H1F Algorithm Runtime Analysis Lecture Notes Fall 2015

1 Introduction These notes introduce basic runtime analysis of algorithms. We would like to be able to tell if a given algorithm is time-efficient, and to be able to compare different algorithms. 2 Linear

### Probabilistic Graphical Models Homework 1: Due January 29, 2014 at 4 pm

Probabilistic Graphical Models 10-708 Homework 1: Due January 29, 2014 at 4 pm Directions. This homework assignment covers the material presented in Lectures 1-3. You must complete all four problems to

### PS 271B: Quantitative Methods II. Lecture Notes

PS 271B: Quantitative Methods II Lecture Notes Langche Zeng zeng@ucsd.edu The Empirical Research Process; Fundamental Methodological Issues 2 Theory; Data; Models/model selection; Estimation; Inference.

### Tutorial on Markov Chain Monte Carlo

Tutorial on Markov Chain Monte Carlo Kenneth M. Hanson Los Alamos National Laboratory Presented at the 29 th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Technology,

### Data Modeling & Analysis Techniques. Probability & Statistics. Manfred Huber 2011 1

Data Modeling & Analysis Techniques Probability & Statistics Manfred Huber 2011 1 Probability and Statistics Probability and statistics are often used interchangeably but are different, related fields

### 5. Geometric Series and Three Applications

5. Geometric Series and Three Applications 5.. Geometric Series. One unsettling thing about working with infinite sums is that it sometimes happens that you know that the sum is finite, but you don t know

### The Schrödinger Equation

The Schrödinger Equation When we talked about the axioms of quantum mechanics, we gave a reduced list. We did not talk about how to determine the eigenfunctions for a given situation, or the time development

### Topic models for Sentiment analysis: A Literature Survey

Topic models for Sentiment analysis: A Literature Survey Nikhilkumar Jadhav 123050033 June 26, 2014 In this report, we present the work done so far in the field of sentiment analysis using topic models.

### Lecture 1 Newton s method.

Lecture 1 Newton s method. 1 Square roots 2 Newton s method. The guts of the method. A vector version. Implementation. The existence theorem. Basins of attraction. The Babylonian algorithm for finding

### Markov Chain Monte Carlo Simulation Made Simple

Markov Chain Monte Carlo Simulation Made Simple Alastair Smith Department of Politics New York University April2,2003 1 Markov Chain Monte Carlo (MCMC) simualtion is a powerful technique to perform numerical

### 3 Some Integer Functions

3 Some Integer Functions A Pair of Fundamental Integer Functions The integer function that is the heart of this section is the modulo function. However, before getting to it, let us look at some very simple

### Introduction to Learning & Decision Trees

Artificial Intelligence: Representation and Problem Solving 5-38 April 0, 2007 Introduction to Learning & Decision Trees Learning and Decision Trees to learning What is learning? - more than just memorizing