# Causal Inference from Graphical Models - I

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Graduate Lectures Oxford, October 2013

2 Graph terminology A graphical model is a set of distributions, satisfying a set of conditional independence relations encoded by a graph. This encoding is known as a Markov property. In many graphical models, the Markov property is matched by a corresponding factorization property of the associated densities or probability mass functions. This lecture is mostly concerned with graphical models based on directed acyclic graphs as these allow particularly simple causal interpretations. Such models are also known as Bayesian networks, a term coined by Pearl (1986). There is nothing Bayesian about them.

3 Definition and example Local directed Markov property Factorization The global Markov property A directed acyclic graph D over a finite set V is a simple graph with all edges directed and no directed cycles. We use DAG for brevity. Absence of directed cycles means that, following arrows in the graph, it is impossible to return to any point. Bayesian networks have proved fundamental and useful in a wealth of interesting applications, including expert systems, genetics, complex biomedical statistics, causal analysis, and machine learning.

4 Definition and example Local directed Markov property Factorization The global Markov property Example of a directed graphical model

5 Definition and example Local directed Markov property Factorization The global Markov property An probability distribution P of random variables X v, v V satisfies the local Markov property (L) w.r.t. a directed acyclic graph D if α V : α {nd(α) \ pa(α)} pa(α). Here nd(α) are the non-descendants of α. A well-known example is a Markov chain: X 1 X 2 X 3 X 4 X 5 with X i+1 (X 1,..., X i 1 ) X i for i = 3,..., n. X n

6 Definition and example Local directed Markov property Factorization The global Markov property For example, the local Markov property says 4 {1, 3, 5, 6} 2, 5 {1, 4} {2, 3} 3 {2, 4} 1.

7 Definition and example Local directed Markov property Factorization The global Markov property A probability distribution P over X = X V factorizes over a DAG D if its density or probability mass function f has the form (F) : f (x) = v V f (x v x pa(v) ).

8 Definition and example Local directed Markov property Factorization The global Markov property Example of DAG factorization The above graph corresponds to the factorization f (x) = f (x 1 )f (x 2 x 1 )f (x 3 x 1 )f (x 4 x 2 ) f (x 5 x 2, x 3 )f (x 6 x 3, x 5 )f (x 7 x 4, x 5, x 6 ).

9 Definition and example Local directed Markov property Factorization The global Markov property Separation in DAGs A node γ in a trail τ is a collider if edges meet head-to head at γ: A trail τ from α to β in D is active relative to S if both conditions below are satisfied: all its colliders are in S an(s) all its non-colliders are outside S A trail that is not active is blocked. Two subsets A and B of vertices are d-separated by S if all trails from A to B are blocked by S. We write A d B S.

10 Definition and example Local directed Markov property Factorization The global Markov property Separation by example For S = {5}, the trail (4, 2, 5, 3, 6) is active, whereas the trails (4, 2, 5, 6) and (4, 7, 6) are blocked. For S = {3, 5}, they are all blocked.

11 Definition and example Local directed Markov property Factorization The global Markov property Returning to example Hence 4 d 6 3, 5, but it is not true that 4 d 6 5 nor that 4 d 6.

12 Definition and example Local directed Markov property Factorization The global Markov property Equivalence of Markov properties A probability distribution P satisfies the global Markov property (G) w.r.t. D if A d B S A B S. It holds for any DAG D and any distribution P that these three directed Markov properties are equivalent: (G) (L) (F).

13 Motivation Intervention vs. observation Causal interpretation Example is compelling for causal reasons

14 Motivation Intervention vs. observation Causal interpretation The now rather standard causal interpretation of a DAG (Spirtes et al., 1993; Pearl, 2000) emphasizes the distinction between conditioning by observation and conditioning by intervention. We use special notations for this P(X = x Y y) = P{X = x do(y = y)} = p(x y), (1) whereas p(y x) = p(y = y X = x) = P{Y = y is(x = x)}. [Also distinguish p(x y) from P{X = x see(y = y)}. Observation/sampling bias.] A causal interpretation of a Bayesian network involves giving (1) a simple form.

15 Motivation Intervention vs. observation Causal interpretation We say that a BN is causal w.r.t. atomic interventions at B V if it holds for any A B that p(x xa ) = p(x v x pa(v) ) v V \A For A = we obtain standard factorisation. xa =x A Note that conditional distributions p(x v x pa(v) ) are stable under interventions which do not involve x v. Such assumption must be justified in any given context.

16 Motivation Intervention vs. observation Causal interpretation Contrast the formula for intervention conditioning with that for observation conditioning: p(x xa ) = p(x v x pa(v) ) = v V \A v V p(x v x pa(v) ) v A p(x v x pa(v) ) xa =x A x A =x A. whereas p(x xa ) = v V p(x v x pa(v) ) p(x A ) xa =x A.

17 Motivation Intervention vs. observation Causal interpretation An example p(x x5 ) = p(x 1 )p(x 2 x 1 )p(x 3 x 1 )p(x 4 x 2 ) p(x 6 x 3, x5 )p(x 7 x 4, x5, x 6 ) whereas p(x x5 ) p(x 1 )p(x 2 x 1 )p(x 3 x 1 )p(x 4 x 2 ) p(x5 x 2, x 3 )p(x 6 x 3, x5 )p(x 7 x 4, x5, x 6 )

18 General equation systems Intervention by replacement DAG D can also represent structural equation system: X v g v (x pa(v), U v ), v V, (2) where g v are fixed functions and U v are independent random disturbances. Intervention in structural equation system can be made by replacement, i.e. so that X v x v is replacing the corresponding line in program (2). Corresponds to g v and U v being unaffected by the intervention if intervention is not made on node v. Hence the equation is structural.

19 General equation systems Intervention by replacement A linear structural equation system for this network is X 1 α 1 + U 1 X 2 α 2 + β 21 x 1 + U 2 X 3 α 3 + β 31 x 1 + U 3 X 4 α 4 + β 42 x 2 + U 4 X 5 α 5 + β 52 x 2 + β 53 x 3 + U 5 X 6 α 6 + β 63 x 3 + β 65 x 5 + U 6 X 7 α 7 + β 74 x 4 + β 75 x 5 + β 76 x 6 + U 7.

20 General equation systems Intervention by replacement After intervention by replacement, the system changes to X 1 α 1 + U 1 X 2 α 2 + β 21 x 1 + U 2 X 3 α 3 + β 31 x 1 + U 3 X 4 x 4 X 5 α 5 + β 52 x 2 + β 53 x 3 + U 5 X 6 α 6 + β 63 x 3 + β 65 x 5 + U 6 X 7 α 7 + β 74 x4 + β 75 x 5 + β 76 x 6 + U 7.

21 General equation systems Intervention by replacement Justification of causal models by structural equations Intervention by replacement in structural equation system implies D causal for distribution of X v, v V. Occasionally used for justification of CBN. Ambiguity in choice of g v and U v makes this problematic. May take stability of conditional distributions as a primitive rather than structural equations. Structural equations more expressive when choice of g v and U v can be externally justified.

22 Assessment of effects of actions Intervention diagrams LIMIDs r a l b a - treatment with AZT; l - intermediate response (possible lung disease); b - treatment with antibiotics; r - survival after a fixed period. Predict survival if X a 1 and X b 1, assuming stable conditional distributions.

23 Assessment of effects of actions Intervention diagrams LIMIDs G-computation r a l b p(1 r 1 a, 1 b ) = x l p(1 r, x l 1 a, 1 b ) = x l p(1 r x l, 1 a, 1 b )p(x l 1 a ).

24 Assessment of effects of actions Intervention diagrams LIMIDs Augment each node v A where intervention is contemplated with additional parent variable F v. F v has state space X v {φ} and conditional distributions in the intervention diagram are { p p(xv x (x v x pa(v), f v ) = pa(v) ) if f v = φ δ xv,xv if f v = xv, where δ xy is Kronecker s symbol δ xy = { 1 if x = y 0 otherwise. F v is forcing the value of X v when F v φ.

25 Assessment of effects of actions Intervention diagrams LIMIDs It now holds in the extended intervention diagram that p(x) = p (x F v = φ, v A), but also p(x xb ) = P(X = x X B xb ) = P (x F v = xv, v B, F v = φ, v B \ A), In particular it holds that if pa(v) =, then p(x xv ) = p(x v xv ).

26 Assessment of effects of actions Intervention diagrams LIMIDs More generally we can explicitly join decision nodes δ to the DAG as parents of nodes which they affect. Further, each of these can have parents in D or in to indicate that intervention at δ may depend on states of pa(δ). A strategy σ yields a conditional distribution of decisions, given their parents to yield f (x σ) = v V f (x v x pa(v) ) δ σ(x δ x pa(δ) ) where now pa(v) refer to parents in the extended diagram, which must be a DAG to make sense. This formally corresponds to the notion of LIMIDs (Lauritzen and Nilsson, 2001).

27 Assessment of effects of actions Intervention diagrams LIMIDs F C F A A C E F B B D F D LIMID for a causal interpretation of a DAG. Red nodes represent (external) forces or interventions that affect the conditional distributions of their children. Note that interventions can be allowed to depend on other variables (treatment strategies).

28 Lauritzen, S. L. (2001). Causal inference from graphical models. In Barndorff-Nielsen, O. E., Cox, D. R., and Klüppelberg, C., editors, Complex Stochastic Systems, pages Chapman and Hall/CRC Press, London/Boca Raton. Lauritzen, S. L. and Nilsson, D. (2001). Representing and solving decision problems with limited information. Management Science, 47: Pearl, J. (1986). Fusion, propagation and structuring in belief networks. Artificial Intelligence, 29: Pearl, J. (2000). Causality. Cambridge University Press, Cambridge. Spirtes, P., Glymour, C., and Scheines, R. (1993). Causation, Prediction and Search. Springer-Verlag, New York. Reprinted by MIT Press.

### 5 Directed acyclic graphs

5 Directed acyclic graphs (5.1) Introduction In many statistical studies we have prior knowledge about a temporal or causal ordering of the variables. In this chapter we will use directed graphs to incorporate

### The Basics of Graphical Models

The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures

### CS 188: Artificial Intelligence. Probability recap

CS 188: Artificial Intelligence Bayes Nets Representation and Independence Pieter Abbeel UC Berkeley Many slides over this course adapted from Dan Klein, Stuart Russell, Andrew Moore Conditional probability

### Discovering Structure in Continuous Variables Using Bayesian Networks

In: "Advances in Neural Information Processing Systems 8," MIT Press, Cambridge MA, 1996. Discovering Structure in Continuous Variables Using Bayesian Networks Reimar Hofmann and Volker Tresp Siemens AG,

### Probabilistic Networks An Introduction to Bayesian Networks and Influence Diagrams

Probabilistic Networks An Introduction to Bayesian Networks and Influence Diagrams Uffe B. Kjærulff Department of Computer Science Aalborg University Anders L. Madsen HUGIN Expert A/S 10 May 2005 2 Contents

### Examples and Proofs of Inference in Junction Trees

Examples and Proofs of Inference in Junction Trees Peter Lucas LIAC, Leiden University February 4, 2016 1 Representation and notation Let P(V) be a joint probability distribution, where V stands for a

### CAUSALITY: MODELS, REASONING, AND INFERENCE by Judea Pearl Cambridge University Press, 2000

Econometric Theory, 19, 2003, 675 685+ Printed in the United States of America+ DOI: 10+10170S0266466603004109 CAUSALITY: MODELS, REASONING, AND INFERENCE by Judea Pearl Cambridge University Press, 2000

### Probability, Conditional Independence

Probability, Conditional Independence June 19, 2012 Probability, Conditional Independence Probability Sample space Ω of events Each event ω Ω has an associated measure Probability of the event P(ω) Axioms

### Probabilistic Graphical Models Homework 1: Due January 29, 2014 at 4 pm

Probabilistic Graphical Models 10-708 Homework 1: Due January 29, 2014 at 4 pm Directions. This homework assignment covers the material presented in Lectures 1-3. You must complete all four problems to

### Deutsches Institut für Ernährungsforschung Potsdam-Rehbrücke. DAG program (v0.21)

Deutsches Institut für Ernährungsforschung Potsdam-Rehbrücke DAG program (v0.21) Causal diagrams: A computer program to identify minimally sufficient adjustment sets SHORT TUTORIAL Sven Knüppel http://epi.dife.de/dag

### CHAPTER 2 Estimating Probabilities

CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a

### Lecture 2: Introduction to belief (Bayesian) networks

Lecture 2: Introduction to belief (Bayesian) networks Conditional independence What is a belief network? Independence maps (I-maps) January 7, 2008 1 COMP-526 Lecture 2 Recall from last time: Conditional

### Artificial Intelligence Mar 27, Bayesian Networks 1 P (T D)P (D) + P (T D)P ( D) =

Artificial Intelligence 15-381 Mar 27, 2007 Bayesian Networks 1 Recap of last lecture Probability: precise representation of uncertainty Probability theory: optimal updating of knowledge based on new information

### DETERMINING THE CONDITIONAL PROBABILITIES IN BAYESIAN NETWORKS

Hacettepe Journal of Mathematics and Statistics Volume 33 (2004), 69 76 DETERMINING THE CONDITIONAL PROBABILITIES IN BAYESIAN NETWORKS Hülya Olmuş and S. Oral Erbaş Received 22 : 07 : 2003 : Accepted 04

### AMS7: WEEK 8. CLASS 1. Correlation Monday May 18th, 2015

AMS7: WEEK 8. CLASS 1 Correlation Monday May 18th, 2015 Type of Data and objectives of the analysis Paired sample data (Bivariate data) Determine whether there is an association between two variables This

### Lecture 11: Graphical Models for Inference

Lecture 11: Graphical Models for Inference So far we have seen two graphical models that are used for inference - the Bayesian network and the Join tree. These two both represent the same joint probability

### Chapter 14 Managing Operational Risks with Bayesian Networks

Chapter 14 Managing Operational Risks with Bayesian Networks Carol Alexander This chapter introduces Bayesian belief and decision networks as quantitative management tools for operational risks. Bayesian

### Stat260: Bayesian Modeling and Inference Lecture Date: February 1, Lecture 3

Stat26: Bayesian Modeling and Inference Lecture Date: February 1, 21 Lecture 3 Lecturer: Michael I. Jordan Scribe: Joshua G. Schraiber 1 Decision theory Recall that decision theory provides a quantification

### Strategy. Theorem Fermat s Last Theorem: If n > 2, then there are no nontrivial integer solutions to x n + y n = z n.

1. Rewrite equation as α n + β n + γ 3 = 0. 1. Rewrite equation as α n + β n + γ 3 = 0. 2. Exactly one of α, β, γ is divisible by 3. 1. Rewrite equation as α n + β n + γ 3 = 0. 2. Exactly one of α, β,

### Intelligent Systems: Reasoning and Recognition. Uncertainty and Plausible Reasoning

Intelligent Systems: Reasoning and Recognition James L. Crowley ENSIMAG 2 / MoSIG M1 Second Semester 2015/2016 Lesson 12 25 march 2015 Uncertainty and Plausible Reasoning MYCIN (continued)...2 Backward

### What Counterfactuals Can Be Tested

352 SHPITSER & PEARL hat Counterfactuals Can Be Tested Ilya Shpitser, Judea Pearl Cognitive Systems Laboratory Department of Computer Science niversity of California, Los Angeles Los Angeles, CA. 90095

### Teaching Causal and Statistical Reasoning

Teaching Causal and Statistical Reasoning Richard Scheines and many others Philosophy, Carnegie Mellon University Outline Today 1. The Curriculum 2. The Online Course Modules Case Studies Support Materials

### Summary of Probability

Summary of Probability Mathematical Physics I Rules of Probability The probability of an event is called P(A), which is a positive number less than or equal to 1. The total probability for all possible

### Advanced Spatial Statistics Fall 2012 NCSU. Fuentes Lecture notes

Advanced Spatial Statistics Fall 2012 NCSU Fuentes Lecture notes Areal unit data 2 Areal Modelling Areal unit data Key Issues Is there spatial pattern? Spatial pattern implies that observations from units

### 13.3 Inference Using Full Joint Distribution

191 The probability distribution on a single variable must sum to 1 It is also true that any joint probability distribution on any set of variables must sum to 1 Recall that any proposition a is equivalent

### Artificial Intelligence. Conditional probability. Inference by enumeration. Independence. Lesson 11 (From Russell & Norvig)

Artificial Intelligence Conditional probability Conditional or posterior probabilities e.g., cavity toothache) = 0.8 i.e., given that toothache is all I know tation for conditional distributions: Cavity

### Topology and Convergence by: Daniel Glasscock, May 2012

Topology and Convergence by: Daniel Glasscock, May 2012 These notes grew out of a talk I gave at The Ohio State University. The primary reference is [1]. A possible error in the proof of Theorem 1 in [1]

### Continuous random variables

Continuous random variables So far we have been concentrating on discrete random variables, whose distributions are not continuous. Now we deal with the so-called continuous random variables. A random

### Causal Discovery Using A Bayesian Local Causal Discovery Algorithm

MEDINFO 2004 M. Fieschi et al. (Eds) Amsterdam: IOS Press 2004 IMIA. All rights reserved Causal Discovery Using A Bayesian Local Causal Discovery Algorithm Subramani Mani a,b, Gregory F. Cooper a mani@cs.uwm.edu

### Causal Independence for Probability Assessment and Inference Using Bayesian Networks

Causal Independence for Probability Assessment and Inference Using Bayesian Networks David Heckerman and John S. Breese Copyright c 1993 IEEE. Reprinted from D. Heckerman, J. Breese. Causal Independence

### Compression algorithm for Bayesian network modeling of binary systems

Compression algorithm for Bayesian network modeling of binary systems I. Tien & A. Der Kiureghian University of California, Berkeley ABSTRACT: A Bayesian network (BN) is a useful tool for analyzing the

### Markov random fields and Gibbs measures

Chapter Markov random fields and Gibbs measures 1. Conditional independence Suppose X i is a random element of (X i, B i ), for i = 1, 2, 3, with all X i defined on the same probability space (.F, P).

### Chapter 28. Bayesian Networks

Chapter 28. Bayesian Networks The Quest for Artificial Intelligence, Nilsson, N. J., 2009. Lecture Notes on Artificial Intelligence, Spring 2012 Summarized by Kim, Byoung-Hee and Lim, Byoung-Kwon Biointelligence

### Neural Nets. General Model Building

Neural Nets To give you an idea of how new this material is, let s do a little history lesson. The origins are typically dated back to the early 1940 s and work by two physiologists, McCulloch and Pitts.

### Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091)

Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091) Magnus Wiktorsson Centre for Mathematical Sciences Lund University, Sweden Lecture 6 Sequential Monte Carlo methods II February

### Model-based Synthesis. Tony O Hagan

Model-based Synthesis Tony O Hagan Stochastic models Synthesising evidence through a statistical model 2 Evidence Synthesis (Session 3), Helsinki, 28/10/11 Graphical modelling The kinds of models that

### Final Exam, Spring 2007

10-701 Final Exam, Spring 2007 1. Personal info: Name: Andrew account: E-mail address: 2. There should be 16 numbered pages in this exam (including this cover sheet). 3. You can use any material you brought:

### Lecture 6: The Bayesian Approach

Lecture 6: The Bayesian Approach What Did We Do Up to Now? We are given a model Log-linear model, Markov network, Bayesian network, etc. This model induces a distribution P(X) Learning: estimate a set

### GeNIeRate: An Interactive Generator of Diagnostic Bayesian Network Models

GeNIeRate: An Interactive Generator of Diagnostic Bayesian Network Models Pieter C. Kraaijeveld Man Machine Interaction Group Delft University of Technology Mekelweg 4, 2628 CD Delft, the Netherlands p.c.kraaijeveld@ewi.tudelft.nl

### The Chinese Restaurant Process

COS 597C: Bayesian nonparametrics Lecturer: David Blei Lecture # 1 Scribes: Peter Frazier, Indraneel Mukherjee September 21, 2007 In this first lecture, we begin by introducing the Chinese Restaurant Process.

ρ σ ρ (0,1)σ ρ ν Ω δ δ τ σ Α τ σ Α σ Α σ Α σ Α τσ Α σ Α σ Α σ Α σ σ Welfare gain for coalition in percent 1 σ A = 20 0 coal. size 2 coal. size 8 1 0 2 4 6 8 10 12 14 16 1 σ A = 40 0 1 0 2 4 6 8 10 12

### Quantum Mechanics I: Basic Principles

Quantum Mechanics I: Basic Principles Michael A. Nielsen University of Queensland I ain t no physicist but I know what matters - Popeye the Sailor Goal of this and the next lecture: to introduce all the

### Probabilistic Graphical Models

Probabilistic Graphical Models Raquel Urtasun and Tamir Hazan TTI Chicago April 4, 2011 Raquel Urtasun and Tamir Hazan (TTI-C) Graphical Models April 4, 2011 1 / 22 Bayesian Networks and independences

### OLS in Matrix Form. Let y be an n 1 vector of observations on the dependent variable.

OLS in Matrix Form 1 The True Model Let X be an n k matrix where we have observations on k independent variables for n observations Since our model will usually contain a constant term, one of the columns

### COS513 LECTURE 5 JUNCTION TREE ALGORITHM

COS513 LECTURE 5 JUNCTION TREE ALGORITHM D. EIS AND V. KOSTINA The junction tree algorithm is the culmination of the way graph theory and probability combine to form graphical models. After we discuss

### Outline. Database Theory. Perfect Query Optimization. Turing Machines. Question. Definition VU , SS Trakhtenbrot s Theorem

Database Theory Database Theory Outline Database Theory VU 181.140, SS 2016 4. Trakhtenbrot s Theorem Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien 4.

### LCAO-MO Correlation Diagrams

LCAO-MO Correlation Diagrams (Linear Combination of Atomic Orbitals to yield Molecular Orbitals) For (Second Row) Homonuclear Diatomic Molecules (X 2 ) - the following LCAO-MO s are generated: LCAO MO

### AP Statistics 1998 Scoring Guidelines

AP Statistics 1998 Scoring Guidelines These materials are intended for non-commercial use by AP teachers for course and exam preparation; permission for any other use must be sought from the Advanced Placement

### Bayesian models of cognition

Bayesian models of cognition Thomas L. Griffiths Department of Psychology University of California, Berkeley Charles Kemp and Joshua B. Tenenbaum Department of Brain and Cognitive Sciences Massachusetts

### Tensor Factorization for Multi-Relational Learning

Tensor Factorization for Multi-Relational Learning Maximilian Nickel 1 and Volker Tresp 2 1 Ludwig Maximilian University, Oettingenstr. 67, Munich, Germany nickel@dbs.ifi.lmu.de 2 Siemens AG, Corporate

### The Delta Method and Applications

Chapter 5 The Delta Method and Applications 5.1 Linear approximations of functions In the simplest form of the central limit theorem, Theorem 4.18, we consider a sequence X 1, X,... of independent and

### ALGEBRA. sequence, term, nth term, consecutive, rule, relationship, generate, predict, continue increase, decrease finite, infinite

ALGEBRA Pupils should be taught to: Generate and describe sequences As outcomes, Year 7 pupils should, for example: Use, read and write, spelling correctly: sequence, term, nth term, consecutive, rule,

### TRANSFORMATIONS OF RANDOM VARIABLES

TRANSFORMATIONS OF RANDOM VARIABLES 1. INTRODUCTION 1.1. Definition. We are often interested in the probability distributions or densities of functions of one or more random variables. Suppose we have

### Machine Learning and Data Analysis overview. Department of Cybernetics, Czech Technical University in Prague. http://ida.felk.cvut.

Machine Learning and Data Analysis overview Jiří Kléma Department of Cybernetics, Czech Technical University in Prague http://ida.felk.cvut.cz psyllabus Lecture Lecturer Content 1. J. Kléma Introduction,

### FORMAL LANGUAGES, AUTOMATA AND COMPUTATION

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION REDUCIBILITY ( LECTURE 16) SLIDES FOR 15-453 SPRING 2011 1 / 20 THE LANDSCAPE OF THE CHOMSKY HIERARCHY ( LECTURE 16) SLIDES FOR 15-453 SPRING 2011 2 / 20 REDUCIBILITY

### Pooling and Meta-analysis. Tony O Hagan

Pooling and Meta-analysis Tony O Hagan Pooling Synthesising prior information from several experts 2 Multiple experts The case of multiple experts is important When elicitation is used to provide expert

### IE1204 Digital Design F12: Asynchronous Sequential Circuits (Part 1)

IE1204 Digital Design F12: Asynchronous Sequential Circuits (Part 1) Elena Dubrova KTH / ICT / ES dubrova@kth.se BV pp. 584-640 This lecture IE1204 Digital Design, HT14 2 Asynchronous Sequential Machines

### The Conquest of U.S. Inflation: Learning and Robustness to Model Uncertainty

The Conquest of U.S. Inflation: Learning and Robustness to Model Uncertainty Timothy Cogley and Thomas Sargent University of California, Davis and New York University and Hoover Institution Conquest of

### Bayesian Updating with Discrete Priors Class 11, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom

1 Learning Goals Bayesian Updating with Discrete Priors Class 11, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom 1. Be able to apply Bayes theorem to compute probabilities. 2. Be able to identify

### Lecture 16 : Relations and Functions DRAFT

CS/Math 240: Introduction to Discrete Mathematics 3/29/2011 Lecture 16 : Relations and Functions Instructor: Dieter van Melkebeek Scribe: Dalibor Zelený DRAFT In Lecture 3, we described a correspondence

### A counterexample to a result on the tree graph of a graph

AUSTRALASIAN JOURNAL OF COMBINATORICS Volume 63(3) (2015), Pages 368 373 A counterexample to a result on the tree graph of a graph Ana Paulina Figueroa Departamento de Matemáticas Instituto Tecnológico

### Monitoring the Behaviour of Credit Card Holders with Graphical Chain Models

Journal of Business Finance & Accounting, 30(9) & (10), Nov./Dec. 2003, 0306-686X Monitoring the Behaviour of Credit Card Holders with Graphical Chain Models ELENA STANGHELLINI* 1. INTRODUCTION Consumer

### A crash course in probability and Naïve Bayes classification

Probability theory A crash course in probability and Naïve Bayes classification Chapter 9 Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s

### Introduction to Markov Chain Monte Carlo

Introduction to Markov Chain Monte Carlo Monte Carlo: sample from a distribution to estimate the distribution to compute max, mean Markov Chain Monte Carlo: sampling using local information Generic problem

### Causal Reasoning through Intervention

Observation vs. Intervention 1 Causal Reasoning through Intervention York Hagmayer 1, Steven A. Sloman 2, David A. Lagnado 3, and Michael R. Waldmann 1 1 University of Göttingen, Germany, 2 Brown University,

### The Big 50 Revision Guidelines for S1

The Big 50 Revision Guidelines for S1 If you can understand all of these you ll do very well 1. Know what is meant by a statistical model and the Modelling cycle of continuous refinement 2. Understand

### Epidemiologists typically seek to answer causal questions using statistical data:

c16.qxd 2/8/06 12:43 AM Page 387 Y CHAPTER SIXTEEN USING CAUSAL DIAGRAMS TO UNDERSTAND COMMON PROBLEMS IN SOCIAL EPIDEMIOLOGY M. Maria Glymour Epidemiologists typically seek to answer causal questions

### Credal Networks for Operational Risk Measurement and Management

Credal Networks for Operational Risk Measurement and Management Alessandro Antonucci, Alberto Piatti, and Marco Zaffalon Istituto Dalle Molle di Studi sull Intelligenza Artificiale Via Cantonale, Galleria

### Outline. Correlation & Regression, III. Review. Relationship between r and regression

Outline Correlation & Regression, III 9.07 4/6/004 Relationship between correlation and regression, along with notes on the correlation coefficient Effect size, and the meaning of r Other kinds of correlation

### Representing Normative Arguments in Genetic Counseling

Representing Normative Arguments in Genetic Counseling Nancy Green Department of Mathematical Sciences University of North Carolina at Greensboro Greensboro, NC 27402 USA nlgreen@uncg.edu Abstract Our

### 3. Continuous Random Variables

3. Continuous Random Variables A continuous random variable is one which can take any value in an interval (or union of intervals) The values that can be taken by such a variable cannot be listed. Such

### For eg:- The yield of a new paddy variety will be 3500 kg per hectare scientific hypothesis. In Statistical language if may be stated as the random

Lecture.9 Test of significance Basic concepts null hypothesis alternative hypothesis level of significance Standard error and its importance steps in testing Test of Significance Objective To familiarize

### THE UNIVERSITY OF AUCKLAND

COMPSCI 369 THE UNIVERSITY OF AUCKLAND FIRST SEMESTER, 2011 MID-SEMESTER TEST Campus: City COMPUTER SCIENCE Computational Science (Time allowed: 50 minutes) NOTE: Attempt all questions Use of calculators

C H A P T E R 3 Simulation of Business Processes The stock and ow diagram which have been reviewed in the preceding two chapters show more about the process structure than the causal loop diagrams studied

### Question 2 Naïve Bayes (16 points)

Question 2 Naïve Bayes (16 points) About 2/3 of your email is spam so you downloaded an open source spam filter based on word occurrences that uses the Naive Bayes classifier. Assume you collected the

### Logic, Probability and Learning

Logic, Probability and Learning Luc De Raedt luc.deraedt@cs.kuleuven.be Overview Logic Learning Probabilistic Learning Probabilistic Logic Learning Closely following : Russell and Norvig, AI: a modern

### TREES IN SET THEORY SPENCER UNGER

TREES IN SET THEORY SPENCER UNGER 1. Introduction Although I was unaware of it while writing the first two talks, my talks in the graduate student seminar have formed a coherent series. This talk can be

### Data Mining Chapter 6: Models and Patterns Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University

Data Mining Chapter 6: Models and Patterns Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Models vs. Patterns Models A model is a high level, global description of a

### SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis

SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October 17, 2015 Outline

ALGEBRA MODULE 8 QUADRATIC EQUATIONS GRADE 9 TEACHER DOCUMENT Malati staff involved in developing these materials: Rolene Liebenberg Liora Linchevski Marlene Sasman Alwyn Olivier Richard Bingo Lukhele

### Automata and Languages

Automata and Languages Computational Models: An idealized mathematical model of a computer. A computational model may be accurate in some ways but not in others. We shall be defining increasingly powerful

### An Algorithm to Handle Structural Uncertainties in Learning Bayesian Network

An Algorithm to Handle Structural Uncertainties in Learning Bayesian Network Cleuber Moreira Fernandes Politec Informática Ltda SIG quadra 4 lote 173, Zip Code 70.610-440, Brasília, DF, Brazil cleuber@compeduc.net

### Notes 10: An Equation Based Model of the Macroeconomy

Notes 10: An Equation Based Model of the Macroeconomy In this note, I am going to provide a simple mathematical framework for 8 of the 9 major curves in our class (excluding only the labor supply curve).

### Bayesian Networks. Read R&N Ch. 14.1-14.2. Next lecture: Read R&N 18.1-18.4

Bayesian Networks Read R&N Ch. 14.1-14.2 Next lecture: Read R&N 18.1-18.4 You will be expected to know Basic concepts and vocabulary of Bayesian networks. Nodes represent random variables. Directed arcs

### Large-Sample Learning of Bayesian Networks is NP-Hard

Journal of Machine Learning Research 5 (2004) 1287 1330 Submitted 3/04; Published 10/04 Large-Sample Learning of Bayesian Networks is NP-Hard David Maxwell Chickering David Heckerman Christopher Meek Microsoft

### Causal Inference and Major League Baseball

Causal Inference and Major League Baseball Jamie Thornton Department of Mathematical Sciences Montana State University May 4, 2012 A writing project submitted in partial fulfillment of the requirements

### Basics of Statistical Machine Learning

CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar

### Definition The covariance of X and Y, denoted by cov(x, Y ) is defined by. cov(x, Y ) = E(X µ 1 )(Y µ 2 ).

Correlation Regression Bivariate Normal Suppose that X and Y are r.v. s with joint density f(x y) and suppose that the means of X and Y are respectively µ 1 µ 2 and the variances are 1 2. Definition The

### Bayesian Statistics: Indian Buffet Process

Bayesian Statistics: Indian Buffet Process Ilker Yildirim Department of Brain and Cognitive Sciences University of Rochester Rochester, NY 14627 August 2012 Reference: Most of the material in this note

Mathematics 0N1 Solutions 1 1. Write the following sets in list form. 1(i) The set of letters in the word banana. {a, b, n}. 1(ii) {x : x 2 + 3x 10 = 0}. 3(iv) C A. True 3(v) B = {e, e, f, c}. True 3(vi)

### Center for Causal Discovery (CCD) of Biomedical Knowledge from Big Data University of Pittsburgh Carnegie Mellon University Pittsburgh Supercomputing

Center for Causal Discovery (CCD) of Biomedical Knowledge from Big Data University of Pittsburgh Carnegie Mellon University Pittsburgh Supercomputing Center Yale University PIs: Ivet Bahar, Jeremy Berg,

### GROUPS SUBGROUPS. Definition 1: An operation on a set G is a function : G G G.

Definition 1: GROUPS An operation on a set G is a function : G G G. Definition 2: A group is a set G which is equipped with an operation and a special element e G, called the identity, such that (i) the

### Cell Phone based Activity Detection using Markov Logic Network

Cell Phone based Activity Detection using Markov Logic Network Somdeb Sarkhel sxs104721@utdallas.edu 1 Introduction Mobile devices are becoming increasingly sophisticated and the latest generation of smart

Module 223 Major A: Concepts, methods and design in Epidemiology Module : 223 UE coordinator Concepts, methods and design in Epidemiology Dates December 15 th to 19 th, 2014 Credits/ECTS UE description

### Life of A Knowledge Base (KB)

Life of A Knowledge Base (KB) A knowledge base system is a special kind of database management system to for knowledge base management. KB extraction: knowledge extraction using statistical models in NLP/ML

### (a) Let x and y be the number of pounds of seed and corn that the chicken rancher must buy. Give the inequalities that x and y must satisfy.

MA 4-4 Practice Exam Justify your answers and show all relevant work. The exam paper will not be graded, put all your work in the blue book provided. Problem A chicken rancher concludes that his flock

### Chapter 2 Introduction to Sequence Analysis for Human Behavior Understanding

Chapter 2 Introduction to Sequence Analysis for Human Behavior Understanding Hugues Salamin and Alessandro Vinciarelli 2.1 Introduction Human sciences recognize sequence analysis as a key aspect of any

### Model Simulation in Rational Software Architect: Business Process Simulation

Model Simulation in Rational Software Architect: Business Process Simulation Mattias Mohlin Senior Software Architect IBM The BPMN (Business Process Model and Notation) is the industry standard notation