# DAGs, I-Maps, Factorization, d-separation, Minimal I-Maps, Bayesian Networks. Slides by Nir Friedman

Save this PDF as:

Size: px
Start display at page:

Download "DAGs, I-Maps, Factorization, d-separation, Minimal I-Maps, Bayesian Networks. Slides by Nir Friedman"

## Transcription

1 DAGs, I-Maps, Factorization, d-separation, Minimal I-Maps, Bayesian Networks Slides by Nir Friedman.

2 Probability Distributions Let X 1,,X n be random variables Let P be a joint distribution over X 1,,X n If the variables are binary, then we need O(2 n ) parameters to describe P Can we do better? Key idea: use properties of independence

3 Independent Random Variables Two variables X and Y are independent if P(X = x Y = y) = P(X = x) for all values x,y That is, learning the values of Y does not change prediction of X If X and Y are independent then P(X,Y) = P(X Y)P(Y) = P(X)P(Y) In general, if X 1,,X n are independent, then P(X 1,,X n )= P(X 1 )...P(X n ) Requires O(n) parameters

4 Conditional Independence Unfortunately, most of random variables of interest are not independent of each other A more suitable notion is that of conditional independence Two variables X and Y are conditionally independent given Z if P(X = x Y = y,z=z) = P(X = x Z=z) for all values x,y,z That is, learning the values of Y does not change prediction of X once we know the value of Z notation: Ind( X ; Y Z )

5 Example: Family trees Noisy stochastic process: Example: Pedigree Homer Marge A node represents an individual s genotype Bart Lisa Maggie Modeling assumptions: Ancestors can effect descendants' genotype only by passing genetic materials through intermediate generations

6 Markov Assumption Ancestor We now make this independence assumption more precise for directed acyclic graphs (DAGs) Y 1 Y 2 Parent Each random variable X, is independent of its nondescendents, given its parents Pa(X) Formally, Ind(X; NonDesc(X) Pa(X)) X Non-descendent Descendent

7 Markov Assumption Example Earthquake Burglary In this example: Ind( E; B ) Ind( B; E, R ) Ind( R; A, B, C E ) Ind( A; R B,E ) Ind( C; B, E, R A) Radio Alarm Call

8 I-Maps A DAG G is an I-Map of a distribution P if the all Markov assumptions implied by G are satisfied by P (Assuming G and P both use the same set of random variables) Examples: X Y X Y x y P(x,y) x y P(x,y)

9 Factorization Given that G is an I-Map of P, can we simplify the representation of P? Example: X Y Since Ind(X;Y), we have that P(X Y) = P(X) Applying the chain rule P(X,Y) = P(X Y) P(Y) = P(X) P(Y) Thus, we have a simpler representation of P(X,Y)

10 Factorization Theorem Thm: if G is an I-Map of P, then Proof: By chain rule: P ( X,..., = 1 Xn ) P ( Xi Pa( Xi )) i P ( X = 1,..., Xn ) P ( Xi X1,..., X i 1) wlog. X 1,,X n is an ordering consistent with G From assumption: Pa( X ) { X K, X } { X 1, i K, X i i 1 1, i 1 } Pa( X ) NonDesc( X ) i i Since G is an I-Map, Ind(X i ; NonDesc(X i ) Pa(X i )) Ind ( Xi ;{ X1, K, Xi 1 } Pa( Xi ) Pa( Xi )) Hence, We conclude, P(X i X 1,,X i-1 ) = P(X i Pa(X i ) )

11 Factorization Example Earthquake Burglary Radio Alarm Call P(C,A,R,E,B) = P(B)P(E B)P(R E,B)P(A R,B,E)P(C A,R,B,E) versus P(C,A,R,E,B) = P(B) P(E) P(R E) P(A B,E) P(C A)

12 Consequences We can write P in terms of local conditional probabilities If G is sparse, that is, Pa(X i ) < k, each conditional probability can be specified compactly e.g. for binary variables, these require O(2 k ) params. representation of P is compact linear in number of variables

13 Conditional Independencies Let Markov(G) be the set of Markov Independencies implied by G The decomposition theorem shows G is an I-Map of P P ( X,..., = 1 Xn ) P ( Xi Pai ) i υ We can also show the opposite: Thm: P ( X,..., = 1 Xn ) P ( Xi Pai ) i G is an I-Map of P

14 Proof (Outline) Example: X Y Z ) ( ) ( ) ( ) ( ) ( ), ( ),, ( ), ( X Y P X P X Z P X Y P X P Y X P Z Y X P Y X Z P = = ) ( X Z = P

15 Implied Independencies Does a graph G imply additional independencies as a consequence of Markov(G) We can define a logic of independence statements We already seen some axioms: Ind( X ; Y Z ) Ind( Y; X Z ) λ Ind( X ; Y 1, Y 2 Z ) Ind( X; Y 1 Z ) We can continue this list..

16 d-seperation A procedure d-sep(x; Y Z, G) that given a DAG G, and sets X, Y, and Z returns either yes or no Goal: d-sep(x; Y Z, G) = yes iff Ind(X;Y Z) follows from Markov(G)

17 Paths Intuition: dependency must flow along paths in the graph A path is a sequence of neighboring variables Examples: R E A B υ C A E R Earthquake Radio Burglary Alarm Call

18 Paths blockage We want to know when a path is active -- creates dependency between end nodes blocked -- cannot create dependency end nodes We want to classify situations in which paths are active given the evidence.

19 Path Blockage Three cases: Common cause Blocked E Unblocked Active E R A R A

20 Path Blockage Three cases: Common cause Blocked E Unblocked Active E Intermediate cause A C A C

21 Path Blockage Three cases: Common cause Blocked Unblocked Active E B Intermediate cause E B A Common Effect A E C B C A C

22 Path Blockage -- General Case A path is active, given evidence Z, if Whenever we have the configuration A C B or one of its descendents are in Z B No other nodes in the path are in Z A path is blocked, given evidence Z, if it is not active.

23 Example d-sep(r,b) = yes E B R A C

24 Example d-sep(r,b) = yes d-sep(r,b A) = no E B R A C

25 Example d-sep(r,b) = yes d-sep(r,b A) = no d-sep(r,b E,A) = yes E B R A C

26 d-separation X is d-separated from Y, given Z, if all paths from a node in X to a node in Y are blocked, given Z. Checking d-separation can be done efficiently (linear time in number of edges) Bottom-up phase: Mark all nodes whose descendents are in Z X to Y phase: Traverse (BFS) all edges on paths from X to Y and check if they are blocked

27 Soundness Thm: If G is an I-Map of P d-sep( X; Y Z, G ) = yes then P satisfies Ind( X; Y Z ) Informally, Any independence reported by d-separation is satisfied by underlying distribution

28 Completeness Thm: If d-sep( X; Y Z, G ) = no then there is a distribution P such that G is an I-Map of P P does not satisfy Ind( X; Y Z ) Informally, Any independence not reported by d-separation might be violated by the by the underlying distribution We cannot determine this by examining the graph structure alone

29 I-Maps revisited The fact that G is I-Map of P might not be that useful For example, complete DAGs A DAG is G is complete is we cannot add an arc without creating a cycle X 1 X 2 X 1 X 2 X 3 X 4 X 3 X 4 These DAGs do not imply any independencies Thus, they are I-Maps of any distribution

30 Minimal I-Maps A DAG G is a minimal I-Map of P if G is an I-Map of P If G G, then G is not an I-Map of P Removing any arc from G introduces (conditional) independencies that do not hold in P

31 Minimal I-Map Example X 1 X 2 If is a minimal I-Map X 3 X 4 Then, these are not I-Maps: X 1 X 2 X 1 X 2 X 3 X 4 X 3 X 4 X 1 X 2 X 1 X 2 X 3 X 4 X 3 X 4

32 Constructing minimal I-Maps The factorization theorem suggests an algorithm Fix an ordering X 1,,X n For each i, select Pa i to be a minimal subset of {X 1,,X i-1 }, such that Ind(X i ; {X 1,,X i-1 } - Pa i Pa i ) Clearly, the resulting graph is a minimal I-Map.

33 Non-uniqueness of minimal I-Map Unfortunately, there may be several minimal I- Maps for the same distribution Applying I-Map construction procedure with different orders can lead to different structures Order: C, R, A, E, B Original I-Map E B E B R A R A C C

34 P-Maps A DAG G is P-Map (perfect map) of a distribution P if Ind(X; Y Z) if and only if d-sep(x; Y Z, G) = yes Notes: A P-Map captures all the independencies in the distribution P-Maps are unique, up to DAG equivalence

35 P-Maps Unfortunately, some distributions do not have a P-Map Example: P ( A, B, C ) = if A B C if A B C = = 0 1 A minimal I-Map: A B C This is not a P-Map since Ind(A;C) but d-sep(a;c) = no

36 Bayesian Networks A Bayesian network specifies a probability distribution via two components: A DAG G A collection of conditional probability distributions P(X i Pa i ) The joint distribution P is defined by the factorization P ( X,..., = 1 Xn ) P ( Xi Pai ) i Additional requirement: G is a minimal I-Map of P

37 Summary We explored DAGs as a representation of conditional independencies: Markov independencies of a DAG Tight correspondence between Markov(G) and the factorization defined by G d-separation, a sound & complete procedure for computing the consequences of the independencies Notion of minimal I-Map P-Maps This theory is the basis of Bayesian networks

### CS 188: Artificial Intelligence. Probability recap

CS 188: Artificial Intelligence Bayes Nets Representation and Independence Pieter Abbeel UC Berkeley Many slides over this course adapted from Dan Klein, Stuart Russell, Andrew Moore Conditional probability

### Lecture 2: Introduction to belief (Bayesian) networks

Lecture 2: Introduction to belief (Bayesian) networks Conditional independence What is a belief network? Independence maps (I-maps) January 7, 2008 1 COMP-526 Lecture 2 Recall from last time: Conditional

### 13.3 Inference Using Full Joint Distribution

191 The probability distribution on a single variable must sum to 1 It is also true that any joint probability distribution on any set of variables must sum to 1 Recall that any proposition a is equivalent

### The Joint Probability Distribution (JPD) of a set of n binary variables involve a huge number of parameters

DEFINING PROILISTI MODELS The Joint Probability Distribution (JPD) of a set of n binary variables involve a huge number of parameters 2 n (larger than 10 25 for only 100 variables). x y z p(x, y, z) 0

### Bayesian Networks Chapter 14. Mausam (Slides by UW-AI faculty & David Page)

Bayesian Networks Chapter 14 Mausam (Slides by UW-AI faculty & David Page) Bayes Nets In general, joint distribution P over set of variables (X 1 x... x X n ) requires exponential space for representation

### Probability, Conditional Independence

Probability, Conditional Independence June 19, 2012 Probability, Conditional Independence Probability Sample space Ω of events Each event ω Ω has an associated measure Probability of the event P(ω) Axioms

### Probabilistic Graphical Models

Probabilistic Graphical Models Raquel Urtasun and Tamir Hazan TTI Chicago April 4, 2011 Raquel Urtasun and Tamir Hazan (TTI-C) Graphical Models April 4, 2011 1 / 22 Bayesian Networks and independences

### The Basics of Graphical Models

The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures

### 5 Directed acyclic graphs

5 Directed acyclic graphs (5.1) Introduction In many statistical studies we have prior knowledge about a temporal or causal ordering of the variables. In this chapter we will use directed graphs to incorporate

### Bayesian Networks. Mausam (Slides by UW-AI faculty)

Bayesian Networks Mausam (Slides by UW-AI faculty) Bayes Nets In general, joint distribution P over set of variables (X 1 x... x X n ) requires exponential space for representation & inference BNs provide

### Life of A Knowledge Base (KB)

Life of A Knowledge Base (KB) A knowledge base system is a special kind of database management system to for knowledge base management. KB extraction: knowledge extraction using statistical models in NLP/ML

### Logic, Probability and Learning

Logic, Probability and Learning Luc De Raedt luc.deraedt@cs.kuleuven.be Overview Logic Learning Probabilistic Learning Probabilistic Logic Learning Closely following : Russell and Norvig, AI: a modern

### Artificial Intelligence Mar 27, Bayesian Networks 1 P (T D)P (D) + P (T D)P ( D) =

Artificial Intelligence 15-381 Mar 27, 2007 Bayesian Networks 1 Recap of last lecture Probability: precise representation of uncertainty Probability theory: optimal updating of knowledge based on new information

### A crash course in probability and Naïve Bayes classification

Probability theory A crash course in probability and Naïve Bayes classification Chapter 9 Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s

### Artificial Intelligence. Conditional probability. Inference by enumeration. Independence. Lesson 11 (From Russell & Norvig)

Artificial Intelligence Conditional probability Conditional or posterior probabilities e.g., cavity toothache) = 0.8 i.e., given that toothache is all I know tation for conditional distributions: Cavity

### Intelligent Systems: Reasoning and Recognition. Uncertainty and Plausible Reasoning

Intelligent Systems: Reasoning and Recognition James L. Crowley ENSIMAG 2 / MoSIG M1 Second Semester 2015/2016 Lesson 12 25 march 2015 Uncertainty and Plausible Reasoning MYCIN (continued)...2 Backward

### Random access protocols for channel access. Markov chains and their stability. Laurent Massoulié.

Random access protocols for channel access Markov chains and their stability laurent.massoulie@inria.fr Aloha: the first random access protocol for channel access [Abramson, Hawaii 70] Goal: allow machines

### Large-Sample Learning of Bayesian Networks is NP-Hard

Journal of Machine Learning Research 5 (2004) 1287 1330 Submitted 3/04; Published 10/04 Large-Sample Learning of Bayesian Networks is NP-Hard David Maxwell Chickering David Heckerman Christopher Meek Microsoft

### Bayesian Networks. Read R&N Ch. 14.1-14.2. Next lecture: Read R&N 18.1-18.4

Bayesian Networks Read R&N Ch. 14.1-14.2 Next lecture: Read R&N 18.1-18.4 You will be expected to know Basic concepts and vocabulary of Bayesian networks. Nodes represent random variables. Directed arcs

### 3. The Junction Tree Algorithms

A Short Course on Graphical Models 3. The Junction Tree Algorithms Mark Paskin mark@paskin.org 1 Review: conditional independence Two random variables X and Y are independent (written X Y ) iff p X ( )

### Probabilistic Networks An Introduction to Bayesian Networks and Influence Diagrams

Probabilistic Networks An Introduction to Bayesian Networks and Influence Diagrams Uffe B. Kjærulff Department of Computer Science Aalborg University Anders L. Madsen HUGIN Expert A/S 10 May 2005 2 Contents

### , each of which contains a unique key value, say k i , R 2. such that k i equals K (or to determine that no such record exists in the collection).

The Search Problem 1 Suppose we have a collection of records, say R 1, R 2,, R N, each of which contains a unique key value, say k i. Given a particular key value, K, the search problem is to locate the

### 2.3 Scheduling jobs on identical parallel machines

2.3 Scheduling jobs on identical parallel machines There are jobs to be processed, and there are identical machines (running in parallel) to which each job may be assigned Each job = 1,,, must be processed

### Lecture 16 : Relations and Functions DRAFT

CS/Math 240: Introduction to Discrete Mathematics 3/29/2011 Lecture 16 : Relations and Functions Instructor: Dieter van Melkebeek Scribe: Dalibor Zelený DRAFT In Lecture 3, we described a correspondence

### Marginal Independence and Conditional Independence

Marginal Independence and Conditional Independence Computer Science cpsc322, Lecture 26 (Textbook Chpt 6.1-2) March, 19, 2010 Recap with Example Marginalization Conditional Probability Chain Rule Lecture

### Examples and Proofs of Inference in Junction Trees

Examples and Proofs of Inference in Junction Trees Peter Lucas LIAC, Leiden University February 4, 2016 1 Representation and notation Let P(V) be a joint probability distribution, where V stands for a

### Data Structures Fibonacci Heaps, Amortized Analysis

Chapter 4 Data Structures Fibonacci Heaps, Amortized Analysis Algorithm Theory WS 2012/13 Fabian Kuhn Fibonacci Heaps Lacy merge variant of binomial heaps: Do not merge trees as long as possible Structure:

### Krishna Institute of Engineering & Technology, Ghaziabad Department of Computer Application MCA-213 : DATA STRUCTURES USING C

Tutorial#1 Q 1:- Explain the terms data, elementary item, entity, primary key, domain, attribute and information? Also give examples in support of your answer? Q 2:- What is a Data Type? Differentiate

### Indexing Multiple- Inheritance Hierarchies

Indexing Multiple- Inheritance Hierarchies Faculty of Electronics and Information Technology Warsaw University of Technology Jacek Lewandowski prof. Henryk Rybiński Layout The context of research Introduction

### Study Manual. Probabilistic Reasoning. Silja Renooij. August 2015

Study Manual Probabilistic Reasoning 2015 2016 Silja Renooij August 2015 General information This study manual was designed to help guide your self studies. As such, it does not include material that is

### A Sublinear Bipartiteness Tester for Bounded Degree Graphs

A Sublinear Bipartiteness Tester for Bounded Degree Graphs Oded Goldreich Dana Ron February 5, 1998 Abstract We present a sublinear-time algorithm for testing whether a bounded degree graph is bipartite

### Markov random fields and Gibbs measures

Chapter Markov random fields and Gibbs measures 1. Conditional independence Suppose X i is a random element of (X i, B i ), for i = 1, 2, 3, with all X i defined on the same probability space (.F, P).

### Graphical Models for Recovering Probabilistic and Causal Queries from Missing Data

Graphical Models for Recovering Probabilistic and Causal Queries from Missing Data Karthika Mohan and Judea Pearl Cognitive Systems Laboratory Computer Science Department University of California, Los

### Full and Complete Binary Trees

Full and Complete Binary Trees Binary Tree Theorems 1 Here are two important types of binary trees. Note that the definitions, while similar, are logically independent. Definition: a binary tree T is full

### The Optimality of Naive Bayes

The Optimality of Naive Bayes Harry Zhang Faculty of Computer Science University of New Brunswick Fredericton, New Brunswick, Canada email: hzhang@unbca E3B 5A3 Abstract Naive Bayes is one of the most

### Model-based Synthesis. Tony O Hagan

Model-based Synthesis Tony O Hagan Stochastic models Synthesising evidence through a statistical model 2 Evidence Synthesis (Session 3), Helsinki, 28/10/11 Graphical modelling The kinds of models that

### Tagging with Hidden Markov Models

Tagging with Hidden Markov Models Michael Collins 1 Tagging Problems In many NLP problems, we would like to model pairs of sequences. Part-of-speech (POS) tagging is perhaps the earliest, and most famous,

### A Comparison of Novel and State-of-the-Art Polynomial Bayesian Network Learning Algorithms

A Comparison of Novel and State-of-the-Art Polynomial Bayesian Network Learning Algorithms Laura E. Brown Ioannis Tsamardinos Constantin F. Aliferis Vanderbilt University, Department of Biomedical Informatics

### Cost Model: Work, Span and Parallelism. 1 The RAM model for sequential computation:

CSE341T 08/31/2015 Lecture 3 Cost Model: Work, Span and Parallelism In this lecture, we will look at how one analyze a parallel program written using Cilk Plus. When we analyze the cost of an algorithm

### Using Bayesian Networks to Analyze Expression Data ABSTRACT

JOURNAL OF COMPUTATIONAL BIOLOGY Volume 7, Numbers 3/4, 2 Mary Ann Liebert, Inc. Pp. 6 62 Using Bayesian Networks to Analyze Expression Data NIR FRIEDMAN, MICHAL LINIAL, 2 IFTACH NACHMAN, 3 and DANA PE

### Discrete Mathematics for CS Fall 2006 Papadimitriou & Vazirani Lecture 22

CS 70 Discrete Mathematics for CS Fall 2006 Papadimitriou & Vazirani Lecture 22 Introduction to Discrete Probability Probability theory has its origins in gambling analyzing card games, dice, roulette

### Mathematics for Computer Science/Software Engineering. Notes for the course MSM1F3 Dr. R. A. Wilson

Mathematics for Computer Science/Software Engineering Notes for the course MSM1F3 Dr. R. A. Wilson October 1996 Chapter 1 Logic Lecture no. 1. We introduce the concept of a proposition, which is a statement

### Supervised Learning (Big Data Analytics)

Supervised Learning (Big Data Analytics) Vibhav Gogate Department of Computer Science The University of Texas at Dallas Practical advice Goal of Big Data Analytics Uncover patterns in Data. Can be used

### The Goldberg Rao Algorithm for the Maximum Flow Problem

The Goldberg Rao Algorithm for the Maximum Flow Problem COS 528 class notes October 18, 2006 Scribe: Dávid Papp Main idea: use of the blocking flow paradigm to achieve essentially O(min{m 2/3, n 1/2 }

### CS 598CSC: Combinatorial Optimization Lecture date: 2/4/2010

CS 598CSC: Combinatorial Optimization Lecture date: /4/010 Instructor: Chandra Chekuri Scribe: David Morrison Gomory-Hu Trees (The work in this section closely follows [3]) Let G = (V, E) be an undirected

### Lecture 17 : Equivalence and Order Relations DRAFT

CS/Math 240: Introduction to Discrete Mathematics 3/31/2011 Lecture 17 : Equivalence and Order Relations Instructor: Dieter van Melkebeek Scribe: Dalibor Zelený DRAFT Last lecture we introduced the notion

### 3. Equivalence Relations. Discussion

3. EQUIVALENCE RELATIONS 33 3. Equivalence Relations 3.1. Definition of an Equivalence Relations. Definition 3.1.1. A relation R on a set A is an equivalence relation if and only if R is reflexive, symmetric,

### Class notes Program Analysis course given by Prof. Mooly Sagiv Computer Science Department, Tel Aviv University second lecture 8/3/2007

Constant Propagation Class notes Program Analysis course given by Prof. Mooly Sagiv Computer Science Department, Tel Aviv University second lecture 8/3/2007 Osnat Minz and Mati Shomrat Introduction This

### Language Modeling. Chapter 1. 1.1 Introduction

Chapter 1 Language Modeling (Course notes for NLP by Michael Collins, Columbia University) 1.1 Introduction In this chapter we will consider the the problem of constructing a language model from a set

### Lemma 5.2. Let S be a set. (1) Let f and g be two permutations of S. Then the composition of f and g is a permutation of S.

Definition 51 Let S be a set bijection f : S S 5 Permutation groups A permutation of S is simply a Lemma 52 Let S be a set (1) Let f and g be two permutations of S Then the composition of f and g is a

### Social Media Mining. Graph Essentials

Graph Essentials Graph Basics Measures Graph and Essentials Metrics 2 2 Nodes and Edges A network is a graph nodes, actors, or vertices (plural of vertex) Connections, edges or ties Edge Node Measures

### Notes: Chapter 2 Section 2.2: Proof by Induction

Notes: Chapter 2 Section 2.2: Proof by Induction Basic Induction. To prove: n, a W, n a, S n. (1) Prove the base case - S a. (2) Let k a and prove that S k S k+1 Example 1. n N, n i = n(n+1) 2. Example

### A binary heap is a complete binary tree, where each node has a higher priority than its children. This is called heap-order property

CmSc 250 Intro to Algorithms Chapter 6. Transform and Conquer Binary Heaps 1. Definition A binary heap is a complete binary tree, where each node has a higher priority than its children. This is called

### Lecture 1: Course overview, circuits, and formulas

Lecture 1: Course overview, circuits, and formulas Topics in Complexity Theory and Pseudorandomness (Spring 2013) Rutgers University Swastik Kopparty Scribes: John Kim, Ben Lund 1 Course Information Swastik

### Data Structure [Question Bank]

Unit I (Analysis of Algorithms) 1. What are algorithms and how they are useful? 2. Describe the factor on best algorithms depends on? 3. Differentiate: Correct & Incorrect Algorithms? 4. Write short note:

### Chapter ML:IV. IV. Statistical Learning. Probability Basics Bayes Classification Maximum a-posteriori Hypotheses

Chapter ML:IV IV. Statistical Learning Probability Basics Bayes Classification Maximum a-posteriori Hypotheses ML:IV-1 Statistical Learning STEIN 2005-2015 Area Overview Mathematics Statistics...... Stochastics

### Lecture 11: Graphical Models for Inference

Lecture 11: Graphical Models for Inference So far we have seen two graphical models that are used for inference - the Bayesian network and the Join tree. These two both represent the same joint probability

### Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh

Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh Peter Richtárik Week 3 Randomized Coordinate Descent With Arbitrary Sampling January 27, 2016 1 / 30 The Problem

### Introduction to Scheduling Theory

Introduction to Scheduling Theory Arnaud Legrand Laboratoire Informatique et Distribution IMAG CNRS, France arnaud.legrand@imag.fr November 8, 2004 1/ 26 Outline 1 Task graphs from outer space 2 Scheduling

### Analysis of Algorithms, I

Analysis of Algorithms, I CSOR W4231.002 Eleni Drinea Computer Science Department Columbia University Thursday, February 26, 2015 Outline 1 Recap 2 Representing graphs 3 Breadth-first search (BFS) 4 Applications

### An Introduction to the Use of Bayesian Network to Analyze Gene Expression Data

n Introduction to the Use of ayesian Network to nalyze Gene Expression Data Cristina Manfredotti Dipartimento di Informatica, Sistemistica e Comunicazione (D.I.S.Co. Università degli Studi Milano-icocca

### 1 if 1 x 0 1 if 0 x 1

Chapter 3 Continuity In this chapter we begin by defining the fundamental notion of continuity for real valued functions of a single real variable. When trying to decide whether a given function is or

### Evaluation of Machine Learning Techniques for Green Energy Prediction

arxiv:1406.3726v1 [cs.lg] 14 Jun 2014 Evaluation of Machine Learning Techniques for Green Energy Prediction 1 Objective Ankur Sahai University of Mainz, Germany We evaluate Machine Learning techniques

### p if x = 1 1 p if x = 0

Probability distributions Bernoulli distribution Two possible values (outcomes): 1 (success), 0 (failure). Parameters: p probability of success. Probability mass function: P (x; p) = { p if x = 1 1 p if

### Distributed Computing over Communication Networks: Maximal Independent Set

Distributed Computing over Communication Networks: Maximal Independent Set What is a MIS? MIS An independent set (IS) of an undirected graph is a subset U of nodes such that no two nodes in U are adjacent.

### Relational Dynamic Bayesian Networks: a report. Cristina Manfredotti

Relational Dynamic Bayesian Networks: a report Cristina Manfredotti Dipartimento di Informatica, Sistemistica e Comunicazione (D.I.S.Co.) Università degli Studi Milano-Bicocca manfredotti@disco.unimib.it

### Question 2 Naïve Bayes (16 points)

Question 2 Naïve Bayes (16 points) About 2/3 of your email is spam so you downloaded an open source spam filter based on word occurrences that uses the Naive Bayes classifier. Assume you collected the

### Context-Specific Independence in Bayesian Networks. P(z [ x) for all values x, y and z of variables X, Y and

Context-Specific Independence in Bayesian Networks 115 Context-Specific Independence in Bayesian Networks Craig Boutilier Dept. of Computer Science University of British Columbia Vancouver, BC V6T 1Z4

### A Non-Linear Schema Theorem for Genetic Algorithms

A Non-Linear Schema Theorem for Genetic Algorithms William A Greene Computer Science Department University of New Orleans New Orleans, LA 70148 bill@csunoedu 504-280-6755 Abstract We generalize Holland

### MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES Contents 1. Random variables and measurable functions 2. Cumulative distribution functions 3. Discrete

### Decision Trees and Networks

Lecture 21: Uncertainty 6 Today s Lecture Victor R. Lesser CMPSCI 683 Fall 2010 Decision Trees and Networks Decision Trees A decision tree is an explicit representation of all the possible scenarios from

### Data Structures and Algorithms Written Examination

Data Structures and Algorithms Written Examination 22 February 2013 FIRST NAME STUDENT NUMBER LAST NAME SIGNATURE Instructions for students: Write First Name, Last Name, Student Number and Signature where

### FIND ALL LONGER AND SHORTER BOUNDARY DURATION VECTORS UNDER PROJECT TIME AND BUDGET CONSTRAINTS

Journal of the Operations Research Society of Japan Vol. 45, No. 3, September 2002 2002 The Operations Research Society of Japan FIND ALL LONGER AND SHORTER BOUNDARY DURATION VECTORS UNDER PROJECT TIME

### An Algorithm to Handle Structural Uncertainties in Learning Bayesian Network

An Algorithm to Handle Structural Uncertainties in Learning Bayesian Network Cleuber Moreira Fernandes Politec Informática Ltda SIG quadra 4 lote 173, Zip Code 70.610-440, Brasília, DF, Brazil cleuber@compeduc.net

### Project and Production Management Prof. Arun Kanda Department of Mechanical Engineering Indian Institute of Technology, Delhi

Project and Production Management Prof. Arun Kanda Department of Mechanical Engineering Indian Institute of Technology, Delhi Lecture - 15 Limited Resource Allocation Today we are going to be talking about

### Technical Report No. 5 April 18, Bayesian Networks. Michal Horný

Technical Report No. 5 April 18, 2014 Bayesian Networks Michal Horný mhorny@bu.edu This paper was published in fulfillment of the requirements for PM931: Directed Study in Health Policy and Management

### Shortest Inspection-Path. Queries in Simple Polygons

Shortest Inspection-Path Queries in Simple Polygons Christian Knauer, Günter Rote B 05-05 April 2005 Shortest Inspection-Path Queries in Simple Polygons Christian Knauer, Günter Rote Institut für Informatik,

### POWER SETS AND RELATIONS

POWER SETS AND RELATIONS L. MARIZZA A. BAILEY 1. The Power Set Now that we have defined sets as best we can, we can consider a sets of sets. If we were to assume nothing, except the existence of the empty

### CHAPTER 2 Estimating Probabilities

CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a

### Elementary Number Theory We begin with a bit of elementary number theory, which is concerned

CONSTRUCTION OF THE FINITE FIELDS Z p S. R. DOTY Elementary Number Theory We begin with a bit of elementary number theory, which is concerned solely with questions about the set of integers Z = {0, ±1,

### INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS

INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS STEVEN P. LALLEY AND ANDREW NOBEL Abstract. It is shown that there are no consistent decision rules for the hypothesis testing problem

### Lecture 4: BK inequality 27th August and 6th September, 2007

CSL866: Percolation and Random Graphs IIT Delhi Amitabha Bagchi Scribe: Arindam Pal Lecture 4: BK inequality 27th August and 6th September, 2007 4. Preliminaries The FKG inequality allows us to lower bound

### Automata-based Verification - I

CS3172: Advanced Algorithms Automata-based Verification - I Howard Barringer Room KB2.20: email: howard.barringer@manchester.ac.uk March 2006 Supporting and Background Material Copies of key slides (already

### 2.3. Finding polynomial functions. An Introduction:

2.3. Finding polynomial functions. An Introduction: As is usually the case when learning a new concept in mathematics, the new concept is the reverse of the previous one. Remember how you first learned

### Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091)

Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091) Magnus Wiktorsson Centre for Mathematical Sciences Lund University, Sweden Lecture 6 Sequential Monte Carlo methods II February

### Generating models of a matched formula with a polynomial delay

Generating models of a matched formula with a polynomial delay Petr Savicky Institute of Computer Science, Academy of Sciences of Czech Republic, Pod Vodárenskou Věží 2, 182 07 Praha 8, Czech Republic

### UNIVERSITY OF LYON DOCTORAL SCHOOL OF COMPUTER SCIENCES AND MATHEMATICS P H D T H E S I S. Specialty : Computer Science. Author

UNIVERSITY OF LYON DOCTORAL SCHOOL OF COMPUTER SCIENCES AND MATHEMATICS P H D T H E S I S Specialty : Computer Science Author Sérgio Rodrigues de Morais on November 16, 29 Bayesian Network Structure Learning

### 8.1 Min Degree Spanning Tree

CS880: Approximations Algorithms Scribe: Siddharth Barman Lecturer: Shuchi Chawla Topic: Min Degree Spanning Tree Date: 02/15/07 In this lecture we give a local search based algorithm for the Min Degree

### Homework 5 Solutions

Homework 5 Solutions 4.2: 2: a. 321 = 256 + 64 + 1 = (01000001) 2 b. 1023 = 512 + 256 + 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = (1111111111) 2. Note that this is 1 less than the next power of 2, 1024, which

### princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 6: Provable Approximation via Linear Programming Lecturer: Sanjeev Arora

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 6: Provable Approximation via Linear Programming Lecturer: Sanjeev Arora Scribe: One of the running themes in this course is the notion of

### Bounded Treewidth in Knowledge Representation and Reasoning 1

Bounded Treewidth in Knowledge Representation and Reasoning 1 Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien Luminy, October 2010 1 Joint work with G.

### Introduction to Markov Chain Monte Carlo

Introduction to Markov Chain Monte Carlo Monte Carlo: sample from a distribution to estimate the distribution to compute max, mean Markov Chain Monte Carlo: sampling using local information Generic problem

10-601 Machine Learning http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html Course data All up-to-date info is on the course web page: http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html

### ! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. !-approximation algorithm.

Approximation Algorithms Chapter Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of

### The Certainty-Factor Model

The Certainty-Factor Model David Heckerman Departments of Computer Science and Pathology University of Southern California HMR 204, 2025 Zonal Ave Los Angeles, CA 90033 dheck@sumex-aim.stanford.edu 1 Introduction

### Maximum Hitting Time for Random Walks on Graphs. Graham Brightwell, Cambridge University Peter Winkler*, Emory University

Maximum Hitting Time for Random Walks on Graphs Graham Brightwell, Cambridge University Peter Winkler*, Emory University Abstract. For x and y vertices of a connected graph G, let T G (x, y denote the

### Analysis of Algorithms I: Binary Search Trees

Analysis of Algorithms I: Binary Search Trees Xi Chen Columbia University Hash table: A data structure that maintains a subset of keys from a universe set U = {0, 1,..., p 1} and supports all three dictionary

### Monte Carlo-based statistical methods (MASM11/FMS091)

Monte Carlo-based statistical methods (MASM11/FMS091) Magnus Wiktorsson Centre for Mathematical Sciences Lund University, Sweden Lecture 6 Sequential Monte Carlo methods II February 7, 2014 M. Wiktorsson