# 3. The Junction Tree Algorithms

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 A Short Course on Graphical Models 3. The Junction Tree Algorithms Mark Paskin 1

2 Review: conditional independence Two random variables X and Y are independent (written X Y ) iff p X ( ) = p X Y (, y) for all y If X Y then Y gives us no information about X. X and Y are conditionally independent given Z (written X Y Z) iff p X Z (, z) = p X Y Z (, y, z) for all y and z If X Y Z then Y gives us no new information about X once we know Z. We can obtain compact, factorized representations of densities by using the chain rule in combination with conditional independence assumptions. The Variable Elimination algorithm uses the distributivity of over + to perform inference efficiently in factorized densities. 2

3 Review: graphical models Bayesian network undirected graphical model p A p B A p C A p D B p E C p F BE 1 Z ψ A ψ AB ψ AC ψ BD ψ CE ψ BEF D D B B A F A F C E C E d-separation cond. indep. graph separation cond. indep. Moralization converts a Bayesian network into an undirected graphical model (but it does not preserve all of the conditional independence properties). 3

4 A notation for sets of random variables It is helpful when working with large, complex models to have a good notation for sets of random variables. Let X = (X i : i V ) be a vector random variable with density p. For each A V, let X A = (Xi : i A). For A, B V, let p A = pxa and p A B = pxa X B. Example. If V = {a, b, c} and A = {a, c} then X = X a X b X c and X A = X a X c where X a, X b, and X c are random variables. 4

5 A notation for assignments We also need a notation for dealing flexibly with functions of many arguments. An assignment to A is a set of index-value pairs u = {(i, x i ) : i A}, one per index i A, where x i is in the range of X i. Let X A be the set of assignments to X A (with X = X V ). Building new assignments from given assignments: Given assignments u and v to disjoint subsets A and B, respectively, their union u v is an assignment to A B. If u is an assignment to A then the restriction of u to B V is u B = {(i, xi ) u : i B}, an assignment to A B. If u = {(i, x i ) : i A} is an assignment and f is a function, then f(u) = f(x i : i A) 5

6 Examples of the assignment notation 1. If p is the joint density of X then the marginal density of X A is p A (v) = p(v u), v X A u X A where A = V \A is the complement of A. 2. If p takes the form of a normalized product of potentials, we can write it as p(u) = 1 Z C C ψ C (u C ), u X where C is a set of subsets of V, and each ψ C is a potential function that depends only upon X C. The Markov graph of p has clique set C. 6

7 Review: the inference problem Input: a vector random variable X = (X i : i V ); a joint density for X of the form p(u) = 1 Z ψ C (u C ) C C an evidence assignment w to E; and some query variables X Q. Output: p Q E (,w), the conditional density of X Q given the evidence w. 7

8 Dealing with evidence From the definition of conditional probability, we have: p E E (u,w) = p(u w) p E (w) = 1 Z C C ψ C(u C w C ) p E (w) For fixed evidence w on X E, this is another normalized product of potentials: p E w (u) = 1 Z ψ C (u C ) C C where Z = Z p E (w), C = C\E, and ψ C (u) = ψ C (u w C ). Thus, to deal with evidence, we simply instantiate it in all clique potentials. 8

9 The reformulated inference problem Given a joint density for X = (X i : i V ) of the form p(u) = 1 Z compute the marginal density of X Q : p Q (v) = C C u X Q p(v u) = u X Q 1 Z C C ψ C (u C ) ψ C (v C u C ) 9

10 Review: Variable Elimination For each i Q, push in the sum over X i and compute it: p Q (v) = 1 Z = 1 Z = 1 Z = 1 Z u X Q C C u X Q\{i} w X {i} u X Q\{i} u X Q\{i} ψ C (v C u C ) C C i C C C i C C C ψ C (v C u C ) ψ C (v C u C w C ) w X {i} C C i C ψ C (v C u C ) ψ Ei (v Ei u Ei ) This creates a new elimination clique E i = C C i C ψ C (v C u C w) C\{i}. At the end we have p Q = 1 Z ψ Q and we normalize to obtain p Q (and Z). 10

11 From Variable Elimination to the junction tree algorithms Variable Elimination is query sensitive: we must specify the query variables in advance. This means each time we run a different query, we must re-run the entire algorithm. The junction tree algorithms generalize Variable Elimination to avoid this; they compile the density into a data structure that supports the simultaneous execution of a large class of queries. 11

12 Junction trees d {b, d} b {b} a c e f {a, b, c} {b, c} {b, c, e} {b, e} {b, e, f} G T A cluster graph T is a junction tree for G if it has these three properties: 1. singly connected: there is exactly one path between each pair of clusters. 2. covering: for each clique A of G there is some cluster C such that A C. 3. running intersection: for each pair of clusters B and C that contain i, each cluster on the unique path between B and C also contains i. 12

13 Building junction trees To build a junction tree: 1. Choose an ordering of the nodes and use Node Elimination to obtain a set of elimination cliques. 2. Build a complete cluster graph over the maximal elimination cliques. 3. Weight each edge {B, C} by B C and compute a maximum-weight spanning tree. This spanning tree is a junction tree for G (see Cowell et al., 1999). Different junction trees are obtained with different elimination orders and different maximum-weight spanning trees. Finding the junction tree with the smallest clusters is an NP-hard problem. 13

14 An example of building junction trees 1. Compute the elimination cliques (the order here is f, d, e, c, b, a). d d b b b b a f a a a b c e c e c e c a 2. Form the complete cluster graph over the maximal elimination cliques and find a maximum-weight spanning tree. {b, d} 1 1 {b, e, f} {b, d} {b} {b, c, e} 2 {a, b, c} {a, b, c} {b, c} {b, c, e} {b, e} {b, e, f} 14

15 Decomposable densities A factorized density p(u) = 1 Z C C ψ C (u C ) is decomposable if there is a junction tree with cluster set C. To convert a factorized density p to a decomposable density: 1. Build a junction tree T for the Markov graph of p. 2. Create a potential ψ C for each cluster C of T and initialize it to unity. 3. Multiply each potential ψ of p into the cluster potential of one cluster that covers its variables. Note: this is possible only because of the covering property. 15

16 The junction tree inference algorithms The junction tree algorithms take as input a decomposable density and its junction tree. They have the same distributed structure: Each cluster starts out knowing only its local potential and its neighbors. Each cluster sends one message (potential function) to each neighbor. By combining its local potential with the messages it receives, each cluster is able to compute the marginal density of its variables. 16

17 The message passing protocol The junction tree algorithms obey the message passing protocol: Cluster B is allowed to send a message to a neighbor C only after it has received messages from all neighbors except C. One admissible schedule is obtained by choosing one cluster R to be the root, so the junction tree is directed. Execute Collect(R) and then Distribute(R): 1. Collect(C): For each child B of C, recursively call Collect(B) and then pass a message from B to C. 2. Distribute(C): For each child B of C, pass a message to B and then recursively call Distribute(B). COLLECT messages DISTRIBUTE messages root

18 The Shafer Shenoy Algorithm The message sent from B to C is defined as µ BC (u) = ψ B (u v) v X B\C (A,B) E A C µ AB (u A v A ) Procedurally, cluster B computes the product of its local potential ψ B and the messages from all clusters except C, marginalizes out all variables that are not in C, and then sends the result to C. Note: µ BC is well-defined because the junction tree is singly connected. The cluster belief at C is defined as β C (u) = ψ C (u) (B,C) E µ BC (u B ) This is the product of the cluster s local potential and the messages received from all of its neighbors. We will show that β C p C. 18

19 Correctness: Shafer Shenoy is Variable Elimination in all directions at once The cluster belief β C is computed by alternatingly multiplying cluster potentials together and summing out variables. This computation is of the same basic form as Variable Elimination. To prove that β C p C, we must prove that no sum is pushed in too far. This follows directly from the running intersection property: C i is summed out when computing this message the running intersection property guarantees the clusters containing i constitute a connected subgraph clusters containing i messages with i messages without i 19

20 The hugin Algorithm Give each cluster C and each separator S a potential function over its variables. Initialize: φ C (u) = ψ C (u) φ S (u) = 1 To pass a message from B to C over separator S, update φ S(u) = φ B (u v) v X B\S φ C(u) = φ C (u) φ S (u S) φ S (u S ) After all messages have been passed, φ C p C for all clusters C. 20

21 Correctness: hugin is a time-efficient version of Shafer Shenoy Each time the Shafer Shenoy algorithm sends a message or computes its cluster belief, it multiplies together messages. To avoid performing these multiplications repeatedly, the hugin algorithm caches in φ C the running product of ψ C and the messages received so far. When B sends a message to C, it divides out the message C sent to B from this running product. 21

22 Summary: the junction tree algorithms Compile time: 1. Build the junction tree T: (a) Obtain a set of maximal elimination cliques with Node Elimination. (b) Build a weighted, complete cluster graph over these cliques. (c) Choose T to be a maximum-weight spanning tree. 2. Make the density decomposable with respect to T. Run time: 1. Instantiate evidence in the potentials of the density. 2. Pass messages according to the message passing protocol. 3. Normalize the cluster beliefs/potentials to obtain conditional densities. 22

23 Complexity of junction tree algorithms Junction tree algorithms represent, multiply, and marginalize potentials: tabular Gaussian storing ψ C O(k C ) O( C 2 ) computing ψ B C = ψ B ψ C O(k B C ) O( B C 2 ) computing ψ C\B (u) = v X B ψ C (u v) O(k C ) O( B 3 C 2 ) The number of clusters in a junction tree and therefore the number of messages computed is O( V ). Thus, the time and space complexity is dominated by the size of the largest cluster in the junction tree, or the width of the junction tree: In tabular densities, the complexity is exponential in the width. In Gaussian densities, the complexity is cubic in the width. 23

24 Generalized Distributive Law The general problem solved by the junction tree algorithms is the sum-of-products problem: compute p Q (v) ψ C (v C u C ) u X Q C C The property used by the junction tree algorithms is the distributivity of over +; more generally, we need a commutative semiring: [0, ) (+, 0) (, 1) sum-product [0, ) (max, 0) (, 1) max-product (, ] (min, ) (+, 0) min-sum {T, F} (, F) (, T) Boolean Many other problems are of this form, including maximum a posteriori inference, the Hadamard transform, and matrix chain multiplication. 24

25 Summary The junction tree algorithms generalize Variable Elimination to the efficient, simultaneous execution of a large class of queries. The algorithms take the form of message passing on a graph called a junction tree, whose nodes are clusters, or sets, of variables. Each cluster starts with one potential of the factorized density. By combining this potential with the potentials it receives from its neighbors, it can compute the marginal over its variables. Two junction tree algorithms are the Shafer Shenoy algorithm and the hugin algorithm, which avoids repeated multiplications. The complexity of the algorithms scales with the width of the junction tree. The algorithms can be generalized to solve other problems by using other commutative semirings. 25

### COS513 LECTURE 5 JUNCTION TREE ALGORITHM

COS513 LECTURE 5 JUNCTION TREE ALGORITHM D. EIS AND V. KOSTINA The junction tree algorithm is the culmination of the way graph theory and probability combine to form graphical models. After we discuss

### Probabilistic Graphical Models

Probabilistic Graphical Models Raquel Urtasun and Tamir Hazan TTI Chicago April 4, 2011 Raquel Urtasun and Tamir Hazan (TTI-C) Graphical Models April 4, 2011 1 / 22 Bayesian Networks and independences

### Course: Model, Learning, and Inference: Lecture 5

Course: Model, Learning, and Inference: Lecture 5 Alan Yuille Department of Statistics, UCLA Los Angeles, CA 90095 yuille@stat.ucla.edu Abstract Probability distributions on structured representation.

### 5 Directed acyclic graphs

5 Directed acyclic graphs (5.1) Introduction In many statistical studies we have prior knowledge about a temporal or causal ordering of the variables. In this chapter we will use directed graphs to incorporate

### Minimum Spanning Trees

Minimum Spanning Trees Algorithms and 18.304 Presentation Outline 1 Graph Terminology Minimum Spanning Trees 2 3 Outline Graph Terminology Minimum Spanning Trees 1 Graph Terminology Minimum Spanning Trees

### The Basics of Graphical Models

The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures

### Examples and Proofs of Inference in Junction Trees

Examples and Proofs of Inference in Junction Trees Peter Lucas LIAC, Leiden University February 4, 2016 1 Representation and notation Let P(V) be a joint probability distribution, where V stands for a

### 2. (a) Explain the strassen s matrix multiplication. (b) Write deletion algorithm, of Binary search tree. [8+8]

Code No: R05220502 Set No. 1 1. (a) Describe the performance analysis in detail. (b) Show that f 1 (n)+f 2 (n) = 0(max(g 1 (n), g 2 (n)) where f 1 (n) = 0(g 1 (n)) and f 2 (n) = 0(g 2 (n)). [8+8] 2. (a)

### Lecture Notes on Spanning Trees

Lecture Notes on Spanning Trees 15-122: Principles of Imperative Computation Frank Pfenning Lecture 26 April 26, 2011 1 Introduction In this lecture we introduce graphs. Graphs provide a uniform model

### Part 2: Community Detection

Chapter 8: Graph Data Part 2: Community Detection Based on Leskovec, Rajaraman, Ullman 2014: Mining of Massive Datasets Big Data Management and Analytics Outline Community Detection - Social networks -

### Markov random fields and Gibbs measures

Chapter Markov random fields and Gibbs measures 1. Conditional independence Suppose X i is a random element of (X i, B i ), for i = 1, 2, 3, with all X i defined on the same probability space (.F, P).

### Tutorial on variational approximation methods. Tommi S. Jaakkola MIT AI Lab

Tutorial on variational approximation methods Tommi S. Jaakkola MIT AI Lab tommi@ai.mit.edu Tutorial topics A bit of history Examples of variational methods A brief intro to graphical models Variational

### Factor Graphs and the Sum-Product Algorithm

498 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 2, FEBRUARY 2001 Factor Graphs and the Sum-Product Algorithm Frank R. Kschischang, Senior Member, IEEE, Brendan J. Frey, Member, IEEE, and Hans-Andrea

### Well-Separated Pair Decomposition for the Unit-disk Graph Metric and its Applications

Well-Separated Pair Decomposition for the Unit-disk Graph Metric and its Applications Jie Gao Department of Computer Science Stanford University Joint work with Li Zhang Systems Research Center Hewlett-Packard

### The Joint Probability Distribution (JPD) of a set of n binary variables involve a huge number of parameters

DEFINING PROILISTI MODELS The Joint Probability Distribution (JPD) of a set of n binary variables involve a huge number of parameters 2 n (larger than 10 25 for only 100 variables). x y z p(x, y, z) 0

### p if x = 1 1 p if x = 0

Probability distributions Bernoulli distribution Two possible values (outcomes): 1 (success), 0 (failure). Parameters: p probability of success. Probability mass function: P (x; p) = { p if x = 1 1 p if

### Graphical Models as Block-Tree Graphs

Graphical Models as Block-Tree Graphs 1 Divyanshu Vats and José M. F. Moura arxiv:1007.0563v2 [stat.ml] 13 Nov 2010 Abstract We introduce block-tree graphs as a framework for deriving efficient algorithms

### Logic, Probability and Learning

Logic, Probability and Learning Luc De Raedt luc.deraedt@cs.kuleuven.be Overview Logic Learning Probabilistic Learning Probabilistic Logic Learning Closely following : Russell and Norvig, AI: a modern

### Lecture 7: Approximation via Randomized Rounding

Lecture 7: Approximation via Randomized Rounding Often LPs return a fractional solution where the solution x, which is supposed to be in {0, } n, is in [0, ] n instead. There is a generic way of obtaining

### Lecture 11: Graphical Models for Inference

Lecture 11: Graphical Models for Inference So far we have seen two graphical models that are used for inference - the Bayesian network and the Join tree. These two both represent the same joint probability

### 2.3 Scheduling jobs on identical parallel machines

2.3 Scheduling jobs on identical parallel machines There are jobs to be processed, and there are identical machines (running in parallel) to which each job may be assigned Each job = 1,,, must be processed

### Artificial Intelligence Mar 27, Bayesian Networks 1 P (T D)P (D) + P (T D)P ( D) =

Artificial Intelligence 15-381 Mar 27, 2007 Bayesian Networks 1 Recap of last lecture Probability: precise representation of uncertainty Probability theory: optimal updating of knowledge based on new information

### Bayesian Networks Chapter 14. Mausam (Slides by UW-AI faculty & David Page)

Bayesian Networks Chapter 14 Mausam (Slides by UW-AI faculty & David Page) Bayes Nets In general, joint distribution P over set of variables (X 1 x... x X n ) requires exponential space for representation

### 5.1 Bipartite Matching

CS787: Advanced Algorithms Lecture 5: Applications of Network Flow In the last lecture, we looked at the problem of finding the maximum flow in a graph, and how it can be efficiently solved using the Ford-Fulkerson

### 3. Equivalence Relations. Discussion

3. EQUIVALENCE RELATIONS 33 3. Equivalence Relations 3.1. Definition of an Equivalence Relations. Definition 3.1.1. A relation R on a set A is an equivalence relation if and only if R is reflexive, symmetric,

### GRAPH THEORY and APPLICATIONS. Trees

GRAPH THEORY and APPLICATIONS Trees Properties Tree: a connected graph with no cycle (acyclic) Forest: a graph with no cycle Paths are trees. Star: A tree consisting of one vertex adjacent to all the others.

### Graph Theory Notes. Vadim Lozin. Institute of Mathematics University of Warwick

Graph Theory Notes Vadim Lozin Institute of Mathematics University of Warwick 1 Introduction A graph G = (V, E) consists of two sets V and E. The elements of V are called the vertices and the elements

### Why graph clustering is useful?

Graph Clustering Why graph clustering is useful? Distance matrices are graphs as useful as any other clustering Identification of communities in social networks Webpage clustering for better data management

### Probability, Conditional Independence

Probability, Conditional Independence June 19, 2012 Probability, Conditional Independence Probability Sample space Ω of events Each event ω Ω has an associated measure Probability of the event P(ω) Axioms

### Solutions to Exercises 8

Discrete Mathematics Lent 2009 MA210 Solutions to Exercises 8 (1) Suppose that G is a graph in which every vertex has degree at least k, where k 1, and in which every cycle contains at least 4 vertices.

### R-trees. R-Trees: A Dynamic Index Structure For Spatial Searching. R-Tree. Invariants

R-Trees: A Dynamic Index Structure For Spatial Searching A. Guttman R-trees Generalization of B+-trees to higher dimensions Disk-based index structure Occupancy guarantee Multiple search paths Insertions

### 10. Graph Matrices Incidence Matrix

10 Graph Matrices Since a graph is completely determined by specifying either its adjacency structure or its incidence structure, these specifications provide far more efficient ways of representing a

### Trees and Fundamental Circuits

Trees and Fundamental Circuits Tree A connected graph without any circuits. o must have at least one vertex. o definition implies that it must be a simple graph. o only finite trees are being considered

### Distributed Computing over Communication Networks: Maximal Independent Set

Distributed Computing over Communication Networks: Maximal Independent Set What is a MIS? MIS An independent set (IS) of an undirected graph is a subset U of nodes such that no two nodes in U are adjacent.

### OPTIMAL DESIGN OF DISTRIBUTED SENSOR NETWORKS FOR FIELD RECONSTRUCTION

OPTIMAL DESIGN OF DISTRIBUTED SENSOR NETWORKS FOR FIELD RECONSTRUCTION Sérgio Pequito, Stephen Kruzick, Soummya Kar, José M. F. Moura, A. Pedro Aguiar Department of Electrical and Computer Engineering

### Boolean Algebra Part 1

Boolean Algebra Part 1 Page 1 Boolean Algebra Objectives Understand Basic Boolean Algebra Relate Boolean Algebra to Logic Networks Prove Laws using Truth Tables Understand and Use First Basic Theorems

### Jade Yu Cheng ICS 311 Homework 10 Oct 7, 2008

Jade Yu Cheng ICS 311 Homework 10 Oct 7, 2008 Question for lecture 13 Problem 23-3 on p. 577 Bottleneck spanning tree A bottleneck spanning tree of an undirected graph is a spanning tree of whose largest

### Small Maximal Independent Sets and Faster Exact Graph Coloring

Small Maximal Independent Sets and Faster Exact Graph Coloring David Eppstein Univ. of California, Irvine Dept. of Information and Computer Science The Exact Graph Coloring Problem: Given an undirected

### Math 443/543 Graph Theory Notes 4: Connector Problems

Math 443/543 Graph Theory Notes 4: Connector Problems David Glickenstein September 19, 2012 1 Trees and the Minimal Connector Problem Here is the problem: Suppose we have a collection of cities which we

### Mining Social Network Graphs

Mining Social Network Graphs Debapriyo Majumdar Data Mining Fall 2014 Indian Statistical Institute Kolkata November 13, 17, 2014 Social Network No introduc+on required Really? We s7ll need to understand

### On the k-path cover problem for cacti

On the k-path cover problem for cacti Zemin Jin and Xueliang Li Center for Combinatorics and LPMC Nankai University Tianjin 300071, P.R. China zeminjin@eyou.com, x.li@eyou.com Abstract In this paper we

### SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis

SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October 17, 2015 Outline

### Lecture 1: Course overview, circuits, and formulas

Lecture 1: Course overview, circuits, and formulas Topics in Complexity Theory and Pseudorandomness (Spring 2013) Rutgers University Swastik Kopparty Scribes: John Kim, Ben Lund 1 Course Information Swastik

### Structured Learning and Prediction in Computer Vision. Contents

Foundations and Trends R in Computer Graphics and Vision Vol. 6, Nos. 3 4 (2010) 185 365 c 2011 S. Nowozin and C. H. Lampert DOI: 10.1561/0600000033 Structured Learning and Prediction in Computer Vision

### Notes on Matrix Multiplication and the Transitive Closure

ICS 6D Due: Wednesday, February 25, 2015 Instructor: Sandy Irani Notes on Matrix Multiplication and the Transitive Closure An n m matrix over a set S is an array of elements from S with n rows and m columns.

### Multi-Class and Structured Classification

Multi-Class and Structured Classification [slides prises du cours cs294-10 UC Berkeley (2006 / 2009)] [ p y( )] http://www.cs.berkeley.edu/~jordan/courses/294-fall09 Basic Classification in ML Input Output

### COT5405 Analysis of Algorithms Homework 3 Solutions

COT0 Analysis of Algorithms Homework 3 Solutions. Prove or give a counter example: (a) In the textbook, we have two routines for graph traversal - DFS(G) and BFS(G,s) - where G is a graph and s is any

### A Note on Maximum Independent Sets in Rectangle Intersection Graphs

A Note on Maximum Independent Sets in Rectangle Intersection Graphs Timothy M. Chan School of Computer Science University of Waterloo Waterloo, Ontario N2L 3G1, Canada tmchan@uwaterloo.ca September 12,

### Outline. NP-completeness. When is a problem easy? When is a problem hard? Today. Euler Circuits

Outline NP-completeness Examples of Easy vs. Hard problems Euler circuit vs. Hamiltonian circuit Shortest Path vs. Longest Path 2-pairs sum vs. general Subset Sum Reducing one problem to another Clique

### 1 = (a 0 + b 0 α) 2 + + (a m 1 + b m 1 α) 2. for certain elements a 0,..., a m 1, b 0,..., b m 1 of F. Multiplying out, we obtain

Notes on real-closed fields These notes develop the algebraic background needed to understand the model theory of real-closed fields. To understand these notes, a standard graduate course in algebra is

### CMSC 451: Graph Properties, DFS, BFS, etc.

CMSC 451: Graph Properties, DFS, BFS, etc. Slides By: Carl Kingsford Department of Computer Science University of Maryland, College Park Based on Chapter 3 of Algorithm Design by Kleinberg & Tardos. Graphs

### SEQUENTIAL VALUATION NETWORKS: A NEW GRAPHICAL TECHNIQUE FOR ASYMMETRIC DECISION PROBLEMS

SEQUENTIAL VALUATION NETWORKS: A NEW GRAPHICAL TECHNIQUE FOR ASYMMETRIC DECISION PROBLEMS Riza Demirer and Prakash P. Shenoy University of Kansas, School of Business 1300 Sunnyside Ave, Summerfield Hall

### Data Mining. Cluster Analysis: Advanced Concepts and Algorithms

Data Mining Cluster Analysis: Advanced Concepts and Algorithms Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1 More Clustering Methods Prototype-based clustering Density-based clustering Graph-based

### Planar Graph and Trees

Dr. Nahid Sultana December 16, 2012 Tree Spanning Trees Minimum Spanning Trees Maps and Regions Eulers Formula Nonplanar graph Dual Maps and the Four Color Theorem Tree Spanning Trees Minimum Spanning

### Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs

CSE599s: Extremal Combinatorics November 21, 2011 Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs Lecturer: Anup Rao 1 An Arithmetic Circuit Lower Bound An arithmetic circuit is just like

### CS 188: Artificial Intelligence. Probability recap

CS 188: Artificial Intelligence Bayes Nets Representation and Independence Pieter Abbeel UC Berkeley Many slides over this course adapted from Dan Klein, Stuart Russell, Andrew Moore Conditional probability

### Boolean Representations and Combinatorial Equivalence

Chapter 2 Boolean Representations and Combinatorial Equivalence This chapter introduces different representations of Boolean functions. It then discuss the applications of these representations for proving

### Graph Mining and Social Network Analysis

Graph Mining and Social Network Analysis Data Mining and Text Mining (UIC 583 @ Politecnico di Milano) References Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", The Morgan Kaufmann

### Homework MA 725 Spring, 2012 C. Huneke SELECTED ANSWERS

Homework MA 725 Spring, 2012 C. Huneke SELECTED ANSWERS 1.1.25 Prove that the Petersen graph has no cycle of length 7. Solution: There are 10 vertices in the Petersen graph G. Assume there is a cycle C

### Probabilistic Graphical Models Homework 1: Due January 29, 2014 at 4 pm

Probabilistic Graphical Models 10-708 Homework 1: Due January 29, 2014 at 4 pm Directions. This homework assignment covers the material presented in Lectures 1-3. You must complete all four problems to

### Uniform Multicommodity Flow through the Complete Graph with Random Edge-capacities

Combinatorics, Probability and Computing (2005) 00, 000 000. c 2005 Cambridge University Press DOI: 10.1017/S0000000000000000 Printed in the United Kingdom Uniform Multicommodity Flow through the Complete

### CS 598CSC: Combinatorial Optimization Lecture date: 2/4/2010

CS 598CSC: Combinatorial Optimization Lecture date: /4/010 Instructor: Chandra Chekuri Scribe: David Morrison Gomory-Hu Trees (The work in this section closely follows [3]) Let G = (V, E) be an undirected

### INFORMATION from a single node (entity) can reach other

1 Network Infusion to Infer Information Sources in Networks Soheil Feizi, Muriel Médard, Gerald Quon, Manolis Kellis, and Ken Duffy arxiv:166.7383v1 [cs.si] 23 Jun 216 Abstract Several significant models

### Exponential time algorithms for graph coloring

Exponential time algorithms for graph coloring Uriel Feige Lecture notes, March 14, 2011 1 Introduction Let [n] denote the set {1,..., k}. A k-labeling of vertices of a graph G(V, E) is a function V [k].

### Basics of Statistical Machine Learning

CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar

### Data Structures in Java. Session 16 Instructor: Bert Huang

Data Structures in Java Session 16 Instructor: Bert Huang http://www.cs.columbia.edu/~bert/courses/3134 Announcements Homework 4 due next class Remaining grades: hw4, hw5, hw6 25% Final exam 30% Midterm

### Graph definition Degree, in, out degree, oriented graph. Complete, regular, bipartite graph. Graph representation, connectivity, adjacency.

Mária Markošová Graph definition Degree, in, out degree, oriented graph. Complete, regular, bipartite graph. Graph representation, connectivity, adjacency. Isomorphism of graphs. Paths, cycles, trials.

### Problem Set 7 Solutions

8 8 Introduction to Algorithms May 7, 2004 Massachusetts Institute of Technology 6.046J/18.410J Professors Erik Demaine and Shafi Goldwasser Handout 25 Problem Set 7 Solutions This problem set is due in

### Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max

### Lecture notes from Foundations of Markov chain Monte Carlo methods University of Chicago, Spring 2002 Lecture 1, March 29, 2002

Lecture notes from Foundations of Markov chain Monte Carlo methods University of Chicago, Spring 2002 Lecture 1, March 29, 2002 Eric Vigoda Scribe: Varsha Dani & Tom Hayes 1.1 Introduction The aim of this

### Lecture 3: Linear Programming Relaxations and Rounding

Lecture 3: Linear Programming Relaxations and Rounding 1 Approximation Algorithms and Linear Relaxations For the time being, suppose we have a minimization problem. Many times, the problem at hand can

### Graphical Models, Exponential Families, and Variational Inference

Foundations and Trends R in Machine Learning Vol. 1, Nos. 1 2 (2008) 1 305 c 2008 M. J. Wainwright and M. I. Jordan DOI: 10.1561/2200000001 Graphical Models, Exponential Families, and Variational Inference

### Linear Programming. Widget Factory Example. Linear Programming: Standard Form. Widget Factory Example: Continued.

Linear Programming Widget Factory Example Learning Goals. Introduce Linear Programming Problems. Widget Example, Graphical Solution. Basic Theory:, Vertices, Existence of Solutions. Equivalent formulations.

### / Approximation Algorithms Lecturer: Michael Dinitz Topic: Steiner Tree and TSP Date: 01/29/15 Scribe: Katie Henry

600.469 / 600.669 Approximation Algorithms Lecturer: Michael Dinitz Topic: Steiner Tree and TSP Date: 01/29/15 Scribe: Katie Henry 2.1 Steiner Tree Definition 2.1.1 In the Steiner Tree problem the input

### A linear combination is a sum of scalars times quantities. Such expressions arise quite frequently and have the form

Section 1.3 Matrix Products A linear combination is a sum of scalars times quantities. Such expressions arise quite frequently and have the form (scalar #1)(quantity #1) + (scalar #2)(quantity #2) +...

### Introduction. The Quine-McCluskey Method Handout 5 January 21, 2016. CSEE E6861y Prof. Steven Nowick

CSEE E6861y Prof. Steven Nowick The Quine-McCluskey Method Handout 5 January 21, 2016 Introduction The Quine-McCluskey method is an exact algorithm which finds a minimum-cost sum-of-products implementation

### 6.852: Distributed Algorithms Fall, 2009. Class 2

.8: Distributed Algorithms Fall, 009 Class Today s plan Leader election in a synchronous ring: Lower bound for comparison-based algorithms. Basic computation in general synchronous networks: Leader election

### Statistical machine learning, high dimension and big data

Statistical machine learning, high dimension and big data S. Gaïffas 1 14 mars 2014 1 CMAP - Ecole Polytechnique Agenda for today Divide and Conquer principle for collaborative filtering Graphical modelling,

### Elementary Number Theory We begin with a bit of elementary number theory, which is concerned

CONSTRUCTION OF THE FINITE FIELDS Z p S. R. DOTY Elementary Number Theory We begin with a bit of elementary number theory, which is concerned solely with questions about the set of integers Z = {0, ±1,

### Scheduling. Open shop, job shop, flow shop scheduling. Related problems. Open shop, job shop, flow shop scheduling

Scheduling Basic scheduling problems: open shop, job shop, flow job The disjunctive graph representation Algorithms for solving the job shop problem Computational complexity of the job shop problem Open

### PES Institute of Technology-BSC QUESTION BANK

PES Institute of Technology-BSC Faculty: Mrs. R.Bharathi CS35: Data Structures Using C QUESTION BANK UNIT I -BASIC CONCEPTS 1. What is an ADT? Briefly explain the categories that classify the functions

### Divide-and-Conquer. Three main steps : 1. divide; 2. conquer; 3. merge.

Divide-and-Conquer Three main steps : 1. divide; 2. conquer; 3. merge. 1 Let I denote the (sub)problem instance and S be its solution. The divide-and-conquer strategy can be described as follows. Procedure

### DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

### Approximation Algorithms

Approximation Algorithms or: How I Learned to Stop Worrying and Deal with NP-Completeness Ong Jit Sheng, Jonathan (A0073924B) March, 2012 Overview Key Results (I) General techniques: Greedy algorithms

### COMBINATORIAL PROPERTIES OF THE HIGMAN-SIMS GRAPH. 1. Introduction

COMBINATORIAL PROPERTIES OF THE HIGMAN-SIMS GRAPH ZACHARY ABEL 1. Introduction In this survey we discuss properties of the Higman-Sims graph, which has 100 vertices, 1100 edges, and is 22 regular. In fact

### A crash course in probability and Naïve Bayes classification

Probability theory A crash course in probability and Naïve Bayes classification Chapter 9 Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s

Cluster Analysis: Advanced Concepts and dalgorithms Dr. Hui Xiong Rutgers University Introduction to Data Mining 08/06/2006 1 Introduction to Data Mining 08/06/2006 1 Outline Prototype-based Fuzzy c-means

### Outline 2.1 Graph Isomorphism 2.2 Automorphisms and Symmetry 2.3 Subgraphs, part 1

GRAPH THEORY LECTURE STRUCTURE AND REPRESENTATION PART A Abstract. Chapter focuses on the question of when two graphs are to be regarded as the same, on symmetries, and on subgraphs.. discusses the concept

### Machine Learning and Data Analysis overview. Department of Cybernetics, Czech Technical University in Prague. http://ida.felk.cvut.

Machine Learning and Data Analysis overview Jiří Kléma Department of Cybernetics, Czech Technical University in Prague http://ida.felk.cvut.cz psyllabus Lecture Lecturer Content 1. J. Kléma Introduction,

### 6.045: Automata, Computability, and Complexity Or, Great Ideas in Theoretical Computer Science Spring, 2010. Class 4 Nancy Lynch

6.045: Automata, Computability, and Complexity Or, Great Ideas in Theoretical Computer Science Spring, 2010 Class 4 Nancy Lynch Today Two more models of computation: Nondeterministic Finite Automata (NFAs)

### CS268: Geometric Algorithms Handout #5 Design and Analysis Original Handout #15 Stanford University Tuesday, 25 February 1992

CS268: Geometric Algorithms Handout #5 Design and Analysis Original Handout #15 Stanford University Tuesday, 25 February 1992 Original Lecture #6: 28 January 1991 Topics: Triangulating Simple Polygons

### Linear Programming. March 14, 2014

Linear Programming March 1, 01 Parts of this introduction to linear programming were adapted from Chapter 9 of Introduction to Algorithms, Second Edition, by Cormen, Leiserson, Rivest and Stein [1]. 1

### 13.3 Inference Using Full Joint Distribution

191 The probability distribution on a single variable must sum to 1 It is also true that any joint probability distribution on any set of variables must sum to 1 Recall that any proposition a is equivalent

### Exponential Random Graph Models for Social Network Analysis. Danny Wyatt 590AI March 6, 2009

Exponential Random Graph Models for Social Network Analysis Danny Wyatt 590AI March 6, 2009 Traditional Social Network Analysis Covered by Eytan Traditional SNA uses descriptive statistics Path lengths

### Spatial Statistics Chapter 3 Basics of areal data and areal data modeling

Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Recall areal data also known as lattice data are data Y (s), s D where D is a discrete index set. This usually corresponds to data

### R u t c o r Research R e p o r t. Boolean functions with a simple certificate for CNF complexity. Ondřej Čepeka Petr Kučera b Petr Savický c

R u t c o r Research R e p o r t Boolean functions with a simple certificate for CNF complexity Ondřej Čepeka Petr Kučera b Petr Savický c RRR 2-2010, January 24, 2010 RUTCOR Rutgers Center for Operations

### 1 Solving LPs: The Simplex Algorithm of George Dantzig

Solving LPs: The Simplex Algorithm of George Dantzig. Simplex Pivoting: Dictionary Format We illustrate a general solution procedure, called the simplex algorithm, by implementing it on a very simple example.

### Definition 11.1. Given a graph G on n vertices, we define the following quantities:

Lecture 11 The Lovász ϑ Function 11.1 Perfect graphs We begin with some background on perfect graphs. graphs. First, we define some quantities on Definition 11.1. Given a graph G on n vertices, we define

### Practical Graph Mining with R. 5. Link Analysis

Practical Graph Mining with R 5. Link Analysis Outline Link Analysis Concepts Metrics for Analyzing Networks PageRank HITS Link Prediction 2 Link Analysis Concepts Link A relationship between two entities

### Why? A central concept in Computer Science. Algorithms are ubiquitous.

Analysis of Algorithms: A Brief Introduction Why? A central concept in Computer Science. Algorithms are ubiquitous. Using the Internet (sending email, transferring files, use of search engines, online