# Probabilistic Graphical Models Final Exam

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Introduction to Probabilistic Graphical Models Final Exam, March Probabilistic Graphical Models Final Exam Please fill your name and I.D.: Name:... I.D.:... Duration: 3 hours. Guidelines: 1. The test is composed of five questions. The credit for each question is 25 points. You should accumulate as many points as you can from the possible 125 points. Notice that you will gain credit for each section you solve, regardless of whether you answer the complete question or not. Your grade in the test will be the number of points you obtained (where grades above 100 will be rounded to 100). 2. Concise answers are preferred. But, make sure that your arguments are clear and well supported. 3. Notations and definitions: (a) Val(X) set of possible values of RV (Random Variable) X. (b) Upper case letters denote RVs (e.g., X, Y, Z). (c) Upper case bold letters denote set of RVs (e.g., X, Y). (d) Lower case letters denote RV values (e.g., x, y, z). (e) Lower case bold letters denote RV set values (e.g., x). (f) Values for categorical RVs with V al(x) = k : x 1, x 2,..., x k. (g) Marginal distribution over X: P(X). (h) P a x - Parents of RV X. (i) P = (A B) - P satisfies : A is independent of B (j) P = (A c B) - under context c, P satisfies that A is independent of B (k) A positive distribution: a distribution P where x P (X = x) > 0 (l) Entropy of a discrete variable Z: H(Z) = z p(z) log p(z). (m) Entropy of a discrete variable Z given evidence e : H(Z e) = z:z =e p(z e) log p(z e). (n) The Kullback Leibler distance, KL, between two distribution defined on the RV S: KL(P r(s), P r (S)) = P r(s) s P r(s) log P r (s). (o) The Kullback Leibler distance, KL, between two distribution defined on the RV S, under evidence e: KL(P r(s e), P r (S e)) = P r(s e) s:s =e P r(s e) log P r (s e). Good luck!

2 Introduction to Probabilistic Graphical Models Final Exam, March Questions: 1. Learning Naive Bayes network. In this question we will deal with learning a Naive Bayes classification network. The network, illustrated in Figure 1, is composed of a discrete variable C with domain {c 1 c K }, and N boolean variables: X = X 1 X N. C X 1 X 2 X X 3 X N Figure 1: Naive Bayes network (a) [5 points] Assume you have M I.I.D instances of complete data: D = d 1 d M, where d i = {c[i], x[i]} is the i-th instance of the data in which C = c[i] and X = x[i]. We define also the notation of x j [i] to be the value of X j in data instance i. Write the formulas of the maximum likelihood estimate (MLE) for the network parameters i.e. the parameters Θ that maximize the likelihood of the data L(D : Θ). More specifically write the formulas of Θ Xi C and Θ C. Remark: write the final solution without the entire derivation (e.g., no derivatives and no integrals). (b) [10 points] Now, we would like to compute the Bayesian estimators for the parametrs incorporating a Dirichlet prior over the different variables. Recall that the Dirichlet prior is defined by: P (Θ) = 1 Z k Θ α k 1 k where Z = k i=1 Γ(α i) Γ( k i=1 α i) and Γ(x) = Write the following computations using Dirichlet prior over c and X: i. P (X j [M + 1] = x 1 D, C[M + 1] = c 1 ) ii. P (C[M + 1] = c 1 D) 0 t x 1 e t dt Remark: write the final solution without the entire derivation (e.g., no derivatives and no integrals). (c) [10 points] In real applications the value of the classification variable C is often hidden. In this setting one option for parameter estimation employs the EM algorithm learned in class. i. [2 points] Briefly describe what is the type of computation that is done in the E-step and the M-step of an EM algorithm. ii. [8 points] Write the explicit formulas for the computation of the E-step and the M-step when learning the above network where C is hidden.

3 Introduction to Probabilistic Graphical Models Final Exam, March Context Specific Independence (CSI) (a) [15 points] Consider a BN B with a variable X that has a tree-cpd. Let c be some context - a specific assignment of values for a subset C of X s parents (Pa X ). i.e., c V al(c) for C Pa X. Let Z C be another variable in B s.t. Z Pa X. Propose a (simple) assumption on the distributions at the leaves of X s tree-cpd under which it holds that if P B = (X c Z Pa X Z, c), then T c (the tree-cpd of X restricted to the branches consistent with c) does not test Z. You may assume that all of the variables in B are discrete binary variables. (b) [10 points] Prove the following statement, or disprove it by finding a counterexample: CSI-sep (CSI - context specific independence) statements are monotonic in the context; i.e., let c be an assignment to some set of variables C in a BN, and let C C, c - an assignment to C that is consistent with c, then if X and Y are some variables in the BN that are CSI-separated given c, they are also CSI-separated given c. Reminder - CSI-sep: Let B be a BN, c be a context, and let X, Y be sets of variables. if CSI sep(b, c, X, Y) = true then P B = (X c Y c).

4 Introduction to Probabilistic Graphical Models Final Exam, March I-equivalence In this question you will prove the following theorem about I-equivalent networks: If G and G are two I-equivalent networks, then there is a sequence of networks G = G 1...G k = G that are all I-equivalent, such that the difference between any network G i to Gi + 1 is a single edge reversal. (a) [2 points] i. Define I-equivalence between two networks. ii. State the two structural properties that must hold in any two I-equivalent Bayesian networks. (b) [3 points] Consider the network structures shown in Figure 2. Which of them belong to the same I-equivalence class and which do not? Figure 2: 4 different network structures (c) [5 points] Choose two I-equivalent graphs from Figure 2, and mark them as G and G. i. Show a series of networks G = G 1...G k = G all I-equivalent such that the difference between any G i to Gi + 1 is a single edge reversal. ii. Show a series of networks G = G 1...G k = G such that the difference between any G i to Gi + 1 is a single edge reversal, but the graphs are not all I-equivalent. (d) [5 points] Consider a network G and a single edge reversal in this network that results in a new network G, such that X Y G and X Y G. State a general rule such that if it holds for an edge X Y G then reversing this edge will result in an I-equivalent network G (Hint: Consider the parents of nodes X and Y ). Prove your claim. (e) [5 points] Consider two I-equivalent networks G and G. Let R be the set of all edges X Y G that have a different direction in G (i.e., X Y G and X Y G ). Show that there must exist at least one edge in R that the rule you defined in the previous section holds for. (f) [5 points] Use the above section to show that if G and G are two I-equivalent networks, then you can find a sequence of networks G = G 1...G k = G such that the difference between any G i to Gi + 1 is a single edge reversal.

5 Introduction to Probabilistic Graphical Models Final Exam, March Approximate Inference by Edge Deletion [25 points] In this question we consider the problem of deleting edges from a Bayesian network for the purpose of simplifying models in probabilistic inference. Let B be a Bayesian network over variables χ with node X having parents Y and U. The network B which results from deleting edge Y X from B given evidence e is defined as follows: - B has the same structure as B except that edge Y X is removed. - The CPT (conditional probability table) for variable X in B is given by: θ x u. = y θ x y,u P r(y e). - The CPTs for variables other than X in B are the same as those in N. (a) [8 points] Can the transformation described above be performed in polynomial time in general? be precise. For the following parts we will use the definitions of the entropy and the KL distance from the notations and definitions section. Assume that all distributions are positive and that the RVs which are instantiated in e do not contain X,Y or U. (b) [8 points] Let B and B be two Bayesian networks as given in the definition. Prove that, KL(P r(χ e), P r (χ e)) = log P r (e) P r(e) P r(y, u e) y,u x ( θ x u ) P r(x y, u, e)log θx y, u (c) [9 points] Let B and B be two Bayesian networks as given in the definition. Use the previous result to prove that, KL(P r(χ e), P r (χ e)) log P r (e) P r(e) + H(Y e)

6 Introduction to Probabilistic Graphical Models Final Exam, March Generalized Belief Propagation (GBP) [25 points] In class we saw that we can improve on the basic loopy belief propagation algorithm by propagating messages on a general cluster graph K, resulting in the generalized belief propagation (GBP) algorithm. Recall that such a graph is specified both by the scope of its factors (the variables over which the factor are defined), and by the scope of the edges between these factors. In this question we are interested in applying GBP to directed models. To do so we need to transform a general Bayesian network B into an undirected general cluster graph. Reminder: Recall the following definitions of sepset and running intersection in cluster graph and in general cluster graph. Given two of the graph nodes C i, C j, the main difference between the definitions of sepset in cluster graph and in a general cluster graph is that while in a cluster graph the sepset S i,j is defined by S i,j = C i C j, in a general cluster graph this definition is relaxed to S i,j C i C j. In other words, while in a cluster graph the sepset is associated with the variables in the intersection of the two neighboring nodes, in general cluster graph it is associated with a subset of these variables. running intersection in a cluster graph - if x C i and x C j then x is in each cluster in the (unique) path between C i and C j. running intersection in a general cluster graph - for each X C i and X C j, there is exactly one path between C i and C j for which X S for each subset S along the path. (a) [10 points] Below are two schemes for converting a Bayesian network B to a cluster graph K. For each of these two schemes, either show (by proving the necessary properties) that it produces a valid general cluster graph for a general Bayesian network, or disprove this result by showing a counter-example. Note that exactly one of these schemes produces a valid general cluster graph for GBP inference. Scheme 1: For each node X i in B, define a factor φ i over the family of X i in B (the node X i and its parents). Connect φ i and φ j if X j is a parent of X i in B. The scope of such an edge is the intersection of the clusters. Scheme 2: For each node X i in B, define a factor φ i over the family of X i in B (the node X i and its parents). Connect φ i and φ j if X j is a parent of X i in B. The scope of such an edge is {X j }. In both cases, the initial potential in each factor φ i is simply the CPD corresponding to X i given its parents in B. (b) [10 points] Construct an alternative scheme to the ones proposed in (a) that uses a spanning tree algorithm. (You can assume the existence of a spanning tree algorithm without defining it, but must be precise on how it is used.) Your scheme must transform any Bayesian network into a valid cluster graph. Reminder: A spanning tree algorithm is an algorithm that takes a graph G as an input, and finds a subset of the edges that form a tree which spans all G s nodes. Hint: You can show how you use a spanning tree algorithm to fix the result of the scheme in (a) that outputs a non-valid general cluster graph. (c) [5 points] Show an example where the method you suggested in (b) could give a superior result to that of the one legal method of (a). Hint: You can actually show an example where (b) can give exact marginals, but the one legal method of (a) is only guaranteed to be an approximation.

### The Basics of Graphical Models

The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures

### Probabilistic Graphical Models

Probabilistic Graphical Models Raquel Urtasun and Tamir Hazan TTI Chicago April 4, 2011 Raquel Urtasun and Tamir Hazan (TTI-C) Graphical Models April 4, 2011 1 / 22 Bayesian Networks and independences

### 3. The Junction Tree Algorithms

A Short Course on Graphical Models 3. The Junction Tree Algorithms Mark Paskin mark@paskin.org 1 Review: conditional independence Two random variables X and Y are independent (written X Y ) iff p X ( )

### A crash course in probability and Naïve Bayes classification

Probability theory A crash course in probability and Naïve Bayes classification Chapter 9 Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s

### Logic, Probability and Learning

Logic, Probability and Learning Luc De Raedt luc.deraedt@cs.kuleuven.be Overview Logic Learning Probabilistic Learning Probabilistic Logic Learning Closely following : Russell and Norvig, AI: a modern

### Course: Model, Learning, and Inference: Lecture 5

Course: Model, Learning, and Inference: Lecture 5 Alan Yuille Department of Statistics, UCLA Los Angeles, CA 90095 yuille@stat.ucla.edu Abstract Probability distributions on structured representation.

### Robotics 2 Clustering & EM. Giorgio Grisetti, Cyrill Stachniss, Kai Arras, Maren Bennewitz, Wolfram Burgard

Robotics 2 Clustering & EM Giorgio Grisetti, Cyrill Stachniss, Kai Arras, Maren Bennewitz, Wolfram Burgard 1 Clustering (1) Common technique for statistical data analysis to detect structure (machine learning,

### Examples and Proofs of Inference in Junction Trees

Examples and Proofs of Inference in Junction Trees Peter Lucas LIAC, Leiden University February 4, 2016 1 Representation and notation Let P(V) be a joint probability distribution, where V stands for a

### Incorporating Evidence in Bayesian networks with the Select Operator

Incorporating Evidence in Bayesian networks with the Select Operator C.J. Butz and F. Fang Department of Computer Science, University of Regina Regina, Saskatchewan, Canada SAS 0A2 {butz, fang11fa}@cs.uregina.ca

### Artificial Intelligence Mar 27, Bayesian Networks 1 P (T D)P (D) + P (T D)P ( D) =

Artificial Intelligence 15-381 Mar 27, 2007 Bayesian Networks 1 Recap of last lecture Probability: precise representation of uncertainty Probability theory: optimal updating of knowledge based on new information

### Lecture 2: Introduction to belief (Bayesian) networks

Lecture 2: Introduction to belief (Bayesian) networks Conditional independence What is a belief network? Independence maps (I-maps) January 7, 2008 1 COMP-526 Lecture 2 Recall from last time: Conditional

### CS 188: Artificial Intelligence. Probability recap

CS 188: Artificial Intelligence Bayes Nets Representation and Independence Pieter Abbeel UC Berkeley Many slides over this course adapted from Dan Klein, Stuart Russell, Andrew Moore Conditional probability

### CHAPTER 2 Estimating Probabilities

CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a

### Social Media Mining. Data Mining Essentials

Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

### Message-passing sequential detection of multiple change points in networks

Message-passing sequential detection of multiple change points in networks Long Nguyen, Arash Amini Ram Rajagopal University of Michigan Stanford University ISIT, Boston, July 2012 Nguyen/Amini/Rajagopal

### 1 if 1 x 0 1 if 0 x 1

Chapter 3 Continuity In this chapter we begin by defining the fundamental notion of continuity for real valued functions of a single real variable. When trying to decide whether a given function is or

### Bayesian networks - Time-series models - Apache Spark & Scala

Bayesian networks - Time-series models - Apache Spark & Scala Dr John Sandiford, CTO Bayes Server Data Science London Meetup - November 2014 1 Contents Introduction Bayesian networks Latent variables Anomaly

### Lecture 11: Graphical Models for Inference

Lecture 11: Graphical Models for Inference So far we have seen two graphical models that are used for inference - the Bayesian network and the Join tree. These two both represent the same joint probability

### Question 2 Naïve Bayes (16 points)

Question 2 Naïve Bayes (16 points) About 2/3 of your email is spam so you downloaded an open source spam filter based on word occurrences that uses the Naive Bayes classifier. Assume you collected the

### The Goldberg Rao Algorithm for the Maximum Flow Problem

The Goldberg Rao Algorithm for the Maximum Flow Problem COS 528 class notes October 18, 2006 Scribe: Dávid Papp Main idea: use of the blocking flow paradigm to achieve essentially O(min{m 2/3, n 1/2 }

### Decision Trees and Networks

Lecture 21: Uncertainty 6 Today s Lecture Victor R. Lesser CMPSCI 683 Fall 2010 Decision Trees and Networks Decision Trees A decision tree is an explicit representation of all the possible scenarios from

### princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 6: Provable Approximation via Linear Programming Lecturer: Sanjeev Arora

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 6: Provable Approximation via Linear Programming Lecturer: Sanjeev Arora Scribe: One of the running themes in this course is the notion of

### Customer Classification And Prediction Based On Data Mining Technique

Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor

### An Introduction to Data Mining

An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

### Why the Normal Distribution?

Why the Normal Distribution? Raul Rojas Freie Universität Berlin Februar 2010 Abstract This short note explains in simple terms why the normal distribution is so ubiquitous in pattern recognition applications.

### Lecture 3: Linear Programming Relaxations and Rounding

Lecture 3: Linear Programming Relaxations and Rounding 1 Approximation Algorithms and Linear Relaxations For the time being, suppose we have a minimization problem. Many times, the problem at hand can

### Data Mining Chapter 6: Models and Patterns Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University

Data Mining Chapter 6: Models and Patterns Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Models vs. Patterns Models A model is a high level, global description of a

### Answers to some of the exercises.

Answers to some of the exercises. Chapter 2. Ex.2.1 (a) There are several ways to do this. Here is one possibility. The idea is to apply the k-center algorithm first to D and then for each center in D

### Statistical Machine Learning from Data

Samy Bengio Statistical Machine Learning from Data 1 Statistical Machine Learning from Data Gaussian Mixture Models Samy Bengio IDIAP Research Institute, Martigny, Switzerland, and Ecole Polytechnique

### Stat260: Bayesian Modeling and Inference Lecture Date: February 1, Lecture 3

Stat26: Bayesian Modeling and Inference Lecture Date: February 1, 21 Lecture 3 Lecturer: Michael I. Jordan Scribe: Joshua G. Schraiber 1 Decision theory Recall that decision theory provides a quantification

10-601 Machine Learning http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html Course data All up-to-date info is on the course web page: http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html

### Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 17 Shannon-Fano-Elias Coding and Introduction to Arithmetic Coding

### SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis

SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October 17, 2015 Outline

### Tutorial on variational approximation methods. Tommi S. Jaakkola MIT AI Lab

Tutorial on variational approximation methods Tommi S. Jaakkola MIT AI Lab tommi@ai.mit.edu Tutorial topics A bit of history Examples of variational methods A brief intro to graphical models Variational

### 10-810 /02-710 Computational Genomics. Clustering expression data

10-810 /02-710 Computational Genomics Clustering expression data What is Clustering? Organizing data into clusters such that there is high intra-cluster similarity low inter-cluster similarity Informally,

### Lecture 7: NP-Complete Problems

IAS/PCMI Summer Session 2000 Clay Mathematics Undergraduate Program Basic Course on Computational Complexity Lecture 7: NP-Complete Problems David Mix Barrington and Alexis Maciel July 25, 2000 1. Circuit

### CORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA

We Can Early Learning Curriculum PreK Grades 8 12 INSIDE ALGEBRA, GRADES 8 12 CORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA April 2016 www.voyagersopris.com Mathematical

### Big Data, Machine Learning, Causal Models

Big Data, Machine Learning, Causal Models Sargur N. Srihari University at Buffalo, The State University of New York USA Int. Conf. on Signal and Image Processing, Bangalore January 2014 1 Plan of Discussion

### STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

### Basics of Statistical Machine Learning

CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar

### 2.3 Scheduling jobs on identical parallel machines

2.3 Scheduling jobs on identical parallel machines There are jobs to be processed, and there are identical machines (running in parallel) to which each job may be assigned Each job = 1,,, must be processed

### 2. (a) Explain the strassen s matrix multiplication. (b) Write deletion algorithm, of Binary search tree. [8+8]

Code No: R05220502 Set No. 1 1. (a) Describe the performance analysis in detail. (b) Show that f 1 (n)+f 2 (n) = 0(max(g 1 (n), g 2 (n)) where f 1 (n) = 0(g 1 (n)) and f 2 (n) = 0(g 2 (n)). [8+8] 2. (a)

### Bayesian Networks Chapter 14. Mausam (Slides by UW-AI faculty & David Page)

Bayesian Networks Chapter 14 Mausam (Slides by UW-AI faculty & David Page) Bayes Nets In general, joint distribution P over set of variables (X 1 x... x X n ) requires exponential space for representation

### Why? A central concept in Computer Science. Algorithms are ubiquitous.

Analysis of Algorithms: A Brief Introduction Why? A central concept in Computer Science. Algorithms are ubiquitous. Using the Internet (sending email, transferring files, use of search engines, online

### Discuss the size of the instance for the minimum spanning tree problem.

3.1 Algorithm complexity The algorithms A, B are given. The former has complexity O(n 2 ), the latter O(2 n ), where n is the size of the instance. Let n A 0 be the size of the largest instance that can

### Christfried Webers. Canberra February June 2015

c Statistical Group and College of Engineering and Computer Science Canberra February June (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 829 c Part VIII Linear Classification 2 Logistic

### LECTURE 4. Last time: Lecture outline

LECTURE 4 Last time: Types of convergence Weak Law of Large Numbers Strong Law of Large Numbers Asymptotic Equipartition Property Lecture outline Stochastic processes Markov chains Entropy rate Random

### Recursive Algorithms. Recursion. Motivating Example Factorial Recall the factorial function. { 1 if n = 1 n! = n (n 1)! if n > 1

Recursion Slides by Christopher M Bourke Instructor: Berthe Y Choueiry Fall 007 Computer Science & Engineering 35 Introduction to Discrete Mathematics Sections 71-7 of Rosen cse35@cseunledu Recursive Algorithms

### Graphs without proper subgraphs of minimum degree 3 and short cycles

Graphs without proper subgraphs of minimum degree 3 and short cycles Lothar Narins, Alexey Pokrovskiy, Tibor Szabó Department of Mathematics, Freie Universität, Berlin, Germany. August 22, 2014 Abstract

### Sufficient Statistics and Exponential Family. 1 Statistics and Sufficient Statistics. Math 541: Statistical Theory II. Lecturer: Songfeng Zheng

Math 541: Statistical Theory II Lecturer: Songfeng Zheng Sufficient Statistics and Exponential Family 1 Statistics and Sufficient Statistics Suppose we have a random sample X 1,, X n taken from a distribution

### Data Mining Algorithms Part 1. Dejan Sarka

Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses

### Bayes and Naïve Bayes. cs534-machine Learning

Bayes and aïve Bayes cs534-machine Learning Bayes Classifier Generative model learns Prediction is made by and where This is often referred to as the Bayes Classifier, because of the use of the Bayes rule

### L10: Probability, statistics, and estimation theory

L10: Probability, statistics, and estimation theory Review of probability theory Bayes theorem Statistics and the Normal distribution Least Squares Error estimation Maximum Likelihood estimation Bayesian

### (67902) Topics in Theory and Complexity Nov 2, 2006. Lecture 7

(67902) Topics in Theory and Complexity Nov 2, 2006 Lecturer: Irit Dinur Lecture 7 Scribe: Rani Lekach 1 Lecture overview This Lecture consists of two parts In the first part we will refresh the definition

### Distributed Computing over Communication Networks: Maximal Independent Set

Distributed Computing over Communication Networks: Maximal Independent Set What is a MIS? MIS An independent set (IS) of an undirected graph is a subset U of nodes such that no two nodes in U are adjacent.

### Final Project Report

CPSC545 by Introduction to Data Mining Prof. Martin Schultz & Prof. Mark Gerstein Student Name: Yu Kor Hugo Lam Student ID : 904907866 Due Date : May 7, 2007 Introduction Final Project Report Pseudogenes

### Classification - Examples

Lecture 2 Scheduling 1 Classification - Examples 1 r j C max given: n jobs with processing times p 1,...,p n and release dates r 1,...,r n jobs have to be scheduled without preemption on one machine taking

### Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs

CSE599s: Extremal Combinatorics November 21, 2011 Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs Lecturer: Anup Rao 1 An Arithmetic Circuit Lower Bound An arithmetic circuit is just like

### Structured Learning and Prediction in Computer Vision. Contents

Foundations and Trends R in Computer Graphics and Vision Vol. 6, Nos. 3 4 (2010) 185 365 c 2011 S. Nowozin and C. H. Lampert DOI: 10.1561/0600000033 Structured Learning and Prediction in Computer Vision

### Lecture 1: Systems of Linear Equations

MTH Elementary Matrix Algebra Professor Chao Huang Department of Mathematics and Statistics Wright State University Lecture 1 Systems of Linear Equations ² Systems of two linear equations with two variables

### Linear Threshold Units

Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear

### Reinforcement Learning

Reinforcement Learning LU 2 - Markov Decision Problems and Dynamic Programming Dr. Martin Lauer AG Maschinelles Lernen und Natürlichsprachliche Systeme Albert-Ludwigs-Universität Freiburg martin.lauer@kit.edu

### Lecture 1: Course overview, circuits, and formulas

Lecture 1: Course overview, circuits, and formulas Topics in Complexity Theory and Pseudorandomness (Spring 2013) Rutgers University Swastik Kopparty Scribes: John Kim, Ben Lund 1 Course Information Swastik

### (x + a) n = x n + a Z n [x]. Proof. If n is prime then the map

22. A quick primality test Prime numbers are one of the most basic objects in mathematics and one of the most basic questions is to decide which numbers are prime (a clearly related problem is to find

### Approximation Algorithms

Approximation Algorithms or: How I Learned to Stop Worrying and Deal with NP-Completeness Ong Jit Sheng, Jonathan (A0073924B) March, 2012 Overview Key Results (I) General techniques: Greedy algorithms

### Study Manual. Probabilistic Reasoning. Silja Renooij. August 2015

Study Manual Probabilistic Reasoning 2015 2016 Silja Renooij August 2015 General information This study manual was designed to help guide your self studies. As such, it does not include material that is

### 5 Directed acyclic graphs

5 Directed acyclic graphs (5.1) Introduction In many statistical studies we have prior knowledge about a temporal or causal ordering of the variables. In this chapter we will use directed graphs to incorporate

### Exam Introduction Mathematical Finance and Insurance

Exam Introduction Mathematical Finance and Insurance Date: January 8, 2013. Duration: 3 hours. This is a closed-book exam. The exam does not use scrap cards. Simple calculators are allowed. The questions

### Maximum Likelihood Graph Structure Estimation with Degree Distributions

Maximum Likelihood Graph Structure Estimation with Distributions Bert Huang Computer Science Department Columbia University New York, NY 17 bert@cs.columbia.edu Tony Jebara Computer Science Department

### TU e. Advanced Algorithms: experimentation project. The problem: load balancing with bounded look-ahead. Input: integer m 2: number of machines

The problem: load balancing with bounded look-ahead Input: integer m 2: number of machines integer k 0: the look-ahead numbers t 1,..., t n : the job sizes Problem: assign jobs to machines machine to which

### Learning Organizational Principles in Human Environments

Learning Organizational Principles in Human Environments Outline Motivation: Object Allocation Problem Organizational Principles in Kitchen Environments Datasets Learning Organizational Principles Features

### Analysis of Algorithms I: Optimal Binary Search Trees

Analysis of Algorithms I: Optimal Binary Search Trees Xi Chen Columbia University Given a set of n keys K = {k 1,..., k n } in sorted order: k 1 < k 2 < < k n we wish to build an optimal binary search

### Probability, Conditional Independence

Probability, Conditional Independence June 19, 2012 Probability, Conditional Independence Probability Sample space Ω of events Each event ω Ω has an associated measure Probability of the event P(ω) Axioms

### Large-Sample Learning of Bayesian Networks is NP-Hard

Journal of Machine Learning Research 5 (2004) 1287 1330 Submitted 3/04; Published 10/04 Large-Sample Learning of Bayesian Networks is NP-Hard David Maxwell Chickering David Heckerman Christopher Meek Microsoft

### Bayesian Networks. Mausam (Slides by UW-AI faculty)

Bayesian Networks Mausam (Slides by UW-AI faculty) Bayes Nets In general, joint distribution P over set of variables (X 1 x... x X n ) requires exponential space for representation & inference BNs provide

### INFORMATION from a single node (entity) can reach other

1 Network Infusion to Infer Information Sources in Networks Soheil Feizi, Muriel Médard, Gerald Quon, Manolis Kellis, and Ken Duffy arxiv:166.7383v1 [cs.si] 23 Jun 216 Abstract Several significant models

### ONLINE DEGREE-BOUNDED STEINER NETWORK DESIGN. Sina Dehghani Saeed Seddighin Ali Shafahi Fall 2015

ONLINE DEGREE-BOUNDED STEINER NETWORK DESIGN Sina Dehghani Saeed Seddighin Ali Shafahi Fall 2015 ONLINE STEINER FOREST PROBLEM An initially given graph G. s 1 s 2 A sequence of demands (s i, t i ) arriving

### LABEL PROPAGATION ON GRAPHS. SEMI-SUPERVISED LEARNING. ----Changsheng Liu 10-30-2014

LABEL PROPAGATION ON GRAPHS. SEMI-SUPERVISED LEARNING ----Changsheng Liu 10-30-2014 Agenda Semi Supervised Learning Topics in Semi Supervised Learning Label Propagation Local and global consistency Graph

### Semi-Supervised Support Vector Machines and Application to Spam Filtering

Semi-Supervised Support Vector Machines and Application to Spam Filtering Alexander Zien Empirical Inference Department, Bernhard Schölkopf Max Planck Institute for Biological Cybernetics ECML 2006 Discovery

### Classification - Examples -1- 1 r j C max given: n jobs with processing times p 1,..., p n and release dates

Lecture 2 Scheduling 1 Classification - Examples -1-1 r j C max given: n jobs with processing times p 1,..., p n and release dates r 1,..., r n jobs have to be scheduled without preemption on one machine

### The Classes P and NP. mohamed@elwakil.net

Intractable Problems The Classes P and NP Mohamed M. El Wakil mohamed@elwakil.net 1 Agenda 1. What is a problem? 2. Decidable or not? 3. The P class 4. The NP Class 5. TheNP Complete class 2 What is a

### Bayesian Networks. Read R&N Ch. 14.1-14.2. Next lecture: Read R&N 18.1-18.4

Bayesian Networks Read R&N Ch. 14.1-14.2 Next lecture: Read R&N 18.1-18.4 You will be expected to know Basic concepts and vocabulary of Bayesian networks. Nodes represent random variables. Directed arcs

### Computer Science Department. Technion - IIT, Haifa, Israel. Itai and Rodeh [IR] have proved that for any 2-connected graph G and any vertex s G there

- 1 - THREE TREE-PATHS Avram Zehavi Alon Itai Computer Science Department Technion - IIT, Haifa, Israel Abstract Itai and Rodeh [IR] have proved that for any 2-connected graph G and any vertex s G there

### Introduction to Deep Learning Variational Inference, Mean Field Theory

Introduction to Deep Learning Variational Inference, Mean Field Theory 1 Iasonas Kokkinos Iasonas.kokkinos@ecp.fr Center for Visual Computing Ecole Centrale Paris Galen Group INRIA-Saclay Lecture 3: recap

### Probabilistic Latent Semantic Analysis (plsa)

Probabilistic Latent Semantic Analysis (plsa) SS 2008 Bayesian Networks Multimedia Computing, Universität Augsburg Rainer.Lienhart@informatik.uni-augsburg.de www.multimedia-computing.{de,org} References

### Data Mining - Evaluation of Classifiers

Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010

### Statistical Machine Translation: IBM Models 1 and 2

Statistical Machine Translation: IBM Models 1 and 2 Michael Collins 1 Introduction The next few lectures of the course will be focused on machine translation, and in particular on statistical machine translation

### Querying Joint Probability Distributions

Querying Joint Probability Distributions Sargur Srihari srihari@cedar.buffalo.edu 1 Queries of Interest Probabilistic Graphical Models (BNs and MNs) represent joint probability distributions over multiple

### 6.4 The Infinitely Many Alleles Model

6.4. THE INFINITELY MANY ALLELES MODEL 93 NE µ [F (X N 1 ) F (µ)] = + 1 i

### Scheduling Shop Scheduling. Tim Nieberg

Scheduling Shop Scheduling Tim Nieberg Shop models: General Introduction Remark: Consider non preemptive problems with regular objectives Notation Shop Problems: m machines, n jobs 1,..., n operations

### 6.852: Distributed Algorithms Fall, 2009. Class 2

.8: Distributed Algorithms Fall, 009 Class Today s plan Leader election in a synchronous ring: Lower bound for comparison-based algorithms. Basic computation in general synchronous networks: Leader election

### Learning Instance-Specific Predictive Models

Journal of Machine Learning Research 11 (2010) 3333-3369 Submitted 3/09; Revised 7/10; Published 12/10 Learning Instance-Specific Predictive Models Shyam Visweswaran Gregory F. Cooper Department of Biomedical

### Lecture 16 : Relations and Functions DRAFT

CS/Math 240: Introduction to Discrete Mathematics 3/29/2011 Lecture 16 : Relations and Functions Instructor: Dieter van Melkebeek Scribe: Dalibor Zelený DRAFT In Lecture 3, we described a correspondence

### Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

### Introduction to Mobile Robotics Bayes Filter Particle Filter and Monte Carlo Localization

Introduction to Mobile Robotics Bayes Filter Particle Filter and Monte Carlo Localization Wolfram Burgard, Maren Bennewitz, Diego Tipaldi, Luciano Spinello 1 Motivation Recall: Discrete filter Discretize

### arxiv: v2 [math.co] 30 Nov 2015

PLANAR GRAPH IS ON FIRE PRZEMYSŁAW GORDINOWICZ arxiv:1311.1158v [math.co] 30 Nov 015 Abstract. Let G be any connected graph on n vertices, n. Let k be any positive integer. Suppose that a fire breaks out

### MATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set.

MATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set. Vector space A vector space is a set V equipped with two operations, addition V V (x,y) x + y V and scalar

### Chapter 7: Termination Detection

Chapter 7: Termination Detection Ajay Kshemkalyani and Mukesh Singhal Distributed Computing: Principles, Algorithms, and Systems Cambridge University Press A. Kshemkalyani and M. Singhal (Distributed Computing)