# Machine Learning

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Machine Learning Tom M. Mitchell Machine Learning Department Carnegie Mellon University February 23, 2015 Today: Graphical models Bayes Nets: Representing distributions Conditional independencies Simple inference Simple learning Readings: Bishop chapter 8, through 8.2 Mitchell chapter 6

2 Bayes Nets define Joint Probability Distribution in terms of this graph, plus parameters Benefits of Bayes Nets: Represent the full joint distribution in fewer parameters, using prior knowledge about dependencies Algorithms for inference and learning

3 Bayesian Networks Definition A Bayes network represents the joint probability distribution over a collection of random variables A Bayes network is a directed acyclic graph and a set of conditional probability distributions (CPD s) Each node denotes a random variable Edges denote dependencies For each node X i its CPD defines P(X i Pa(X i )) The joint distribution over all variables is defined to be Pa(X) = immediate parents of X in the graph

4 Bayesian Network Nodes = random variables StormClouds A conditional probability distribution (CPD) is associated with each node N, defining P(N Parents(N)) Lightning Rain Parents P(W Pa) P( W Pa) L, R L, R L, R L, R WindSurf Thunder WindSurf The joint distribution over all variables:

5 Bayesian Networks CPD for each node X i describes P(X i Pa(X i )) Chain rule of probability: But in a Bayes net:

6 StormClouds Inference in Bayes Nets Lightning Rain Parents P(W Pa) P( W Pa) L, R L, R L, R L, R WindSurf Thunder WindSurf P(S=1, L=0, R=1, T=0, W=1) =

7 StormClouds Learning a Bayes Net Lightning Rain Parents P(W Pa) P( W Pa) L, R L, R L, R L, R WindSurf Thunder WindSurf Consider learning when graph structure is given, and data = { <s,l,r,t,w> } What is the MLE solution? MAP?

8 Algorithm for Constructing Bayes Network Choose an ordering over variables, e.g., X 1, X 2,... X n For i=1 to n Add X i to the network Select parents Pa(X i ) as minimal subset of X 1... X i-1 such that Notice this choice of parents assures (by chain rule) (by construction)

9 Example Bird flu and Allegies both cause Nasal problems Nasal problems cause Sneezes and Headaches

10 What is the Bayes Network for X1, X4 with NO assumed conditional independencies?

11 What is the Bayes Network for Naïve Bayes?

12 What do we do if variables are mix of discrete and real valued?

13 Bayes Network for a Hidden Markov Model Implies the future is conditionally independent of the past, given the present Unobserved state: S t-2 S t-1 S t S t+1 S t+2 Observed output: O t-2 O t-1 O t O t+1 O t+2

14 Conditional Independence, Revisited We said: Each node is conditionally independent of its non-descendents, given its immediate parents. Does this rule give us all of the conditional independence relations implied by the Bayes network? No! E.g., X1 and X4 are conditionally indep given {X2, X3} But X1 and X4 not conditionally indep given X3 For this, we need to understand D-separation X1 X4 X2 X3

15 Easy Network 1: Head to Tail A prove A cond indep of B given C? ie., p(a,b c) = p(a c) p(b c) C B let s use p(a,b) as shorthand for p(a=a, B=b)

16 Easy Network 2: Tail to Tail prove A cond indep of B given C? ie., p(a,b c) = p(a c) p(b c) A C B let s use p(a,b) as shorthand for p(a=a, B=b)

17 Easy Network 3: Head to Head prove A cond indep of B given C? ie., p(a,b c) = p(a c) p(b c) A C B let s use p(a,b) as shorthand for p(a=a, B=b)

18 Easy Network 3: Head to Head A prove A cond indep of B given C? NO! C Summary: p(a,b)=p(a)p(b) p(a,b c) NotEqual p(a c)p(b c) B Explaining away. e.g., A=earthquake B=breakIn C=motionAlarm

19 X and Y are conditionally independent given Z, if and only if X and Y are D-separated by Z. [Bishop, 8.2.2] Suppose we have three sets of random variables: X, Y and Z X and Y are D-separated by Z (and therefore conditionally indep, given Z) iff every path from every variable in X to every variable in Y is blocked A path from variable X to variable Y is blocked if it includes a node in Z such that either A Z B A Z B 1. arrows on the path meet either head-to-tail or tail-to-tail at the node and this node is in Z 2. or, the arrows meet head-to-head at the node, and neither the node, nor any of its descendants, is in Z A C B D

20 X and Y are D-separated by Z (and therefore conditionally indep, given Z) iff every path from every variable in X to every variable in Y is blocked A path from variable A to variable B is blocked if it includes a node such that either 1. arrows on the path meet either head-to-tail or tail-to-tail at the node and this node is in Z 2. or, the arrows meet head-to-head at the node, and neither the node, nor any of its descendants, is in Z X1 indep of X3 given X2? X3 indep of X1 given X2? X4 indep of X1 given X2? X4 X1 X2 X3

21 X and Y are D-separated by Z (and therefore conditionally indep, given Z) iff every path from any variable in X to any variable in Y is blocked by Z A path from variable A to variable B is blocked by Z if it includes a node such that either 1. arrows on the path meet either head-to-tail or tail-to-tail at the node and this node is in Z 2. the arrows meet head-to-head at the node, and neither the node, nor any of its descendants, is in Z X4 indep of X1 given X3? X4 indep of X1 given {X3, X2}? X4 indep of X1 given {}? X1 X4 X2 X3

22 X and Y are D-separated by Z (and therefore conditionally indep, given Z) iff every path from any variable in X to any variable in Y is blocked A path from variable A to variable B is blocked if it includes a node such that either 1. arrows on the path meet either head-to-tail or tail-to-tail at the node and this node is in Z 2. or, the arrows meet head-to-head at the node, and neither the node, nor any of its descendants, is in Z a indep of b given c? a indep of b given f?

23 Markov Blanket from [Bishop, 8.2]

24 What You Should Know Bayes nets are convenient representation for encoding dependencies / conditional independence BN = Graph plus parameters of CPD s Defines joint distribution over variables Can calculate everything else from that Though inference may be intractable Reading conditional independence relations from the graph Each node is cond indep of non-descendents, given only its parents D-separation Explaining away

25 Inference in Bayes Nets In general, intractable (NP-complete) For certain cases, tractable Assigning probability to fully observed set of variables Or if just one variable unobserved Or for singly connected graphs (ie., no undirected loops) Belief propagation For multiply connected graphs Junction tree Sometimes use Monte Carlo methods Generate many samples according to the Bayes Net distribution, then count up the results Variational methods for tractable approximate solutions

### Probabilistic Graphical Models

Probabilistic Graphical Models Raquel Urtasun and Tamir Hazan TTI Chicago April 4, 2011 Raquel Urtasun and Tamir Hazan (TTI-C) Graphical Models April 4, 2011 1 / 22 Bayesian Networks and independences

### Bayesian Networks Chapter 14. Mausam (Slides by UW-AI faculty & David Page)

Bayesian Networks Chapter 14 Mausam (Slides by UW-AI faculty & David Page) Bayes Nets In general, joint distribution P over set of variables (X 1 x... x X n ) requires exponential space for representation

### CS 188: Artificial Intelligence. Probability recap

CS 188: Artificial Intelligence Bayes Nets Representation and Independence Pieter Abbeel UC Berkeley Many slides over this course adapted from Dan Klein, Stuart Russell, Andrew Moore Conditional probability

### Logic, Probability and Learning

Logic, Probability and Learning Luc De Raedt luc.deraedt@cs.kuleuven.be Overview Logic Learning Probabilistic Learning Probabilistic Logic Learning Closely following : Russell and Norvig, AI: a modern

### 3. The Junction Tree Algorithms

A Short Course on Graphical Models 3. The Junction Tree Algorithms Mark Paskin mark@paskin.org 1 Review: conditional independence Two random variables X and Y are independent (written X Y ) iff p X ( )

### Lecture 2: Introduction to belief (Bayesian) networks

Lecture 2: Introduction to belief (Bayesian) networks Conditional independence What is a belief network? Independence maps (I-maps) January 7, 2008 1 COMP-526 Lecture 2 Recall from last time: Conditional

### Bayesian Networks. Mausam (Slides by UW-AI faculty)

Bayesian Networks Mausam (Slides by UW-AI faculty) Bayes Nets In general, joint distribution P over set of variables (X 1 x... x X n ) requires exponential space for representation & inference BNs provide

### Artificial Intelligence Mar 27, Bayesian Networks 1 P (T D)P (D) + P (T D)P ( D) =

Artificial Intelligence 15-381 Mar 27, 2007 Bayesian Networks 1 Recap of last lecture Probability: precise representation of uncertainty Probability theory: optimal updating of knowledge based on new information

### 5 Directed acyclic graphs

5 Directed acyclic graphs (5.1) Introduction In many statistical studies we have prior knowledge about a temporal or causal ordering of the variables. In this chapter we will use directed graphs to incorporate

### Probability, Conditional Independence

Probability, Conditional Independence June 19, 2012 Probability, Conditional Independence Probability Sample space Ω of events Each event ω Ω has an associated measure Probability of the event P(ω) Axioms

### p if x = 1 1 p if x = 0

Probability distributions Bernoulli distribution Two possible values (outcomes): 1 (success), 0 (failure). Parameters: p probability of success. Probability mass function: P (x; p) = { p if x = 1 1 p if

### The Basics of Graphical Models

The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures

### Course: Model, Learning, and Inference: Lecture 5

Course: Model, Learning, and Inference: Lecture 5 Alan Yuille Department of Statistics, UCLA Los Angeles, CA 90095 yuille@stat.ucla.edu Abstract Probability distributions on structured representation.

### Bayesian Networks. Read R&N Ch. 14.1-14.2. Next lecture: Read R&N 18.1-18.4

Bayesian Networks Read R&N Ch. 14.1-14.2 Next lecture: Read R&N 18.1-18.4 You will be expected to know Basic concepts and vocabulary of Bayesian networks. Nodes represent random variables. Directed arcs

### Querying Joint Probability Distributions

Querying Joint Probability Distributions Sargur Srihari srihari@cedar.buffalo.edu 1 Queries of Interest Probabilistic Graphical Models (BNs and MNs) represent joint probability distributions over multiple

### Big Data, Machine Learning, Causal Models

Big Data, Machine Learning, Causal Models Sargur N. Srihari University at Buffalo, The State University of New York USA Int. Conf. on Signal and Image Processing, Bangalore January 2014 1 Plan of Discussion

### The Joint Probability Distribution (JPD) of a set of n binary variables involve a huge number of parameters

DEFINING PROILISTI MODELS The Joint Probability Distribution (JPD) of a set of n binary variables involve a huge number of parameters 2 n (larger than 10 25 for only 100 variables). x y z p(x, y, z) 0

### Lecture 11: Graphical Models for Inference

Lecture 11: Graphical Models for Inference So far we have seen two graphical models that are used for inference - the Bayesian network and the Join tree. These two both represent the same joint probability

### SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis

SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October 17, 2015 Outline

### NP-complete? NP-hard? Some Foundations of Complexity. Prof. Sven Hartmann Clausthal University of Technology Department of Informatics

NP-complete? NP-hard? Some Foundations of Complexity Prof. Sven Hartmann Clausthal University of Technology Department of Informatics Tractability of Problems Some problems are undecidable: no computer

### Examples and Proofs of Inference in Junction Trees

Examples and Proofs of Inference in Junction Trees Peter Lucas LIAC, Leiden University February 4, 2016 1 Representation and notation Let P(V) be a joint probability distribution, where V stands for a

### Model-based Synthesis. Tony O Hagan

Model-based Synthesis Tony O Hagan Stochastic models Synthesising evidence through a statistical model 2 Evidence Synthesis (Session 3), Helsinki, 28/10/11 Graphical modelling The kinds of models that

10-601 Machine Learning http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html Course data All up-to-date info is on the course web page: http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html

### Question 2 Naïve Bayes (16 points)

Question 2 Naïve Bayes (16 points) About 2/3 of your email is spam so you downloaded an open source spam filter based on word occurrences that uses the Naive Bayes classifier. Assume you collected the

### Chapter 2 Introduction to Sequence Analysis for Human Behavior Understanding

Chapter 2 Introduction to Sequence Analysis for Human Behavior Understanding Hugues Salamin and Alessandro Vinciarelli 2.1 Introduction Human sciences recognize sequence analysis as a key aspect of any

### Artificial Intelligence. Conditional probability. Inference by enumeration. Independence. Lesson 11 (From Russell & Norvig)

Artificial Intelligence Conditional probability Conditional or posterior probabilities e.g., cavity toothache) = 0.8 i.e., given that toothache is all I know tation for conditional distributions: Cavity

### Probabilistic Networks An Introduction to Bayesian Networks and Influence Diagrams

Probabilistic Networks An Introduction to Bayesian Networks and Influence Diagrams Uffe B. Kjærulff Department of Computer Science Aalborg University Anders L. Madsen HUGIN Expert A/S 10 May 2005 2 Contents

### 13.3 Inference Using Full Joint Distribution

191 The probability distribution on a single variable must sum to 1 It is also true that any joint probability distribution on any set of variables must sum to 1 Recall that any proposition a is equivalent

### Page 1. CSCE 310J Data Structures & Algorithms. CSCE 310J Data Structures & Algorithms. P, NP, and NP-Complete. Polynomial-Time Algorithms

CSCE 310J Data Structures & Algorithms P, NP, and NP-Complete Dr. Steve Goddard goddard@cse.unl.edu CSCE 310J Data Structures & Algorithms Giving credit where credit is due:» Most of the lecture notes

### Why? A central concept in Computer Science. Algorithms are ubiquitous.

Analysis of Algorithms: A Brief Introduction Why? A central concept in Computer Science. Algorithms are ubiquitous. Using the Internet (sending email, transferring files, use of search engines, online

### UNIVERSITY OF LYON DOCTORAL SCHOOL OF COMPUTER SCIENCES AND MATHEMATICS P H D T H E S I S. Specialty : Computer Science. Author

UNIVERSITY OF LYON DOCTORAL SCHOOL OF COMPUTER SCIENCES AND MATHEMATICS P H D T H E S I S Specialty : Computer Science Author Sérgio Rodrigues de Morais on November 16, 29 Bayesian Network Structure Learning

### 8. Machine Learning Applied Artificial Intelligence

8. Machine Learning Applied Artificial Intelligence Prof. Dr. Bernhard Humm Faculty of Computer Science Hochschule Darmstadt University of Applied Sciences 1 Retrospective Natural Language Processing Name

### Structured Learning and Prediction in Computer Vision. Contents

Foundations and Trends R in Computer Graphics and Vision Vol. 6, Nos. 3 4 (2010) 185 365 c 2011 S. Nowozin and C. H. Lampert DOI: 10.1561/0600000033 Structured Learning and Prediction in Computer Vision

### Large-Sample Learning of Bayesian Networks is NP-Hard

Journal of Machine Learning Research 5 (2004) 1287 1330 Submitted 3/04; Published 10/04 Large-Sample Learning of Bayesian Networks is NP-Hard David Maxwell Chickering David Heckerman Christopher Meek Microsoft

### Scaling Bayesian Network Parameter Learning with Expectation Maximization using MapReduce

Scaling Bayesian Network Parameter Learning with Expectation Maximization using MapReduce Erik B. Reed Carnegie Mellon University Silicon Valley Campus NASA Research Park Moffett Field, CA 94035 erikreed@cmu.edu

### Graphical Probability Models for Inference and Decision Making"

Graphical Probability Models for Inference and Decision Making" Instructor: Kathryn Blackmond Laskey Unit 2: Graphical Probability Models Unit 2 (v3) - 1 - Learning Objectives" 1. Specify a joint distribution

### Study Manual. Probabilistic Reasoning. Silja Renooij. August 2015

Study Manual Probabilistic Reasoning 2015 2016 Silja Renooij August 2015 General information This study manual was designed to help guide your self studies. As such, it does not include material that is

### Bayesian Updating with Discrete Priors Class 11, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom

1 Learning Goals Bayesian Updating with Discrete Priors Class 11, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom 1. Be able to apply Bayes theorem to compute probabilities. 2. Be able to identify

### 1. Nondeterministically guess a solution (called a certificate) 2. Check whether the solution solves the problem (called verification)

Some N P problems Computer scientists have studied many N P problems, that is, problems that can be solved nondeterministically in polynomial time. Traditionally complexity question are studied as languages:

### CHAPTER 2 Estimating Probabilities

CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a

### Applied Algorithm Design Lecture 5

Applied Algorithm Design Lecture 5 Pietro Michiardi Eurecom Pietro Michiardi (Eurecom) Applied Algorithm Design Lecture 5 1 / 86 Approximation Algorithms Pietro Michiardi (Eurecom) Applied Algorithm Design

### Naive-Bayes Classification Algorithm

Naive-Bayes Classification Algorithm 1. Introduction to Bayesian Classification The Bayesian Classification represents a supervised learning method as well as a statistical method for classification. Assumes

### Scheduling. Open shop, job shop, flow shop scheduling. Related problems. Open shop, job shop, flow shop scheduling

Scheduling Basic scheduling problems: open shop, job shop, flow job The disjunctive graph representation Algorithms for solving the job shop problem Computational complexity of the job shop problem Open

### Decision Trees and Networks

Lecture 21: Uncertainty 6 Today s Lecture Victor R. Lesser CMPSCI 683 Fall 2010 Decision Trees and Networks Decision Trees A decision tree is an explicit representation of all the possible scenarios from

### Factor Graphs and the Sum-Product Algorithm

498 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 2, FEBRUARY 2001 Factor Graphs and the Sum-Product Algorithm Frank R. Kschischang, Senior Member, IEEE, Brendan J. Frey, Member, IEEE, and Hans-Andrea

### CS570 Introduction to Data Mining. Classification and Prediction. Partial slide credits: Han and Kamber Tan,Steinbach, Kumar

CS570 Introduction to Data Mining Classification and Prediction Partial slide credits: Han and Kamber Tan,Steinbach, Kumar 1 Classification and Prediction Overview Classification algorithms and methods

### Compression algorithm for Bayesian network modeling of binary systems

Compression algorithm for Bayesian network modeling of binary systems I. Tien & A. Der Kiureghian University of California, Berkeley ABSTRACT: A Bayesian network (BN) is a useful tool for analyzing the

### Approximation Algorithms

Approximation Algorithms or: How I Learned to Stop Worrying and Deal with NP-Completeness Ong Jit Sheng, Jonathan (A0073924B) March, 2012 Overview Key Results (I) General techniques: Greedy algorithms

### A crash course in probability and Naïve Bayes classification

Probability theory A crash course in probability and Naïve Bayes classification Chapter 9 Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s

### Cell Phone based Activity Detection using Markov Logic Network

Cell Phone based Activity Detection using Markov Logic Network Somdeb Sarkhel sxs104721@utdallas.edu 1 Introduction Mobile devices are becoming increasingly sophisticated and the latest generation of smart

### Circuits 1 M H Miller

Introduction to Graph Theory Introduction These notes are primarily a digression to provide general background remarks. The subject is an efficient procedure for the determination of voltages and currents

### Learning from Data: Naive Bayes

Semester 1 http://www.anc.ed.ac.uk/ amos/lfd/ Naive Bayes Typical example: Bayesian Spam Filter. Naive means naive. Bayesian methods can be much more sophisticated. Basic assumption: conditional independence.

### Advances in Health Care Predictive Analytics

Advances in Health Care Predictive Analytics Graphical Models Part I: Intelligent Reasoning with Bayesian Networks Jarrod E. Dalton COLD OR FLU? How do we tell? A Simple Bayesian Network P() Prob. No 70%

### Lecture 4: BK inequality 27th August and 6th September, 2007

CSL866: Percolation and Random Graphs IIT Delhi Amitabha Bagchi Scribe: Arindam Pal Lecture 4: BK inequality 27th August and 6th September, 2007 4. Preliminaries The FKG inequality allows us to lower bound

### Bayesian networks - Time-series models - Apache Spark & Scala

Bayesian networks - Time-series models - Apache Spark & Scala Dr John Sandiford, CTO Bayes Server Data Science London Meetup - November 2014 1 Contents Introduction Bayesian networks Latent variables Anomaly

### OHJ-2306 Introduction to Theoretical Computer Science, Fall 2012 8.11.2012

276 The P vs. NP problem is a major unsolved problem in computer science It is one of the seven Millennium Prize Problems selected by the Clay Mathematics Institute to carry a \$ 1,000,000 prize for the

### Bayesian Tutorial (Sheet Updated 20 March)

Bayesian Tutorial (Sheet Updated 20 March) Practice Questions (for discussing in Class) Week starting 21 March 2016 1. What is the probability that the total of two dice will be greater than 8, given that

### Intelligent Systems: Reasoning and Recognition. Uncertainty and Plausible Reasoning

Intelligent Systems: Reasoning and Recognition James L. Crowley ENSIMAG 2 / MoSIG M1 Second Semester 2015/2016 Lesson 12 25 march 2015 Uncertainty and Plausible Reasoning MYCIN (continued)...2 Backward

### CMPSCI611: Approximating MAX-CUT Lecture 20

CMPSCI611: Approximating MAX-CUT Lecture 20 For the next two lectures we ll be seeing examples of approximation algorithms for interesting NP-hard problems. Today we consider MAX-CUT, which we proved to

### Knowledge-Based Probabilistic Reasoning from Expert Systems to Graphical Models

Knowledge-Based Probabilistic Reasoning from Expert Systems to Graphical Models By George F. Luger & Chayan Chakrabarti {luger cc} @cs.unm.edu Department of Computer Science University of New Mexico Albuquerque

### Cost Model: Work, Span and Parallelism. 1 The RAM model for sequential computation:

CSE341T 08/31/2015 Lecture 3 Cost Model: Work, Span and Parallelism In this lecture, we will look at how one analyze a parallel program written using Cilk Plus. When we analyze the cost of an algorithm

### Life of A Knowledge Base (KB)

Life of A Knowledge Base (KB) A knowledge base system is a special kind of database management system to for knowledge base management. KB extraction: knowledge extraction using statistical models in NLP/ML

### Technical Report No. 5 April 18, Bayesian Networks. Michal Horný

Technical Report No. 5 April 18, 2014 Bayesian Networks Michal Horný mhorny@bu.edu This paper was published in fulfillment of the requirements for PM931: Directed Study in Health Policy and Management

### Basics of Probability

Basics of Probability 1 Sample spaces, events and probabilities Begin with a set Ω the sample space e.g., 6 possible rolls of a die. ω Ω is a sample point/possible world/atomic event A probability space

### An Introduction to the Use of Bayesian Network to Analyze Gene Expression Data

n Introduction to the Use of ayesian Network to nalyze Gene Expression Data Cristina Manfredotti Dipartimento di Informatica, Sistemistica e Comunicazione (D.I.S.Co. Università degli Studi Milano-icocca

### Evaluation of Machine Learning Techniques for Green Energy Prediction

arxiv:1406.3726v1 [cs.lg] 14 Jun 2014 Evaluation of Machine Learning Techniques for Green Energy Prediction 1 Objective Ankur Sahai University of Mainz, Germany We evaluate Machine Learning techniques

### VARIATIONAL ALGORITHMS FOR APPROXIMATE BAYESIAN INFERENCE

VARIATIONAL ALGORITHMS FOR APPROXIMATE BAYESIAN INFERENCE by Matthew J. Beal M.A., M.Sci., Physics, University of Cambridge, UK (1998) The Gatsby Computational Neuroscience Unit University College London

### Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com

Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University caizhua@gmail.com 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian

### Exponential Random Graph Models for Social Network Analysis. Danny Wyatt 590AI March 6, 2009

Exponential Random Graph Models for Social Network Analysis Danny Wyatt 590AI March 6, 2009 Traditional Social Network Analysis Covered by Eytan Traditional SNA uses descriptive statistics Path lengths

### STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

### Introduction to Markov Chain Monte Carlo

Introduction to Markov Chain Monte Carlo Monte Carlo: sample from a distribution to estimate the distribution to compute max, mean Markov Chain Monte Carlo: sampling using local information Generic problem

### Relational Dynamic Bayesian Networks: a report. Cristina Manfredotti

Relational Dynamic Bayesian Networks: a report Cristina Manfredotti Dipartimento di Informatica, Sistemistica e Comunicazione (D.I.S.Co.) Università degli Studi Milano-Bicocca manfredotti@disco.unimib.it

### SIMS 255 Foundations of Software Design. Complexity and NP-completeness

SIMS 255 Foundations of Software Design Complexity and NP-completeness Matt Welsh November 29, 2001 mdw@cs.berkeley.edu 1 Outline Complexity of algorithms Space and time complexity ``Big O'' notation Complexity

### A Web-based Bayesian Intelligent Tutoring System for Computer Programming

A Web-based Bayesian Intelligent Tutoring System for Computer Programming C.J. Butz, S. Hua, R.B. Maguire Department of Computer Science University of Regina Regina, Saskatchewan, Canada S4S 0A2 Email:

### Dependency Networks for Collaborative Filtering and Data Visualization

264 UNCERTAINTY IN ARTIFICIAL INTELLIGENCE PROCEEDINGS 2000 Dependency Networks for Collaborative Filtering and Data Visualization David Heckerman, David Maxwell Chickering, Christopher Meek, Robert Rounthwaite,

### Estimating Software Reliability In the Absence of Data

Estimating Software Reliability In the Absence of Data Joanne Bechta Dugan (jbd@virginia.edu) Ganesh J. Pai (gpai@virginia.edu) Department of ECE University of Virginia, Charlottesville, VA NASA OSMA SAS

### Hidden Markov Models in Bioinformatics. By Máthé Zoltán Kőrösi Zoltán 2006

Hidden Markov Models in Bioinformatics By Máthé Zoltán Kőrösi Zoltán 2006 Outline Markov Chain HMM (Hidden Markov Model) Hidden Markov Models in Bioinformatics Gene Finding Gene Finding Model Viterbi algorithm

### Introduction to computer science

Introduction to computer science Michael A. Nielsen University of Queensland Goals: 1. Introduce the notion of the computational complexity of a problem, and define the major computational complexity classes.

### Deutsches Institut für Ernährungsforschung Potsdam-Rehbrücke. DAG program (v0.21)

Deutsches Institut für Ernährungsforschung Potsdam-Rehbrücke DAG program (v0.21) Causal diagrams: A computer program to identify minimally sufficient adjustment sets SHORT TUTORIAL Sven Knüppel http://epi.dife.de/dag

### 136 CHAPTER 4. INDUCTION, GRAPHS AND TREES

136 TER 4. INDUCTION, GRHS ND TREES 4.3 Graphs In this chapter we introduce a fundamental structural idea of discrete mathematics, that of a graph. Many situations in the applications of discrete mathematics

### Chapter 28. Bayesian Networks

Chapter 28. Bayesian Networks The Quest for Artificial Intelligence, Nilsson, N. J., 2009. Lecture Notes on Artificial Intelligence, Spring 2012 Summarized by Kim, Byoung-Hee and Lim, Byoung-Kwon Biointelligence

### CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS

Examples: Regression And Path Analysis CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships

### Complexity Theory. IE 661: Scheduling Theory Fall 2003 Satyaki Ghosh Dastidar

Complexity Theory IE 661: Scheduling Theory Fall 2003 Satyaki Ghosh Dastidar Outline Goals Computation of Problems Concepts and Definitions Complexity Classes and Problems Polynomial Time Reductions Examples

### Policy Analysis for Administrative Role Based Access Control without Separate Administration

Policy nalysis for dministrative Role Based ccess Control without Separate dministration Ping Yang Department of Computer Science, State University of New York at Binghamton, US Mikhail I. Gofman Department

### Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091)

Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091) Magnus Wiktorsson Centre for Mathematical Sciences Lund University, Sweden Lecture 5 Sequential Monte Carlo methods I February

### Outline. NP-completeness. When is a problem easy? When is a problem hard? Today. Euler Circuits

Outline NP-completeness Examples of Easy vs. Hard problems Euler circuit vs. Hamiltonian circuit Shortest Path vs. Longest Path 2-pairs sum vs. general Subset Sum Reducing one problem to another Clique

### Variational Mean Field for Graphical Models

Variational Mean Field for Graphical Models CS/CNS/EE 155 Baback Moghaddam Machine Learning Group baback @ jpl.nasa.gov Approximate Inference Consider general UGs (i.e., not tree-structured) All basic

### Bayesian Network Development

Bayesian Network Development Kenneth BACLAWSKI College of Computer Science, Northeastern University Boston, Massachusetts 02115 USA Ken@Baclawski.com Abstract Bayesian networks are a popular mechanism

### Programming Tools based on Big Data and Conditional Random Fields

Programming Tools based on Big Data and Conditional Random Fields Veselin Raychev Martin Vechev Andreas Krause Department of Computer Science ETH Zurich Zurich Machine Learning and Data Science Meet-up,

### Markov random fields and Gibbs measures

Chapter Markov random fields and Gibbs measures 1. Conditional independence Suppose X i is a random element of (X i, B i ), for i = 1, 2, 3, with all X i defined on the same probability space (.F, P).

### Algorithm Design and Analysis

Algorithm Design and Analysis LECTURE 27 Approximation Algorithms Load Balancing Weighted Vertex Cover Reminder: Fill out SRTEs online Don t forget to click submit Sofya Raskhodnikova 12/6/2011 S. Raskhodnikova;

### Transfer Learning! and Prior Estimation! for VC Classes!

15-859(B) Machine Learning Theory! Transfer Learning! and Prior Estimation! for VC Classes! Liu Yang! 04/25/2012! Liu Yang 2012 1 Notation! Instance space X = R n! Concept space C of classifiers h: X ->

### Probability Theory. Elementary rules of probability Sum rule. Product rule. p. 23

Probability Theory Uncertainty is key concept in machine learning. Probability provides consistent framework for the quantification and manipulation of uncertainty. Probability of an event is the fraction

### Outline. Probability. Random Variables. Joint and Marginal Distributions Conditional Distribution Product Rule, Chain Rule, Bayes Rule.

Probability 1 Probability Random Variables Outline Joint and Marginal Distributions Conditional Distribution Product Rule, Chain Rule, Bayes Rule Inference Independence 2 Inference in Ghostbusters A ghost

Approximation Algorithms Chapter Approximation Algorithms Q. Suppose I need to solve an NP-hard problem. What should I do? A. Theory says you're unlikely to find a poly-time algorithm. Must sacrifice one

### Incorporating Evidence in Bayesian networks with the Select Operator

Incorporating Evidence in Bayesian networks with the Select Operator C.J. Butz and F. Fang Department of Computer Science, University of Regina Regina, Saskatchewan, Canada SAS 0A2 {butz, fang11fa}@cs.uregina.ca

### INTRODUCTORY STATISTICS

INTRODUCTORY STATISTICS FIFTH EDITION Thomas H. Wonnacott University of Western Ontario Ronald J. Wonnacott University of Western Ontario WILEY JOHN WILEY & SONS New York Chichester Brisbane Toronto Singapore

### Inference on Phase-type Models via MCMC

Inference on Phase-type Models via MCMC with application to networks of repairable redundant systems Louis JM Aslett and Simon P Wilson Trinity College Dublin 28 th June 202 Toy Example : Redundant Repairable

### Bayesian models of cognition

Bayesian models of cognition Thomas L. Griffiths Department of Psychology University of California, Berkeley Charles Kemp and Joshua B. Tenenbaum Department of Brain and Cognitive Sciences Massachusetts

### Learning Instance-Specific Predictive Models

Journal of Machine Learning Research 11 (2010) 3333-3369 Submitted 3/09; Revised 7/10; Published 12/10 Learning Instance-Specific Predictive Models Shyam Visweswaran Gregory F. Cooper Department of Biomedical