# Introduction to Deep Learning Variational Inference, Mean Field Theory

## Transcription

1 Introduction to Deep Learning: Variational Inference, Mean Field Theory. Iasonas Kokkinos, Center for Visual Computing, Ecole Centrale Paris; Galen Group, INRIA-Saclay

2 Lecture 3 recap: Network Architectures; Boltzmann Machine; Restricted Boltzmann Machine

3 Boltzmann Machine (Hinton & Sejnowski, ): Full-blown Ising model. Parameter estimation, once again: training data, MCMC

4 Boltzmann Machine limitations. Underlying statistical model: constrains second-order moments. This will not get us too far, even with extra information

5 Hidden variables to the rescue! Hidden: h; observed: x

6 Boltzmann Machine: a big mixture model. Marginalization; mixture components; mixing weights. Compositional structure of components: h mixes and mashes rows of U

7 Boltzmann machine learning: as before, but with hidden variables

8 Boltzmann machine learning

9 Restricted Boltzmann Machine. Hidden: h; observed: x

10 RBM

11 The perks of a Restricted Boltzmann Machine: all hidden units are conditionally independent given the visible units, and vice versa. We can update them in batch mode!

12 Restricted Boltzmann Machine sampling: Block-Gibbs MCMC
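The batch update described on slides 11 and 12 can be sketched as follows. This is a minimal illustration, not the lecture's code: it assumes binary units in {0, 1} and the common energy E(x, h) = -b·x - c·h - xᵀWh, and the names `W`, `b`, `c`, `sample_hidden`, `sample_visible` and `block_gibbs` are hypothetical.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def sample_hidden(x, W, c, rng):
    """Sample every hidden unit in one batch: they are independent given x."""
    probs = [sigmoid(c[j] + sum(W[i][j] * x[i] for i in range(len(x))))
             for j in range(len(c))]
    return [1 if rng.random() < p else 0 for p in probs], probs

def sample_visible(h, W, b, rng):
    """Sample every visible unit in one batch: they are independent given h."""
    probs = [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
             for i in range(len(b))]
    return [1 if rng.random() < p else 0 for p in probs], probs

def block_gibbs(x0, W, b, c, steps, seed=0):
    """Alternate between the two blocks (all of h, then all of x) for `steps` rounds."""
    rng = random.Random(seed)
    x = list(x0)
    h, _ = sample_hidden(x, W, c, rng)
    for _ in range(steps):
        x, _ = sample_visible(h, W, b, rng)
        h, _ = sample_hidden(x, W, c, rng)
    return x, h
```

Because the graph is bipartite, each block needs only one pass over the weight matrix, which is what makes the chain cheap compared with unit-by-unit Gibbs sampling in a general Boltzmann machine.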

13 RBM inference: Block-Gibbs MCMC

14 RBM learning: Maximize with respect to

15 Lecture 4: Variational Approximations; Mean Field Inference

16 Entropy reminder: Entropy = optimal coding length
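As a small numerical reminder of the "optimal coding length" reading, the sketch below computes Shannon entropy in bits for a discrete distribution. The function name `entropy_bits` is an illustrative choice, not something from the slides.

```python
import math

def entropy_bits(p):
    """Shannon entropy H(p) = -sum_i p_i log2 p_i, in bits (terms with p_i = 0 are skipped)."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

fair_coin = entropy_bits([0.5, 0.5])               # one bit per outcome
four_way = entropy_bits([0.25, 0.25, 0.25, 0.25])  # two bits per outcome
skewed = entropy_bits([0.5, 0.25, 0.125, 0.125])   # fewer bits: likely outcomes get short codes
```

For the skewed distribution an optimal prefix code uses code lengths 1, 2, 3 and 3 bits, and its average length equals the entropy.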

17 Relative Entropy (Kullback-Leibler divergence). Information lost when Q is used to approximate P: the KL divergence measures the expected number of extra bits required to code samples from P when using a code optimized for Q, rather than the true code optimized for P. It is, however, not a proper distance.
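The "extra bits" interpretation can be checked numerically. The sketch below is illustrative only; the helper names are hypothetical, and it assumes discrete distributions with Q(x) > 0 wherever P(x) > 0.

```python
import math

def entropy_bits(p):
    """H(P): average code length under the code optimized for P."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy_bits(p, q):
    """H(P, Q): average code length for samples from P under the code optimized for Q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_bits(p, q):
    """KL(P || Q) = sum_i p_i log2(p_i / q_i); assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
extra_bits = cross_entropy_bits(p, q) - entropy_bits(p)  # equals kl_bits(p, q)
asymmetric = kl_bits(p, q) != kl_bits(q, p)              # KL is not symmetric
```

The asymmetry is one reason KL is not a proper distance; it also fails the triangle inequality.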

18 Step 1: Bounding the expectation of a convex function. Convex function: For more summands (Jensen's inequality):
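The formulas on this slide were not transcribed. For reference, the standard statements are given below; the symbols f, λ and x_i are notation assumed here, not necessarily the slide's.

```latex
% Convexity of f (two points, 0 <= \lambda <= 1):
f\bigl(\lambda x_1 + (1-\lambda) x_2\bigr) \le \lambda f(x_1) + (1-\lambda) f(x_2)

% Jensen's inequality (n summands, \lambda_i \ge 0, \sum_i \lambda_i = 1):
f\Bigl(\sum_{i=1}^{n} \lambda_i x_i\Bigr) \le \sum_{i=1}^{n} \lambda_i f(x_i)
\quad\Longleftrightarrow\quad
f\bigl(\mathbb{E}[x]\bigr) \le \mathbb{E}\bigl[f(x)\bigr]
```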

19 Step 2: Bounding the KL divergence. Convex function: For we get KL divergence. We also observe: By Jensen's inequality
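The derivation on this slide was lost in transcription. A standard version of the argument, assuming the convex function is f(u) = -log u and that Q(x) > 0 wherever P(x) > 0, runs as follows; the choice of KL(P‖Q) rather than KL(Q‖P) is an assumption, and the same steps work with the roles swapped.

```latex
\mathrm{KL}(P \,\|\, Q)
  = \sum_x P(x) \log \frac{P(x)}{Q(x)}
  = \sum_x P(x) \Bigl[-\log \frac{Q(x)}{P(x)}\Bigr]
  \;\ge\; -\log \sum_x P(x)\, \frac{Q(x)}{P(x)}
  = -\log \sum_x Q(x)
  \;\ge\; -\log 1 = 0
```

The first inequality is Jensen's inequality with weights P(x); the second holds because the sum runs over the support of P, where Q sums to at most 1. The bound is tight only when Q = P.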

20 Variational Inference: where makes the minimization tractable. Typical family ("naïve mean field"):
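The optimization problem on this slide was not transcribed. A common way to write it, under the assumption that the lecture uses the usual KL(Q‖P) objective, is sketched below; the symbol 𝒬 for the tractable family is notation introduced here.

```latex
Q^{*} = \arg\min_{Q \in \mathcal{Q}} \mathrm{KL}(Q \,\|\, P),
\qquad
\mathcal{Q}_{\text{mean field}} = \Bigl\{\, Q : Q(x) = \prod_{i} Q_i(x_i) \,\Bigr\}
```

Restricting Q to fully factored distributions is what makes expectations under Q cheap: each one decomposes over single variables or pairs.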

21 Gibbs Sampling (one variant of MCMC)

$$x_1^{(t+1)} \sim \pi\left(x_1 \mid x_2^{(t)}, x_3^{(t)}, \ldots, x_K^{(t)}\right)$$

$$x_2^{(t+1)} \sim \pi\left(x_2 \mid x_1^{(t+1)}, x_3^{(t)}, \ldots, x_K^{(t)}\right)$$

$$x_K^{(t+1)} \sim \pi\left(x_K \mid x_1^{(t+1)}, \ldots, x_{K-1}^{(t+1)}\right)$$

Variational Inference versus MCMC. Variational inference: try to match distribution with member of
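The coordinate-wise updates of Gibbs sampling can be sketched on a small Ising model. This is an illustrative example under stated assumptions: spins take values in {-1, +1}, the target is π(x) ∝ exp(Σ_{i<j} J_ij x_i x_j + Σ_i h_i x_i) with symmetric `J`, and the function names are hypothetical.

```python
import math
import random

def gibbs_sweep(x, J, h, rng):
    """One sweep: resample each x_k in turn from pi(x_k | all other variables)."""
    K = len(x)
    for k in range(K):
        # Local field uses the already-updated values x_1..x_{k-1} of this sweep.
        field = h[k] + sum(J[k][j] * x[j] for j in range(K) if j != k)
        p_plus = 1.0 / (1.0 + math.exp(-2.0 * field))  # P(x_k = +1 | rest)
        x[k] = 1 if rng.random() < p_plus else -1
    return x

def estimate_means(J, h, sweeps=5000, burn_in=500, seed=0):
    """Monte Carlo estimate of E[x_k] from the Gibbs chain, after discarding burn-in."""
    rng = random.Random(seed)
    K = len(h)
    x = [rng.choice([-1, 1]) for _ in range(K)]
    totals = [0.0] * K
    for t in range(burn_in + sweeps):
        gibbs_sweep(x, J, h, rng)
        if t >= burn_in:
            for k in range(K):
                totals[k] += x[k]
    return [s / sweeps for s in totals]
```

With zero couplings the spins are independent and the estimates should approach tanh(h_k), which gives a simple sanity check for the sampler.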

22 Variational Inference for Boltzmann-Gibbs distribution. Exponential family: Variational Free Energy:

23 Ising model. Boltzmann-Gibbs distribution; Ising model: Variational Free Energy:
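The free-energy expressions on slides 22 and 23 were not transcribed. The standard identity they refer to can be written as below, assuming a Boltzmann-Gibbs distribution with energy E(x) and partition function Z, and the KL(Q‖P) objective; the symbol F(Q) is notation introduced here.

```latex
P(x) = \frac{1}{Z} \exp\bigl(-E(x)\bigr)
\;\Longrightarrow\;
\mathrm{KL}(Q \,\|\, P)
  = \underbrace{\mathbb{E}_{Q}\bigl[E(x)\bigr] - H(Q)}_{F(Q)} + \log Z
```

Since the KL divergence is non-negative, F(Q) ≥ -log Z for every Q, and minimizing F(Q) over the tractable family is equivalent to minimizing the KL divergence without ever computing Z.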

24 Lecture 4: Variational Approximations; Mean Field Inference

25 Naïve Mean Field for binary random variables. Factored distribution: Notation:

26 Naïve Mean Field for Ising model

27 Naïve Mean Field for Ising model. Independent variables: additive entropy

28 Putting it all together. Condition for extremum, after some algebra: Mean Field Equations:
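The mean field equations themselves were not transcribed. A minimal sketch of the usual fixed-point iteration is given below, assuming spins in {-1, +1}, a target P(x) ∝ exp(Σ_{i<j} J_ij x_i x_j + Σ_i h_i x_i) with symmetric `J`, and mean parameters m_i = E_Q[x_i]; under these assumptions the extremum condition reads m_i = tanh(h_i + Σ_j J_ij m_j). The slides may use a {0, 1} encoding instead, in which case a sigmoid takes the place of tanh.

```python
import math

def mean_field_ising(J, h, iters=200, tol=1e-10):
    """Iterate m_i <- tanh(h_i + sum_{j != i} J_ij m_j) until the updates stop changing."""
    K = len(h)
    m = [0.0] * K
    for _ in range(iters):
        max_change = 0.0
        for i in range(K):
            new = math.tanh(h[i] + sum(J[i][j] * m[j] for j in range(K) if j != i))
            max_change = max(max_change, abs(new - m[i]))
            m[i] = new  # coordinate-wise update: later units see the new value
        if max_change < tol:
            break
    return m

J = [[0.0, 0.3], [0.3, 0.0]]
h = [0.2, -0.1]
m = mean_field_ising(J, h)
```

Each coordinate update does not increase the variational free energy, so the iteration settles at a local extremum; unlike Gibbs sampling, the result is deterministic.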

29 Lecture 4: Variational Approximations; Mean Field Inference; Applications to computer vision (fully connected CRFs)

30 Mean Field Theory & Computer Vision. Discrete/Continuous Hopfield Networks (1982/1984); Yuille & coworkers ( X); Loopy Belief Propagation >(?) Mean Field; 2011: Mean Field for fully connected CRFs

31 MRF nodes as pixels (Winkler, 1995, p. 32)

32 MRF nodes as patches. Image-scene compatibility Φ(x_i, y_i); scene-scene compatibility Ψ(x_i, x_j)

33 Network joint probability

$$P(x, y) = \frac{1}{Z} \prod_{i,j} \Psi(x_i, x_j) \prod_{i} \Phi(x_i, y_i)$$

Here x is the scene and y the image. Ψ is the scene-scene compatibility function, defined over neighboring scene nodes i, j; Φ is the image-scene compatibility function, defined over local observations.

34 MRFs for Denoising (Geman & Geman, 1984). Φ(x_i, y_i): noisy pixel intensities; Ψ(x_i, x_j): clean image

35 MRFs for Segmentation

36 Ising model (two labels). Model for binary vectors: Samples from Ising model for different temperatures

37 Potts model (K labels). Multiple labels: Samples from Potts model for different temperatures

38 Network Joint Probability. Scene; image. Image-scene compatibility function: local observations. Scene-scene compatibility function: neighboring scene nodes

39 Generative Framework for Vision. MRF: joint model over scene and observations. Vision task: recover scene given observations. Bayes' rule: posterior, likelihood, prior
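The Bayes' rule formula on this slide was not transcribed. In the notation of the joint model, with x the scene and y the observations, the standard form is:

```latex
\underbrace{P(x \mid y)}_{\text{posterior}}
  = \frac{P(y \mid x)\, P(x)}{P(y)}
  \;\propto\;
  \underbrace{P(y \mid x)}_{\text{likelihood}} \;
  \underbrace{P(x)}_{\text{prior}}
```

One natural reading for the MRF is that the image-scene terms Φ play the role of the likelihood and the scene-scene terms Ψ play the role of the prior; that mapping is an interpretation, not something stated on the slide.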

40 Conditional Random Fields. [Figure: an MRF and a CRF, each over scene nodes x_1, ..., x_6 and observation nodes y_1, ..., y_6.] CRFs: keep MRF tools, drop Bayesian aspect

41 CRFs in a nutshell

42 Grid CRF

43 Grid CRF limitations

44 Grid CRF limitations

45 Fully-connected CRF (Krähenbühl & Koltun). Philipp Krähenbühl and Vladlen Koltun, Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials, NIPS 2011

46 Fully-connected CRF

47 Fully-connected CRF

48 Fully-connected CRF

49 Fully-connected CRF: FAST. How? Mean Field + some tricks

50 Trick: Pairwise Term. Potts model; Gaussian kernels; fast summation through separable convolution
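To make the role of the pairwise term concrete, the sketch below runs mean field updates on a tiny fully connected CRF with a Potts compatibility and a single Gaussian kernel. It is a deliberately naive illustration: the kernel sum is computed by brute force in O(N²) instead of the fast filtering the slide refers to, and the names `unary`, `feats`, `w` and `theta` are assumptions rather than the paper's notation.

```python
import math

def gaussian_kernel(fi, fj, theta):
    """k(f_i, f_j) = exp(-|f_i - f_j|^2 / (2 theta^2)) on feature vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(fi, fj))
    return math.exp(-d2 / (2.0 * theta ** 2))

def softmax_neg(energies):
    """Normalize exp(-energy) into a distribution (shifted for numerical stability)."""
    lo = min(energies)
    exps = [math.exp(-(e - lo)) for e in energies]
    z = sum(exps)
    return [v / z for v in exps]

def dense_crf_mean_field(unary, feats, w=1.0, theta=1.0, iters=5):
    """unary[i][l]: cost of label l at node i; feats[i]: feature vector of node i."""
    N, L = len(unary), len(unary[0])
    Q = [softmax_neg(u) for u in unary]  # initialize from the unary terms alone
    K = [[gaussian_kernel(feats[i], feats[j], theta) if i != j else 0.0
          for j in range(N)] for i in range(N)]
    for _ in range(iters):
        new_Q = []
        for i in range(N):
            energies = []
            for l in range(L):
                # Potts model: pay for the mass every other node puts on labels != l.
                pair = w * sum(K[i][j] * (1.0 - Q[j][l]) for j in range(N))
                energies.append(unary[i][l] + pair)
            new_Q.append(softmax_neg(energies))
        Q = new_Q  # parallel update of all nodes
    return Q

unary = [[0.0, 2.0], [0.0, 2.0], [0.5, 0.0], [0.0, 2.0], [0.0, 2.0]]
feats = [[0.0], [1.0], [2.0], [3.0], [4.0]]
Q = dense_crf_mean_field(unary, feats, w=2.0, theta=1.0)
labels = [q.index(max(q)) for q in Q]
```

In this toy example the middle node weakly prefers label 1 on its own, but the Gaussian-weighted votes of its neighbours pull it to label 0. The inner sum over j is exactly the quantity that becomes a Gaussian filtering operation over the feature space, which is where the speed-up comes from.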

51 2014: Fully connected CRFs + Deep Classifiers. L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy and A. Yuille, Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs, arXiv: v1, 2014

52 Evolution from mean field updates

53 Results (input, DCNN, CRF-DCNN)

54 Results (input, DCNN, CRF-DCNN)

55 Comparisons to other techniques

56 Comparisons to previous state-of-the-art
