Latent variable and deep modeling with Gaussian processes; application to system identification. Andreas Damianou
|
|
- Ronald Golden
- 8 years ago
- Views:
Transcription
1 Latent variable and deep modeling with Gaussian processes; application to system identification Andreas Damianou Department of Computer Science, University of Sheffield, UK Brown University, 17 Feb. 2016
2 Outline Part 1: Introduction Recap from yesterday Part 2: Incorporating latent variables Unsupervised GP learning Structure in the latent space - representation learning Bayesian Automatic Relevance Determination Deep GP learning Multi-view GP learning Part 3: Dynamical systems Policy learning Autoregressive Dynamics Going deeper: Deep Recurrent Gaussian Process Regressive dynamics with deep GPs
3 Outline Part 1: Introduction Recap from yesterday Part 2: Incorporating latent variables Unsupervised GP learning Structure in the latent space - representation learning Bayesian Automatic Relevance Determination Deep GP learning Multi-view GP learning Part 3: Dynamical systems Policy learning Autoregressive Dynamics Going deeper: Deep Recurrent Gaussian Process Regressive dynamics with deep GPs
4 Part 1: GP Take-home messages Gaussian processes as infinite dimensional Gaussian distributions can be used as priors over functions Non-parametric: training data act as parameters Uncertainty Quantification Learning from scarce data
5 Notation and graphical models Graphical models x y f x f y White nodes: Observed variables Shaded nodes: Unobserved, or latent variables. Convention: Sometimes the latent function will be placed next to the arrow; sometimes I ll explicitly include the collection of instantiations f
6 Outline Part 1: Introduction Recap from yesterday Part 2: Incorporating latent variables Unsupervised GP learning Structure in the latent space - representation learning Bayesian Automatic Relevance Determination Deep GP learning Multi-view GP learning Part 3: Dynamical systems Policy learning Autoregressive Dynamics Going deeper: Deep Recurrent Gaussian Process Regressive dynamics with deep GPs
7 Unsupervised learning: GP-LVM x y f If X is unobserved, treat it as a parameter and optimize over it. GP-LVM is interpreted as non-linear, non-parametric probabilistic PCA (Lawrence, JMLR 2015). Objective (likelihood function) is similar to GPs: p(y x) = p(y f)p(f x)df = N (y 0, K ff + σ 2 I) but now x s are optimized too. Neil Lawrence, JMLR, 2005.
8 Unsupervised learning: GP-LVM x y f If X is unobserved, treat it as a parameter and optimize over it. GP-LVM is interpreted as non-linear, non-parametric probabilistic PCA (Lawrence, JMLR 2015). Objective (likelihood function) is similar to GPs: p(y x) = p(y f)p(f x)df = N (y 0, K ff + σ 2 I) but now x s are optimized too. Neil Lawrence, JMLR, 2005.
9 Unsupervised learning: GP-LVM x y f If X is unobserved, treat it as a parameter and optimize over it. GP-LVM is interpreted as non-linear, non-parametric probabilistic PCA (Lawrence, JMLR 2015). Objective (likelihood function) is similar to GPs: p(y x) = p(y f)p(f x)df = N (y 0, K ff + σ 2 I) but now x s are optimized too. Neil Lawrence, JMLR, 2005.
10 Fitting the GP-LVM
11 Fitting the GP-LVM 5.5 Figure credits: C. H. Ek
12 Fitting the GP-LVM 5.5 Figure credits: C. H. Ek Additional difficulty: x s are also missing! Improvement: Invoke the Bayesian methodology to find x s too.
13 Bayesian GP-LVM Additionally integrate out the latent space. p(y) = p(y f)p(f x)p(x)df dx where p(x) N (0, I) Titsias and Lawrence 2010, AISTATS; Damianou et al. 2015, JMLR
14 Automatic dimensionality detection In general, X = [x 1,..., x N ] with x (n) R Q. Automatic dimenionality detection by employing automatic relevance determination (ARD) priors for the mapping f. f GP(0, k f ) with: k f (x (i), x (j)) = σ 2 exp 1 Q ( w (q) x (i,q) x (j,q)) 2 2 Example: q=1
15 Outline Part 1: Introduction Recap from yesterday Part 2: Incorporating latent variables Unsupervised GP learning Structure in the latent space - representation learning Bayesian Automatic Relevance Determination Deep GP learning Multi-view GP learning Part 3: Dynamical systems Policy learning Autoregressive Dynamics Going deeper: Deep Recurrent Gaussian Process Regressive dynamics with deep GPs
16 Sampling from a deep GP Input f Unobserved f Y Output
17 Model x = given f h = f h (x) = GP(0, k h (x, x)) h = f h (x) + ɛ h f y = f y (h) = GP(0, k f (h, h)) y = f y (h) + ɛ y So: y = f y (f h (x) + ɛ h ) + ɛ y Objective: p(y x) = p(y f y )p(f y h)p(h f h )p(f h x)df y df h dh Damianou and Lawrence, AISTATS 2013; Damianou, PhD Thesis, 2015
18 Deep GP: Step function (credits for idea to J. Hensman, R. Calandra, M. Deisenroth) (standard GP) (deep GP - 1 hidden layer) (deep GP - 3 hidden layers)
19 Deep GP with three hidden plus one warping layer Standard GP
20 Non-linear feature learning Stacked GP (nonlinear) Stacked PCA (linear) Successive warping creates knots which act as features. Features discovered in one layer remain in the next ones (ie knots are not un-tied) f 1 (f 2 (f 3 (x))) With linear warpings (stacked PCA) we can t achieve this effect: W 1 (W 2 (W 3 x))) = W x Damianou, PhD Thesis, 2015
21 Deep GP: MNIST example (Live sampling video)
22 Outline Part 1: Introduction Recap from yesterday Part 2: Incorporating latent variables Unsupervised GP learning Structure in the latent space - representation learning Bayesian Automatic Relevance Determination Deep GP learning Multi-view GP learning Part 3: Dynamical systems Policy learning Autoregressive Dynamics Going deeper: Deep Recurrent Gaussian Process Regressive dynamics with deep GPs
23 Multi-view: Manifold Relevance Determination (MRD) Multi-view data arise from multiple information sources. These sources naturally contain some overlapping, or shared signal (since they describe the same phenomenon ), but also have some private signal. MRD: Model such data using overlapping sets of latent variables Demo: Faces, mocap-silhouette Damianou et al., ICML 2012; Damianou, PhD Thesis, 2015
24 Multi-view: Manifold Relevance Determination (MRD) Multi-view data arise from multiple information sources. These sources naturally contain some overlapping, or shared signal (since they describe the same phenomenon ), but also have some private signal. MRD: Model such data using overlapping sets of latent variables Demo: Faces, mocap-silhouette Damianou et al., ICML 2012; Damianou, PhD Thesis, 2015
25 Multi-view: Manifold Relevance Determination (MRD) Multi-view data arise from multiple information sources. These sources naturally contain some overlapping, or shared signal (since they describe the same phenomenon ), but also have some private signal. MRD: Model such data using overlapping sets of latent variables Demo: Faces, mocap-silhouette Damianou et al., ICML 2012; Damianou, PhD Thesis, 2015
26 Deep GPs: Another multi-view example X X
27 Automatic structure discovery summary: ARD and MRD Tools: ARD: Eliminate uncessary nodes/connections MRD: Conditional independencies Approximating evidence: Number of layers (?)
28 Automatic structure discovery summary: ARD and MRD Tools: ARD: Eliminate uncessary nodes/connections MRD: Conditional independencies Approximating evidence: Number of layers (?)
29 More deepgp representation learning examples icub interaction Faces - autoencoder
30 Outline Part 1: Introduction Recap from yesterday Part 2: Incorporating latent variables Unsupervised GP learning Structure in the latent space - representation learning Bayesian Automatic Relevance Determination Deep GP learning Multi-view GP learning Part 3: Dynamical systems Policy learning Autoregressive Dynamics Going deeper: Deep Recurrent Gaussian Process Regressive dynamics with deep GPs
31 Dynamical systems examples Through a model, we wish to learn to: perform free simulation by learning patterns coming from the latent generation process (a mechanistic system we do not know) perform inter/extrapolation in time-series data which are very high-dimensional (e.g. video) detect outliers in data coming from a dynamical system optimize policies for control based on a model of the data.
32 Policy learning Model-based policy learning (Cart-pole) (Unicycle) Work by: Marc Deisenroth, Andrew McHutchon, Carl Rasmussen
33 NARX model A standard NARX model considers an input vector x i R D comprised of L y past observed outputs y i R and L u past exogenous inputs u i R: x i = [y i 1,, y i Ly, u i 1,, u i Lu ], y i = f(x i ) + ɛ (y) i, ɛ (y) i Latent auto-regressive GP model: N (ɛ (y) i 0, σy), 2 x i = f(x i 1,, x i Lx u i 1,, u i Lu ) + ɛ (x) i, y i = x i + ɛ (y) i, Contribution 1: Simultaneous auto-regressive and representation learning. Contribution 2: Latents avoid the feedback of possibly corrupted observations into the dynamics. Mattos, Damianou, Barreto, Lawrence, 2016
34 NARX model A standard NARX model considers an input vector x i R D comprised of L y past observed outputs y i R and L u past exogenous inputs u i R: x i = [y i 1,, y i Ly, u i 1,, u i Lu ], y i = f(x i ) + ɛ (y) i, ɛ (y) i Latent auto-regressive GP model: N (ɛ (y) i 0, σy), 2 x i = f(x i 1,, x i Lx u i 1,, u i Lu ) + ɛ (x) i, y i = x i + ɛ (y) i, Contribution 1: Simultaneous auto-regressive and representation learning. Contribution 2: Latents avoid the feedback of possibly corrupted observations into the dynamics. Mattos, Damianou, Barreto, Lawrence, 2016
35 Robustness to outliers Latent auto-regressive GP model: x i = f(x i 1,, x i Lx u i 1,, u i Lu ) + ɛ (x) i, y i = x i + ɛ (y) i, ɛ (x) i ɛ (y) i N (ɛ (x) i 0, σx), 2 N (ɛ (y) 0, τ 1 i i ), τ i Γ(τ i α, β), Contribution 3: Switching-off outliers by including the above Student-t likelihood for the noise. Mattos, Damianou, Barreto, Lawrence, DYCOPS 2016
36 Robust GP autoregressive model: demonstration (a) Artificial 1. (b) Artificial 2. Figure: RMSE values for free simulation on test data with different levels of contamination by outliers.
37 (c) Artificial 3. (d) Artificial 4. (e) Artificial 5.
38 Going deeper: Deep Recurrent Gaussian Process u x (1) x (2) x (H ) y Figure 1: RGP graphical model with H hidden layers. x is the lagged latent function values augmented with the lagged exogenous inputs. Mattos, Dai, Damianou, Barreto, Lawrence, ICLR 2016
39 Inference is tricky... log p(y) N L 2 Tr H+1 h=1 ( ( K (H+1) z log 2πσ 2 h 1 ) 1 Ψ (H+1) (σH+1 2 y Ψ (H+1) )2 1 { ( H + 1 N 2σ 2 h=1 h i=l K (h) z 1 2 K(h) + 1 ( µ (h)) (h) Ψ 2(σh 2)2 1 N ) i=l+1 x (h) i ( q 2σ 2 H+1 ) ) ( y y + Ψ (H+1) 0 K z (H+1) ( K z (H+1) + 1 Ψ (H+1) σh ( λ (h) i + µ (h)) µ (h) + Ψ (h) 0 Tr z + 1 Ψ (h) σh 2 2 ( K z (h) + 1 ) 1 ( Ψ (h) σh 2 2 ( ) L log q + x (h) i x (h) i i=1 x (h) i 1 2 K(H+1) z + 1 σh+1 2 ) 1 ( ) y Ψ (h) 1 ( q x (h) i Ψ (H+1) 1 ( ( K (h) z ) µ (h) ) ( log p Ψ (H+1) 2 ) 1 Ψ (h) 2 x (h) i ) }. ) )
40 Results Results in nonlinear systems identification: 1. artificial dataset 2. drive dataset: by a system with two electric motors that drive a pulley using a flexible belt. input: the sum of voltages applied to the motors output: speed of the belt.
41 RGP GPNARX MLP-NARX LSTM
42 RGP GPNARX MLP-NARX LSTM
43 Avatar control Figure: The generated motion with a step function signal, starting with walking (blue), switching to running (red) and switching back to walking (blue) Videos: Switching between learned speeds Interpolating (un)seen speed Constant unseen speed
44 Regressive dynamics with deep GPs t x h f Instead of coupling f s by encoding the Markov property, we couple them by coupling the f s inputs through another GP with time as input. y = f(x) + ɛ x GP(0, k x (t, t)) f GP(0, k f (x, x)) y Damianou et al., NIPS 2011; Damianou, Titsias and Lawrence, JMLR, 2015
45 Dynamics Dynamics are encoded in the covariance matrix K = k(t, t). We can consider special forms for K. Model individual sequences Model periodic data (missa) (dog) (mocap)
46 Summary Data-driven, model-based approach to real and control problems Gaussian processes: Uncertainty quantification / propagation gives an advantage Deep Gaussian processes: Representation learning + dynamics learning Future work: Deep Gaussian processes + mechanistic information; consider real applications. Thanks!
47 Thanks Thanks to: Neil Lawrence, Michalis Titsias, Carl Henrik Ek, James Hensman, Cesar Lincoln Mattos, Zhenwen Dai, Javier Gonzalez, Tony Prescott, Uriel Martinez-Hernandez, Luke Boorman Thanks to George Karniadakis, Paris Perdikaris for hosting me
48 BACKUP SLIDES 1: Deep GP Optimization - variational inference
49 MAP optimisation? x f 1 h 1 f 2 h 2 f 3 y Joint Distr. = p(y h 2 )p(h 2 h 1 )p(h 1 x) MAP optimization is extremely problematic because: Dimensionality of hs has to be decided a priori Prone to overfitting, if h are treated as parameters Deep structures are not supported by the model s objective but have to be forced [Lawrence & Moore 07] We want: To use the marginal likelihood as the objective: marg. lik. = h 2,h 1 p(y h 2 )p(h 2 h 1 )p(h 1 x) Further regularization tools.
50 Marginal likelihood is intractable Let s try to marginalize out the top layer only: p(h 2 ) = p(h 2 h 1 )p(h 1 )dh 1 = p(h 2 f 2 )p(f 2 h 1 )p(h 1 )df 2 h 1 = p(h 2 f 2 )[ ] p(f 2 h 1 )p(h 1 )dh 1 df 2 } {{ } Intractable! Intractability: h 1 appears non-linearly in p(f 2 h 1 ), inside K 1 (and also the determinant term), where K = k(h 1, h 1 ).
51 Solution: Variational Inference Similar issues arise for 1-layer models. Solution was given by Titsias and Lawrence, A small modification to that solution does the trick in deep GPs too. Extend Titsias method for variational learning of inducing variables in Sparse GPs. Analytic variational bound F p(y x) Approximately marginalise out h Hence obtain the approximate posterior q(h)
52 Inducing points: sparseness, tractability and Big Data h (1) f (1) h (2) f (2) h (30) f (30) h (31) f (31) h (N) f (N)
53 Inducing points: sparseness, tractability and Big Data h (1) f (1) h (2) f (2) h (30) f (30) h (i) u (i) h (31) f (31) h (N) f (N)
54 Inducing points: sparseness, tractability and Big Data h (1) f (1) h (2) f (2) h (30) f (30) h (i) u (i) h (31) f (31) h (N) f (N) Inducing points originally introduced for faster (sparse) GPs But this also induces tractability in our models, due to the conditional independencies assumed Viewing them as global variables extension to Big Data [Hensman et al., UAI 2013]
55 Bayesian regularization F = Data Fit KL (q(h 1 ) p(h 1 )) + } {{ } Regularisation L H (q(h l )) } {{ } Regularisation l=2
arxiv:1410.4984v1 [cs.dc] 18 Oct 2014
Gaussian Process Models with Parallelization and GPU acceleration arxiv:1410.4984v1 [cs.dc] 18 Oct 2014 Zhenwen Dai Andreas Damianou James Hensman Neil Lawrence Department of Computer Science University
More informationHT2015: SC4 Statistical Data Mining and Machine Learning
HT2015: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Bayesian Nonparametrics Parametric vs Nonparametric
More informationBayesian Time Series Learning with Gaussian Processes
Bayesian Time Series Learning with Gaussian Processes Roger Frigola-Alcalde Department of Engineering St Edmund s College University of Cambridge August 2015 This dissertation is submitted for the degree
More informationGaussian Processes for Big Data
Gaussian Processes for Big Data James Hensman Dept. Computer Science The University of Sheffield Sheffield, UK Nicolò Fusi Dept. Computer Science The University of Sheffield Sheffield, UK Neil D. Lawrence
More informationLearning Gaussian process models from big data. Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu
Learning Gaussian process models from big data Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu Machine learning seminar at University of Cambridge, July 4 2012 Data A lot of
More informationHybrid Discriminative-Generative Approach with Gaussian Processes
Hybrid Discriminative-Generative Approach with Gaussian Processes Ricardo Andrade-Pacheco acq11ra@sheffield.ac.uk Max Zwießele m.zwiessele@sheffield.ac.uk James Hensman james.hensman@sheffield.ac.uk Neil
More informationGaussian Process Dynamical Models
Gaussian Process Dynamical Models Jack M. Wang, David J. Fleet, Aaron Hertzmann Department of Computer Science University of Toronto, Toronto, ON M5S 3G4 {jmwang,hertzman}@dgp.toronto.edu, fleet@cs.toronto.edu
More informationModel-based Synthesis. Tony O Hagan
Model-based Synthesis Tony O Hagan Stochastic models Synthesising evidence through a statistical model 2 Evidence Synthesis (Session 3), Helsinki, 28/10/11 Graphical modelling The kinds of models that
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
More informationBayesian Statistics: Indian Buffet Process
Bayesian Statistics: Indian Buffet Process Ilker Yildirim Department of Brain and Cognitive Sciences University of Rochester Rochester, NY 14627 August 2012 Reference: Most of the material in this note
More informationGaussian Process Training with Input Noise
Gaussian Process Training with Input Noise Andrew McHutchon Department of Engineering Cambridge University Cambridge, CB PZ ajm57@cam.ac.uk Carl Edward Rasmussen Department of Engineering Cambridge University
More informationCell Phone based Activity Detection using Markov Logic Network
Cell Phone based Activity Detection using Markov Logic Network Somdeb Sarkhel sxs104721@utdallas.edu 1 Introduction Mobile devices are becoming increasingly sophisticated and the latest generation of smart
More informationNeural Networks for Machine Learning. Lecture 13a The ups and downs of backpropagation
Neural Networks for Machine Learning Lecture 13a The ups and downs of backpropagation Geoffrey Hinton Nitish Srivastava, Kevin Swersky Tijmen Tieleman Abdel-rahman Mohamed A brief history of backpropagation
More informationProbabilistic Models for Big Data. Alex Davies and Roger Frigola University of Cambridge 13th February 2014
Probabilistic Models for Big Data Alex Davies and Roger Frigola University of Cambridge 13th February 2014 The State of Big Data Why probabilistic models for Big Data? 1. If you don t have to worry about
More informationLearning Shared Latent Structure for Image Synthesis and Robotic Imitation
University of Washington CSE TR 2005-06-02 To appear in Advances in NIPS 18, 2006 Learning Shared Latent Structure for Image Synthesis and Robotic Imitation Aaron P. Shon Keith Grochow Aaron Hertzmann
More informationManifold Learning with Variational Auto-encoder for Medical Image Analysis
Manifold Learning with Variational Auto-encoder for Medical Image Analysis Eunbyung Park Department of Computer Science University of North Carolina at Chapel Hill eunbyung@cs.unc.edu Abstract Manifold
More informationCS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.
Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott
More informationAcknowledgments. Data Mining with Regression. Data Mining Context. Overview. Colleagues
Data Mining with Regression Teaching an old dog some new tricks Acknowledgments Colleagues Dean Foster in Statistics Lyle Ungar in Computer Science Bob Stine Department of Statistics The School of the
More informationGaussian Processes in Machine Learning
Gaussian Processes in Machine Learning Carl Edward Rasmussen Max Planck Institute for Biological Cybernetics, 72076 Tübingen, Germany carl@tuebingen.mpg.de WWW home page: http://www.tuebingen.mpg.de/ carl
More informationGaussian Processes for Big Data Problems
Gaussian Processes for Big Data Problems Marc Deisenroth Department of Computing Imperial College London http://wp.doc.ic.ac.uk/sml/marc-deisenroth Machine Learning Summer School, Chalmers University 14
More informationSwitching Gaussian Process Dynamic Models for Simultaneous Composite Motion Tracking and Recognition
Switching Gaussian Process Dynamic Models for Simultaneous Composite Motion Tracking and Recognition Jixu Chen Minyoung Kim Yu Wang Qiang Ji Department of Electrical,Computer and System Engineering Rensselaer
More informationCourse: Model, Learning, and Inference: Lecture 5
Course: Model, Learning, and Inference: Lecture 5 Alan Yuille Department of Statistics, UCLA Los Angeles, CA 90095 yuille@stat.ucla.edu Abstract Probability distributions on structured representation.
More informationDeterministic Sampling-based Switching Kalman Filtering for Vehicle Tracking
Proceedings of the IEEE ITSC 2006 2006 IEEE Intelligent Transportation Systems Conference Toronto, Canada, September 17-20, 2006 WA4.1 Deterministic Sampling-based Switching Kalman Filtering for Vehicle
More informationSampling via Moment Sharing: A New Framework for Distributed Bayesian Inference for Big Data
Sampling via Moment Sharing: A New Framework for Distributed Bayesian Inference for Big Data (Oxford) in collaboration with: Minjie Xu, Jun Zhu, Bo Zhang (Tsinghua) Balaji Lakshminarayanan (Gatsby) Bayesian
More informationBasics of Statistical Machine Learning
CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar
More informationMachine Learning and Pattern Recognition Logistic Regression
Machine Learning and Pattern Recognition Logistic Regression Course Lecturer:Amos J Storkey Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh Crichton Street,
More informationGaussian Process Latent Variable Models for Visualisation of High Dimensional Data
Gaussian Process Latent Variable Models for Visualisation of High Dimensional Data Neil D. Lawrence Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello Street, Sheffield,
More informationStatistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
More informationBayesian Image Super-Resolution
Bayesian Image Super-Resolution Michael E. Tipping and Christopher M. Bishop Microsoft Research, Cambridge, U.K..................................................................... Published as: Bayesian
More informationBayesian networks - Time-series models - Apache Spark & Scala
Bayesian networks - Time-series models - Apache Spark & Scala Dr John Sandiford, CTO Bayes Server Data Science London Meetup - November 2014 1 Contents Introduction Bayesian networks Latent variables Anomaly
More informationCoding and decoding with convolutional codes. The Viterbi Algor
Coding and decoding with convolutional codes. The Viterbi Algorithm. 8 Block codes: main ideas Principles st point of view: infinite length block code nd point of view: convolutions Some examples Repetition
More informationEfficient online learning of a non-negative sparse autoencoder
and Machine Learning. Bruges (Belgium), 28-30 April 2010, d-side publi., ISBN 2-93030-10-2. Efficient online learning of a non-negative sparse autoencoder Andre Lemme, R. Felix Reinhart and Jochen J. Steil
More informationMachine Learning and Data Analysis overview. Department of Cybernetics, Czech Technical University in Prague. http://ida.felk.cvut.
Machine Learning and Data Analysis overview Jiří Kléma Department of Cybernetics, Czech Technical University in Prague http://ida.felk.cvut.cz psyllabus Lecture Lecturer Content 1. J. Kléma Introduction,
More informationStatistical Models in Data Mining
Statistical Models in Data Mining Sargur N. Srihari University at Buffalo The State University of New York Department of Computer Science and Engineering Department of Biostatistics 1 Srihari Flood of
More informationMonte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091)
Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091) Magnus Wiktorsson Centre for Mathematical Sciences Lund University, Sweden Lecture 5 Sequential Monte Carlo methods I February
More informationGaussian Processes to Speed up Hamiltonian Monte Carlo
Gaussian Processes to Speed up Hamiltonian Monte Carlo Matthieu Lê Murray, Iain http://videolectures.net/mlss09uk_murray_mcmc/ Rasmussen, Carl Edward. "Gaussian processes to speed up hybrid Monte Carlo
More informationFeature Engineering in Machine Learning
Research Fellow Faculty of Information Technology, Monash University, Melbourne VIC 3800, Australia August 21, 2015 Outline A Machine Learning Primer Machine Learning and Data Science Bias-Variance Phenomenon
More informationMACHINE LEARNING IN HIGH ENERGY PHYSICS
MACHINE LEARNING IN HIGH ENERGY PHYSICS LECTURE #1 Alex Rogozhnikov, 2015 INTRO NOTES 4 days two lectures, two practice seminars every day this is introductory track to machine learning kaggle competition!
More informationINDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition)
INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition) Abstract Indirect inference is a simulation-based method for estimating the parameters of economic models. Its
More informationData Mining: An Overview. David Madigan http://www.stat.columbia.edu/~madigan
Data Mining: An Overview David Madigan http://www.stat.columbia.edu/~madigan Overview Brief Introduction to Data Mining Data Mining Algorithms Specific Eamples Algorithms: Disease Clusters Algorithms:
More informationPrinciples of Data Mining by Hand&Mannila&Smyth
Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences
More informationAUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.
AUTOMATION OF ENERGY DEMAND FORECASTING by Sanzad Siddique, B.S. A Thesis submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment of the Requirements for the Degree
More informationClassification. Chapter 3
Chapter 3 Classification In chapter we have considered regression problems, where the targets are real valued. Another important class of problems is classification problems, where we wish to assign an
More informationMachine Learning and Statistics: What s the Connection?
Machine Learning and Statistics: What s the Connection? Institute for Adaptive and Neural Computation School of Informatics, University of Edinburgh, UK August 2006 Outline The roots of machine learning
More informationClassification Problems
Classification Read Chapter 4 in the text by Bishop, except omit Sections 4.1.6, 4.1.7, 4.2.4, 4.3.3, 4.3.5, 4.3.6, 4.4, and 4.5. Also, review sections 1.5.1, 1.5.2, 1.5.3, and 1.5.4. Classification Problems
More informationDetection of changes in variance using binary segmentation and optimal partitioning
Detection of changes in variance using binary segmentation and optimal partitioning Christian Rohrbeck Abstract This work explores the performance of binary segmentation and optimal partitioning in the
More informationDynamical Binary Latent Variable Models for 3D Human Pose Tracking
Dynamical Binary Latent Variable Models for 3D Human Pose Tracking Graham W. Taylor New York University New York, USA gwtaylor@cs.nyu.edu Leonid Sigal Disney Research Pittsburgh, USA lsigal@disneyresearch.com
More informationExact Inference for Gaussian Process Regression in case of Big Data with the Cartesian Product Structure
Exact Inference for Gaussian Process Regression in case of Big Data with the Cartesian Product Structure Belyaev Mikhail 1,2,3, Burnaev Evgeny 1,2,3, Kapushev Yermek 1,2 1 Institute for Information Transmission
More informationStatistics Graduate Courses
Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
More informationUnsupervised Learning
Unsupervised Learning Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College London, UK zoubin@gatsby.ucl.ac.uk http://www.gatsby.ucl.ac.uk/~zoubin September 16, 2004 Abstract We give
More informationBig Data, Machine Learning, Causal Models
Big Data, Machine Learning, Causal Models Sargur N. Srihari University at Buffalo, The State University of New York USA Int. Conf. on Signal and Image Processing, Bangalore January 2014 1 Plan of Discussion
More informationMachine Learning Overview
Machine Learning Overview Sargur N. Srihari University at Buffalo, State University of New York USA 1 Outline 1. What is Machine Learning (ML)? 1. As a scientific Discipline 2. As an area of Computer Science/AI
More informationDynamic Modeling, Predictive Control and Performance Monitoring
Biao Huang, Ramesh Kadali Dynamic Modeling, Predictive Control and Performance Monitoring A Data-driven Subspace Approach 4y Spri nnger g< Contents Notation XIX 1 Introduction 1 1.1 An Overview of This
More informationChristfried Webers. Canberra February June 2015
c Statistical Group and College of Engineering and Computer Science Canberra February June (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 829 c Part VIII Linear Classification 2 Logistic
More informationLeast Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David
More informationMachine learning for algo trading
Machine learning for algo trading An introduction for nonmathematicians Dr. Aly Kassam Overview High level introduction to machine learning A machine learning bestiary What has all this got to do with
More informationFlexible and efficient Gaussian process models for machine learning
Flexible and efficient Gaussian process models for machine learning Edward Lloyd Snelson M.A., M.Sci., Physics, University of Cambridge, UK (2001) Gatsby Computational Neuroscience Unit University College
More informationInference on Phase-type Models via MCMC
Inference on Phase-type Models via MCMC with application to networks of repairable redundant systems Louis JM Aslett and Simon P Wilson Trinity College Dublin 28 th June 202 Toy Example : Redundant Repairable
More informationMixtures of Robust Probabilistic Principal Component Analyzers
Mixtures of Robust Probabilistic Principal Component Analyzers Cédric Archambeau, Nicolas Delannay 2 and Michel Verleysen 2 - University College London, Dept. of Computer Science Gower Street, London WCE
More informationBIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376
Course Director: Dr. Kayvan Najarian (DCM&B, kayvan@umich.edu) Lectures: Labs: Mondays and Wednesdays 9:00 AM -10:30 AM Rm. 2065 Palmer Commons Bldg. Wednesdays 10:30 AM 11:30 AM (alternate weeks) Rm.
More informationLogistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression
Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max
More informationMonte Carlo-based statistical methods (MASM11/FMS091)
Monte Carlo-based statistical methods (MASM11/FMS091) Jimmy Olsson Centre for Mathematical Sciences Lund University, Sweden Lecture 5 Sequential Monte Carlo methods I February 5, 2013 J. Olsson Monte Carlo-based
More informationMachine Learning in FX Carry Basket Prediction
Machine Learning in FX Carry Basket Prediction Tristan Fletcher, Fabian Redpath and Joe D Alessandro Abstract Artificial Neural Networks ANN), Support Vector Machines SVM) and Relevance Vector Machines
More informationLecture 3: Linear methods for classification
Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,
More informationSpatial Statistics Chapter 3 Basics of areal data and areal data modeling
Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Recall areal data also known as lattice data are data Y (s), s D where D is a discrete index set. This usually corresponds to data
More informationPattern Analysis. Logistic Regression. 12. Mai 2009. Joachim Hornegger. Chair of Pattern Recognition Erlangen University
Pattern Analysis Logistic Regression 12. Mai 2009 Joachim Hornegger Chair of Pattern Recognition Erlangen University Pattern Analysis 2 / 43 1 Logistic Regression Posteriors and the Logistic Function Decision
More informationThese slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher Bishop
Music and Machine Learning (IFT6080 Winter 08) Prof. Douglas Eck, Université de Montréal These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher
More informationReal-time Body Tracking Using a Gaussian Process Latent Variable Model
Real-time Body Tracking Using a Gaussian Process Latent Variable Model Shaobo Hou Aphrodite Galata Fabrice Caillette Neil Thacker Paul Bromiley School of Computer Science The University of Manchester ISBE
More informationData Analytics at NICTA. Stephen Hardy National ICT Australia (NICTA) shardy@nicta.com.au
Data Analytics at NICTA Stephen Hardy National ICT Australia (NICTA) shardy@nicta.com.au NICTA Copyright 2013 Outline Big data = science! Data analytics at NICTA Discrete Finite Infinite Machine Learning
More informationSimilarity Search and Mining in Uncertain Spatial and Spatio Temporal Databases. Andreas Züfle
Similarity Search and Mining in Uncertain Spatial and Spatio Temporal Databases Andreas Züfle Geo Spatial Data Huge flood of geo spatial data Modern technology New user mentality Great research potential
More informationCCNY. BME I5100: Biomedical Signal Processing. Linear Discrimination. Lucas C. Parra Biomedical Engineering Department City College of New York
BME I5100: Biomedical Signal Processing Linear Discrimination Lucas C. Parra Biomedical Engineering Department CCNY 1 Schedule Week 1: Introduction Linear, stationary, normal - the stuff biology is not
More informationTracking in flussi video 3D. Ing. Samuele Salti
Seminari XXIII ciclo Tracking in flussi video 3D Ing. Tutors: Prof. Tullio Salmon Cinotti Prof. Luigi Di Stefano The Tracking problem Detection Object model, Track initiation, Track termination, Tracking
More informationSemi-Supervised Support Vector Machines and Application to Spam Filtering
Semi-Supervised Support Vector Machines and Application to Spam Filtering Alexander Zien Empirical Inference Department, Bernhard Schölkopf Max Planck Institute for Biological Cybernetics ECML 2006 Discovery
More informationMonocular 3D Human Motion Tracking Using Dynamic Probabilistic Latent Semantic Analysis
Monocular 3D Human Motion Tracking Using Dynamic Probabilistic Latent Semantic Analysis Kooksang Moon and Vladimir Pavlović Department of Computer Science, Rutgers University, Piscataway, NJ 8854 {ksmoon,
More informationA Hybrid Neural Network-Latent Topic Model
Li Wan Leo Zhu Rob Fergus Dept. of Computer Science, Courant institute, New York University {wanli,zhu,fergus}@cs.nyu.edu Abstract This paper introduces a hybrid model that combines a neural network with
More informationPeople Tracking with the Laplacian Eigenmaps Latent Variable Model
People Tracking with the Laplacian Eigenmaps Latent Variable Model Zhengdong Lu CSEE, OGI, OHSU zhengdon@csee.ogi.edu Miguel Á. Carreira-Perpiñán EECS, UC Merced http://eecs.ucmerced.edu Cristian Sminchisescu
More informationArtificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence
Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network? - Perceptron learners - Multi-layer networks What is a Support
More informationSupervised Learning (Big Data Analytics)
Supervised Learning (Big Data Analytics) Vibhav Gogate Department of Computer Science The University of Texas at Dallas Practical advice Goal of Big Data Analytics Uncover patterns in Data. Can be used
More informationLinear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
More informationLarge-Scale Similarity and Distance Metric Learning
Large-Scale Similarity and Distance Metric Learning Aurélien Bellet Télécom ParisTech Joint work with K. Liu, Y. Shi and F. Sha (USC), S. Clémençon and I. Colin (Télécom ParisTech) Séminaire Criteo March
More informationBayesian probability theory
Bayesian probability theory Bruno A. Olshausen arch 1, 2004 Abstract Bayesian probability theory provides a mathematical framework for peforming inference, or reasoning, using probability. The foundations
More informationParametric Statistical Modeling
Parametric Statistical Modeling ECE 275A Statistical Parameter Estimation Ken Kreutz-Delgado ECE Department, UC San Diego Ken Kreutz-Delgado (UC San Diego) ECE 275A SPE Version 1.1 Fall 2012 1 / 12 Why
More informationLocal Gaussian Process Regression for Real Time Online Model Learning and Control
Local Gaussian Process Regression for Real Time Online Model Learning and Control Duy Nguyen-Tuong Jan Peters Matthias Seeger Max Planck Institute for Biological Cybernetics Spemannstraße 38, 776 Tübingen,
More informationExtreme Value Modeling for Detection and Attribution of Climate Extremes
Extreme Value Modeling for Detection and Attribution of Climate Extremes Jun Yan, Yujing Jiang Joint work with Zhuo Wang, Xuebin Zhang Department of Statistics, University of Connecticut February 2, 2016
More informationAn Introduction to Machine Learning
An Introduction to Machine Learning L5: Novelty Detection and Regression Alexander J. Smola Statistical Machine Learning Program Canberra, ACT 0200 Australia Alex.Smola@nicta.com.au Tata Institute, Pune,
More informationEE 570: Location and Navigation
EE 570: Location and Navigation On-Line Bayesian Tracking Aly El-Osery 1 Stephen Bruder 2 1 Electrical Engineering Department, New Mexico Tech Socorro, New Mexico, USA 2 Electrical and Computer Engineering
More informationData Mining mit der JMSL Numerical Library for Java Applications
Data Mining mit der JMSL Numerical Library for Java Applications Stefan Sineux 8. Java Forum Stuttgart 07.07.2005 Agenda Visual Numerics JMSL TM Numerical Library Neuronale Netze (Hintergrund) Demos Neuronale
More information11. Time series and dynamic linear models
11. Time series and dynamic linear models Objective To introduce the Bayesian approach to the modeling and forecasting of time series. Recommended reading West, M. and Harrison, J. (1997). models, (2 nd
More informationMachine learning in financial forecasting. Haindrich Henrietta Vezér Evelin
Machine learning in financial forecasting Haindrich Henrietta Vezér Evelin Contents Financial forecasting Window Method Machine learning-past and future MLP (Multi-layer perceptron) Gaussian Process Bibliography
More informationComparing GPLVM Approaches for Dimensionality Reduction in Character Animation
Comparing GPLVM Approaches for Dimensionality Reduction in Character Animation Sébastien Quirion 1 Chantale Duchesne 1 Denis Laurendeau 1 Mario Marchand 2 1 Computer Vision and Systems Laboratory Dept.
More informationNTC Project: S01-PH10 (formerly I01-P10) 1 Forecasting Women s Apparel Sales Using Mathematical Modeling
1 Forecasting Women s Apparel Sales Using Mathematical Modeling Celia Frank* 1, Balaji Vemulapalli 1, Les M. Sztandera 2, Amar Raheja 3 1 School of Textiles and Materials Technology 2 Computer Information
More informationLinear Classification. Volker Tresp Summer 2015
Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong
More informationFiltered Gaussian Processes for Learning with Large Data-Sets
Filtered Gaussian Processes for Learning with Large Data-Sets Jian Qing Shi, Roderick Murray-Smith 2,3, D. Mike Titterington 4,and Barak A. Pearlmutter 3 School of Mathematics and Statistics, University
More informationAnalysis of Bayesian Dynamic Linear Models
Analysis of Bayesian Dynamic Linear Models Emily M. Casleton December 17, 2010 1 Introduction The main purpose of this project is to explore the Bayesian analysis of Dynamic Linear Models (DLMs). The main
More informationExploratory Data Analysis Using Radial Basis Function Latent Variable Models
Exploratory Data Analysis Using Radial Basis Function Latent Variable Models Alan D. Marrs and Andrew R. Webb DERA St Andrews Road, Malvern Worcestershire U.K. WR14 3PS {marrs,webb}@signal.dera.gov.uk
More informationUW CSE Technical Report 03-06-01 Probabilistic Bilinear Models for Appearance-Based Vision
UW CSE Technical Report 03-06-01 Probabilistic Bilinear Models for Appearance-Based Vision D.B. Grimes A.P. Shon R.P.N. Rao Dept. of Computer Science and Engineering University of Washington Seattle, WA
More informationRecent Developments of Statistical Application in. Finance. Ruey S. Tsay. Graduate School of Business. The University of Chicago
Recent Developments of Statistical Application in Finance Ruey S. Tsay Graduate School of Business The University of Chicago Guanghua Conference, June 2004 Summary Focus on two parts: Applications in Finance:
More informationMachine Learning. CUNY Graduate Center, Spring 2013. Professor Liang Huang. huang@cs.qc.cuny.edu
Machine Learning CUNY Graduate Center, Spring 2013 Professor Liang Huang huang@cs.qc.cuny.edu http://acl.cs.qc.edu/~lhuang/teaching/machine-learning Logistics Lectures M 9:30-11:30 am Room 4419 Personnel
More informationMachine Learning for Medical Image Analysis. A. Criminisi & the InnerEye team @ MSRC
Machine Learning for Medical Image Analysis A. Criminisi & the InnerEye team @ MSRC Medical image analysis the goal Automatic, semantic analysis and quantification of what observed in medical scans Brain
More informationWeb-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni
1 Web-based Supplementary Materials for Bayesian Effect Estimation Accounting for Adjustment Uncertainty by Chi Wang, Giovanni Parmigiani, and Francesca Dominici In Web Appendix A, we provide detailed
More information