# An EM algorithm for the estimation of affine state-space systems with or without known inputs


Alexander W. Blocker

January 2008

## Abstract

We derive an EM algorithm for the estimation of affine Gaussian state-space systems with or without known inputs. We also provide comments on its numerical implementation and application, and note recent extensions to predictive state representations.

## 1 Introduction

Linear and affine state-space systems are useful across a wide range of applications, from control theory and signal processing to speech recognition and interest-rate modelling. However, despite their wide applicability and theoretical simplicity, the estimation of such systems is nontrivial. One of the most straightforward approaches is to apply the expectation-maximization (EM) algorithm originally developed by Dempster, Laird, and Rubin (1), treating the unobserved state variable as missing data. This problem was previously addressed for the linear case without control inputs in (2); here, that approach is extended to the affine case with and without known inputs. Some notes and cautions on the algorithm's implementation and usage are also provided.

## 2 EM Literature

As mentioned above, the EM algorithm was first formally presented by Dempster, Laird, and Rubin in 1977 (1). It is a powerful method for obtaining maximum-likelihood parameter estimates in the presence of missing data. The idea of the algorithm is to alternate between computing the expectation of the sample log-likelihood conditional on the previous parameter estimates (the expectation or "E" step) and maximizing this expectation with respect to the desired parameters to obtain the estimates for the next iteration (the maximization or "M" step). In (1), it was shown that this procedure is guaranteed to produce a monotonically increasing sequence of expected sample log-likelihoods which converges to a local maximum of the likelihood function (if the likelihood function is unimodal, this is clearly the unique ML estimate).
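As a concrete illustration of this E/M alternation and of the monotone-likelihood property, the following minimal Python sketch fits a two-component Gaussian mixture with known unit variances, a much simpler model than the state-space system treated in this paper; all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two-component univariate Gaussian mixture with unit variances.
x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 1.0, 200)])

def log_likelihood(x, pi, mu):
    # log p(x) under the mixture pi * N(mu[0], 1) + (1 - pi) * N(mu[1], 1)
    comp = np.stack([pi * np.exp(-0.5 * (x - mu[0]) ** 2),
                     (1 - pi) * np.exp(-0.5 * (x - mu[1]) ** 2)])
    return np.sum(np.log(comp.sum(axis=0) / np.sqrt(2 * np.pi)))

pi, mu = 0.5, np.array([-1.0, 1.0])   # deliberately poor initial guess
lls = []
for _ in range(50):
    # E-step: posterior responsibility of component 0 for each point
    w0 = pi * np.exp(-0.5 * (x - mu[0]) ** 2)
    w1 = (1 - pi) * np.exp(-0.5 * (x - mu[1]) ** 2)
    r = w0 / (w0 + w1)
    # M-step: maximize the expected complete-data log-likelihood
    pi = r.mean()
    mu = np.array([np.sum(r * x) / r.sum(),
                   np.sum((1 - r) * x) / (1 - r).sum()])
    lls.append(log_likelihood(x, pi, mu))

# The log-likelihood sequence is monotonically non-decreasing, as guaranteed.
assert all(b >= a - 1e-7 for a, b in zip(lls, lls[1:]))
```

The same E/M structure carries over to the state-space case, with the responsibilities replaced by the smoothed state moments computed below.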

The application of the EM algorithm to state-space systems has been an active area of research for over 20 years. Shumway and Stoffer presented some of the earliest work on the topic in (3), which was then extended by Ghahramani and Hinton in (2). The latter piece forms the basis for the procedure presented herein. Some examples of how this approach has been applied can be found in (4) and (5).

## 3 Derivation

In its most general form, the model presented here is:

$$x_t = F x_{t-1} + B u_t + \varepsilon_t, \qquad \varepsilon_t \sim N_k(0, Q) \text{ i.i.d.}$$

$$z_t = h + H x_t + \eta_t, \qquad \eta_t \sim N_n(0, R) \text{ i.i.d.}$$

We assume that $\varepsilon_t$ and $\eta_s$ are independent for all $t$ and $s$. This is the basic setup of an affine Gaussian state-space system, where $z_t$ is the observable output variable, $u_t$ is a known input, and $x_t$ is the unobserved state variable (the noise variables $\varepsilon_t$ and $\eta_t$ are also unobserved). The parameters are the transition matrix $F$, input matrix $B$, observation matrix $H$, affine term $h$, and the covariance matrices $Q$ and $R$.

Such a model is motivated by the problem of measurement errors. The state variable of interest $x_t$ evolves according to some linear dynamics with known inputs. However, we can only observe $z_t$, a noisy, affine-transformed version of $x_t$. When considered in this framework, the restriction of independent errors appears quite mild, as $\eta_t$ is just measurement noise.

It is important to note that, conditional on $x_t$, $z_t$ is a multivariate normally distributed random variable with mean $h + H x_t$ and covariance matrix $R$. Similarly, conditional on $x_{t-1}$ and $u_t$, $x_t$ is a multivariate normally distributed random variable with mean $F x_{t-1} + B u_t$ and covariance matrix $Q$.
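The affine state-space model just described can be simulated directly. The following Python sketch draws a sample path; the particular parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def simulate_affine_ssm(F, B, H, h, Q, R, u, x0, rng):
    """Simulate x_t = F x_{t-1} + B u_t + eps_t,  z_t = h + H x_t + eta_t."""
    T = u.shape[0]
    k, n = F.shape[0], H.shape[0]
    x = np.empty((T, k))
    z = np.empty((T, n))
    x_prev = x0
    for t in range(T):
        # State transition with known input and Gaussian process noise
        x_prev = F @ x_prev + B @ u[t] + rng.multivariate_normal(np.zeros(k), Q)
        x[t] = x_prev
        # Affine observation with Gaussian measurement noise
        z[t] = h + H @ x_prev + rng.multivariate_normal(np.zeros(n), R)
    return x, z

rng = np.random.default_rng(1)
F = np.array([[0.9, 0.1], [0.0, 0.8]])   # stable transition matrix
B = np.array([[1.0], [0.5]])
H = np.array([[1.0, 0.0]])
h = np.array([0.2])
Q = 0.1 * np.eye(2)
R = np.array([[0.05]])
u = np.ones((200, 1))                    # constant known input
x, z = simulate_affine_ssm(F, B, H, h, Q, R, u, np.zeros(2), rng)
```

Only `z` (and the known `u`) would be available to the estimator; `x` is kept here just to check the simulation.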
Thus, using the fact that the given process is Markovian and adding the assumption that the initial state $x_1$ is unconditionally normally distributed with mean $\mu$ and covariance $V_1$, we obtain the following log-likelihood function (up to an additive constant):

$$\ell(X, Z \mid \theta) = -\frac{1}{2} \Bigg[ \sum_{t=2}^{T} (x_t - F x_{t-1} - B u_t)' Q^{-1} (x_t - F x_{t-1} - B u_t) + (T-1) \log |Q|$$

$$\qquad + \sum_{t=1}^{T} (z_t - h - H x_t)' R^{-1} (z_t - h - H x_t) + T \log |R| + (x_1 - \mu)' V_1^{-1} (x_1 - \mu) + \log |V_1| \Bigg] + \text{const.}$$

where $X$ is the $T \times k$ matrix of hidden states, $Z$ is the $T \times n$ matrix of observations, and $\theta$ is a vector containing the given parameters. Taking matrix derivatives, we obtain the following simplified forms for the first-order conditions (FOCs):

$$F: \quad \sum_{t=2}^{T} \left( x_t x_{t-1}' - F x_{t-1} x_{t-1}' - B u_t x_{t-1}' \right) = 0$$

$$B: \quad \sum_{t=2}^{T} \left( x_t u_t' - F x_{t-1} u_t' - B u_t u_t' \right) = 0$$

$$Q: \quad \sum_{t=2}^{T} (x_t - F x_{t-1} - B u_t)(x_t - F x_{t-1} - B u_t)' = (T-1) Q$$

$$h: \quad \sum_{t=1}^{T} (z_t - h - H x_t) = 0$$

$$H: \quad \sum_{t=1}^{T} \left( z_t x_t' - h x_t' - H x_t x_t' \right) = 0$$

$$R: \quad \sum_{t=1}^{T} (z_t - h - H x_t)(z_t - h - H x_t)' = T R$$

$$\mu: \quad x_1 = \mu$$

$$V_1: \quad (x_1 - \mu)(x_1 - \mu)' = V_1$$

After taking the expectations of the above FOCs, we can obtain the formulas for the M-step of the EM algorithm by solving for the desired parameters. However, before turning to this step, we must consider the E-step. As our state-space system has linear state dynamics and the measurement equation is affine in the state, a combination of the Kalman filter and smoother can be used to compute the expectations of the necessary statistics.

The filtering is performed via the standard Kalman filter recursions. Define $\hat{x}_t$ as the estimated state at time $t$, $V_t$ as the estimated state variance at time $t$, $P_t$ as $E[x_t x_t'] = V_t + \hat{x}_t \hat{x}_t'$, and $P_{t,t-1}$ as $E[x_t x_{t-1}'] = V_{t,t-1} + \hat{x}_t \hat{x}_{t-1}'$. Any variable with a $-$ superscript indicates an initial (predicted) estimate, and an $f$ superscript indicates a filtered estimate (as opposed to the final, smoothed estimates). The forward recursion equations, as in (6), are:

$$\hat{x}_t^- = F \hat{x}_{t-1}^f + B u_t$$

$$V_t^- = F V_{t-1}^f F' + Q$$

$$K_t = V_t^- H' \left( H V_t^- H' + R \right)^{-1}$$

$$\hat{x}_t^f = \hat{x}_t^- + K_t \left( z_t - h - H \hat{x}_t^- \right)$$

$$V_t^f = (I - K_t H) V_t^-$$

These formulas are applied recursively from $t = 1$ through $T$, with the initial values given by $\mu$ and $V_1$. The smoother is then run in the opposite direction, from $t = T$ through $1$. The equations for this backward recursion, as in (4) and (3), are:

$$J_t = V_t^f F' \left( V_{t+1}^- \right)^{-1}$$

$$\hat{x}_t = \hat{x}_t^f + J_t \left( \hat{x}_{t+1} - F \hat{x}_t^f - B u_{t+1} \right)$$

$$V_t = V_t^f + J_t \left( V_{t+1} - V_{t+1}^- \right) J_t'$$

$$V_{t,t-1} = V_t^f J_{t-1}' + J_t \left( V_{t+1,t} - F V_t^f \right) J_{t-1}'$$

The values for time $T$ are given by $\hat{x}_T = \hat{x}_T^f$, $V_T = V_T^f$, and $V_{T,T-1} = (I - K_T H) F V_{T-1}^f$.

Using the series $\{\hat{x}_t\}$, $\{P_t\}$, and $\{P_{t,t-1}\}$ from the E-step, we can write the solutions to the FOCs in expectation terms as:

$$\hat{F} = \left( \sum_{t=2}^{T} P_{t,t-1} - \hat{B} \sum_{t=2}^{T} u_t \hat{x}_{t-1}' \right) \left( \sum_{t=2}^{T} P_{t-1} \right)^{-1}$$

$$\hat{B} = \left[ \sum_{t=2}^{T} \hat{x}_t u_t' - \sum_{t=2}^{T} P_{t,t-1} \left( \sum_{t=2}^{T} P_{t-1} \right)^{-1} \sum_{t=2}^{T} \hat{x}_{t-1} u_t' \right] \left[ \sum_{t=2}^{T} u_t u_t' - \sum_{t=2}^{T} u_t \hat{x}_{t-1}' \left( \sum_{t=2}^{T} P_{t-1} \right)^{-1} \sum_{t=2}^{T} \hat{x}_{t-1} u_t' \right]^{-1}$$

$$\hat{Q} = \frac{1}{T-1} \sum_{t=2}^{T} \left( P_t - \hat{F} P_{t,t-1}' - \hat{B} u_t \hat{x}_t' \right)$$

$$\hat{h} = \frac{1}{T} \sum_{t=1}^{T} \left( z_t - \hat{H} \hat{x}_t \right)$$

$$\hat{H} = \left[ \sum_{t=1}^{T} z_t \hat{x}_t' - \frac{1}{T} \left( \sum_{t=1}^{T} z_t \right) \left( \sum_{t=1}^{T} \hat{x}_t' \right) \right] \left[ \sum_{t=1}^{T} P_t - \frac{1}{T} \left( \sum_{t=1}^{T} \hat{x}_t \right) \left( \sum_{t=1}^{T} \hat{x}_t' \right) \right]^{-1}$$

$$\hat{R} = \frac{1}{T} \sum_{t=1}^{T} \left( z_t z_t' - \hat{h} z_t' - \hat{H} \hat{x}_t z_t' \right)$$

$$\hat{\mu} = \hat{x}_1, \qquad \hat{V}_1 = V_1$$

These are the key equations for the M-step of the algorithm. To modify them for a version of the model without inputs or an affine measurement term, the only change required is setting $\hat{B}$ or $\hat{h}$ to zero in the relevant equations. Applying both of these restrictions leads to the same M-step equations found in (2).

## 4 Implementation

When writing a numerical implementation of this algorithm, setting the proper halting condition is of great importance. In (1), Dempster, Laird, and Rubin show that an EM algorithm,
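The E-step described above (Kalman filter, backward smoother, and lag-one covariance smoother) can be sketched in NumPy as follows. This is a minimal illustration under the conventions used here; the function and variable names are my own, and for clarity it uses explicit matrix inverses rather than the numerically preferable solves or square-root forms.

```python
import numpy as np

def e_step(z, u, F, B, H, h, Q, R, mu, V1):
    """Kalman filter + RTS smoother for x_t = F x_{t-1} + B u_t + eps_t,
    z_t = h + H x_t + eta_t. Returns smoothed means, variances, and lag-one
    cross-covariances V_{t,t-1}: the statistics the M-step needs."""
    T, k = z.shape[0], F.shape[0]
    I = np.eye(k)
    xp, Vp = np.empty((T, k)), np.empty((T, k, k))   # predicted (- superscript)
    xf, Vf = np.empty((T, k)), np.empty((T, k, k))   # filtered (f superscript)
    # Forward recursion; at t = 1 the prediction is the prior (mu, V1).
    for t in range(T):
        if t == 0:
            xp[t], Vp[t] = mu, V1
        else:
            xp[t] = F @ xf[t - 1] + B @ u[t]
            Vp[t] = F @ Vf[t - 1] @ F.T + Q
        K = Vp[t] @ H.T @ np.linalg.inv(H @ Vp[t] @ H.T + R)
        xf[t] = xp[t] + K @ (z[t] - h - H @ xp[t])
        Vf[t] = (I - K @ H) @ Vp[t]
    # Backward recursion for smoothed moments.
    xs, Vs = xf.copy(), Vf.copy()
    J = np.empty((T - 1, k, k))
    for t in range(T - 2, -1, -1):
        J[t] = Vf[t] @ F.T @ np.linalg.inv(Vp[t + 1])
        xs[t] = xf[t] + J[t] @ (xs[t + 1] - F @ xf[t] - B @ u[t + 1])
        Vs[t] = Vf[t] + J[t] @ (Vs[t + 1] - Vp[t + 1]) @ J[t].T
    # Lag-one covariance smoother; Vlag[t] holds V_{t+2,t+1} (1-based times).
    Vlag = np.empty((T - 1, k, k))
    Vlag[-1] = (I - K @ H) @ F @ Vf[T - 2]    # K here is K_T from the filter
    for t in range(T - 3, -1, -1):
        Vlag[t] = (Vf[t + 1] @ J[t].T
                   + J[t + 1] @ (Vlag[t + 1] - F @ Vf[t + 1]) @ J[t].T)
    return xs, Vs, Vlag
```

From the returned arrays one forms $P_t = V_t + \hat{x}_t \hat{x}_t'$ and $P_{t,t-1} = V_{t,t-1} + \hat{x}_t \hat{x}_{t-1}'$ and plugs them into the M-step equations above.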

such as the one given above, is guaranteed to converge to at least a local maximum. Thus, in theory, one must simply tell the algorithm to stop when the expected log-likelihoods produced by two sequential iterations are identical. In practice, this is not feasible due to numerical imprecision and the sheer number of iterations that is often required to reach such a fixed point. Therefore, the typical halting condition used for applications such as this is a measure of the relative change in expected log-likelihood. Define $\ell_k$ as the expected log-likelihood of the $k$th iteration. A typical halting condition would be:

$$\text{Stop if} \quad \frac{\ell_k - \ell_{k-1}}{\left| (\ell_k + \ell_{k-1} + \varepsilon)/2 \right|} < c$$

The averaging in the denominator is used to increase the stability of the condition, and the $\varepsilon$ term is a small non-zero value used to keep the condition well-behaved in the event a fixed point is reached. A typical value for $c$ in such a condition is between $10^{-4}$ and $10^{-6}$.

Initialization of the algorithm is also an important consideration. As mentioned earlier, an EM algorithm is only guaranteed to converge to a local maximum of the likelihood function, not the global maximum. For problems in low dimensions with relatively few parameters to be estimated, this is not as large a concern. In particular, if the likelihood function is unimodal, initialization will only affect the speed of convergence, not the final result. Unfortunately, for higher-dimensional problems (such as many state-space estimation problems), initialization has a significant impact on the estimates produced by the EM algorithm. Thus, using the EM algorithm on such systems with little or no prior knowledge is frequently unproductive. If one has some prior knowledge about the system's parameters, the best course of action is to initialize based on this information. One of the best examples of this situation would be estimating the parameters for a physical system with partially understood system and observation matrices.
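This halting rule is straightforward to wrap in a small helper; a sketch, with the default tolerance chosen from the typical range quoted above and the function name my own:

```python
def em_converged(ll_new, ll_old, tol=1e-5, eps=1e-12):
    """Relative-change halting condition for an EM loop: stop when the
    increase in expected log-likelihood is small relative to its average
    magnitude. eps keeps the test well-behaved at an exact fixed point."""
    return (ll_new - ll_old) / abs((ll_new + ll_old + eps) / 2.0) < tol
```

In the main loop one would call this each iteration, e.g. `if em_converged(lls[-1], lls[-2]): break`.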
Another possible method relies on an assumption of independence. If one believes that the final observations should be independent, independent component analysis could be used to obtain an initial estimate for the observation matrix $H$. However, this is a relatively strong assumption to make. Overall, the local-maximization property of the EM algorithm makes it significantly more useful for tuning a relatively known model than for blind estimation of an unknown system.

## 5 Conclusion

We have presented a tractable EM algorithm for the estimation of affine state-space systems with known inputs and provided some notes and cautions on its usage. For lower-dimensional systems, this method is quite effective; however, for higher-dimensional systems, its performance deteriorates significantly due to the local-maximization property of EM algorithms. To get past this limitation, some recent work has focused on predictive state representations (PSRs), which define systems using statistics over future observations. In (7), Rudary, Singh, and Wingate show that any uncontrolled linear state-space system has an equivalent PSR. They also demonstrate an algorithm for the estimation of the latter form, which is extended

to controlled systems in (8). Their approach appears promising, but the basis for comparison (and, in some cases, the preferred approach) remains expectation-maximization algorithms of the type presented here.

## References

[1] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society, Series B (Methodological), vol. 39, no. 1, pp. 1–38, 1977.

[2] Z. Ghahramani and G. E. Hinton, "Parameter estimation for linear dynamical systems," Tech. Rep. CRG-TR-96-2, Department of Computer Science, University of Toronto, 1996.

[3] R. H. Shumway and D. S. Stoffer, "An approach to time series smoothing and forecasting using the EM algorithm," Journal of Time Series Analysis, vol. 3, no. 4, pp. 253–264, 1982.

[4] V. Digalakis, J. R. Rohlicek, and M. Ostendorf, "ML estimation of a stochastic linear system with the EM algorithm and its application to speech recognition," IEEE Transactions on Speech and Audio Processing, vol. 1, pp. 431–442, October 1993.

[5] A. Hero, H. Messer, J. Goldberg, D. Thomson, M. Amin, G. Giannakis, A. Swami, J. Tugnait, A. Nehorai, A. Swindlehurst, et al., "Highlights of statistical signal and array processing," IEEE Signal Processing Magazine, vol. 15, no. 5, 1998.

[6] G. Welch and G. Bishop, "An introduction to the Kalman filter," SIGGRAPH Course Packet, 2001.

[7] M. Rudary, S. Singh, and D. Wingate, "Predictive linear-Gaussian models of stochastic dynamical systems," working paper.

[8] M. Rudary and S. Singh, "Predictive linear-Gaussian models of controlled stochastic dynamical systems," in ICML '06: Proceedings of the 23rd International Conference on Machine Learning, New York, NY, USA: ACM, 2006.


Overview 1 Introduction Longitudinal Data Variation and Correlation Different Approaches 2 Mixed Models Linear Mixed Models Generalized Linear Mixed Models 3 Marginal Models Linear Models Generalized Linear

### Sales Rate and Cumulative Sales Forecasting Using Kalman Filtering Techniques

1 Sales Rate and Cumulative Sales Forecasting Using Kalman Filtering Techniques Michael Munroe, Intel Corporation Ted Hench, Arizona State University, 480 614 1496, ted.hench@asu.edu Matthew Timmons, Arizona

### Stock Price Prediction using Hidden Markov Model

Stock Price Prediction using Hidden Markov Model Nguyet Nguyen Youngstown State University, Ohio, USA Email: ntnguyen01@ysu.edu March 21, 2016 Abstract The stock market is an important indicator which

### Normalization and Mixed Degrees of Integration in Cointegrated Time Series Systems

Normalization and Mixed Degrees of Integration in Cointegrated Time Series Systems Robert J. Rossana Department of Economics, 04 F/AB, Wayne State University, Detroit MI 480 E-Mail: r.j.rossana@wayne.edu

### An Introduction to Statistical Machine Learning - Overview -

An Introduction to Statistical Machine Learning - Overview - Samy Bengio bengio@idiap.ch Dalle Molle Institute for Perceptual Artificial Intelligence (IDIAP) CP 592, rue du Simplon 4 1920 Martigny, Switzerland

### Web User Segmentation Based on a Mixture of Factor Analyzers

Web User Segmentation Based on a Mixture of Factor Analyzers Yanzan Kevin Zhou 1 and Bamshad Mobasher 2 1 ebay Inc., San Jose, CA yanzzhou@ebay.com 2 DePaul University, Chicago, IL mobasher@cs.depaul.edu

### THREE DIMENSIONAL REPRESENTATION OF AMINO ACID CHARAC- TERISTICS

THREE DIMENSIONAL REPRESENTATION OF AMINO ACID CHARAC- TERISTICS O.U. Sezerman 1, R. Islamaj 2, E. Alpaydin 2 1 Laborotory of Computational Biology, Sabancı University, Istanbul, Turkey. 2 Computer Engineering

### Improved Residual Analysis in ADC Testing. István Kollár

Improved Residual Analysis in ADC Testing István Kollár Dept. of Measurement and Information Systems Budapest University of Technology and Economics H-1521 Budapest, Magyar tudósok krt. 2., HUNGARY Tel.:

### Herding, Contrarianism and Delay in Financial Market Trading

Herding, Contrarianism and Delay in Financial Market Trading A Lab Experiment Andreas Park & Daniel Sgroi University of Toronto & University of Warwick Two Restaurants E cient Prices Classic Herding Example:

### Tagging with Hidden Markov Models

Tagging with Hidden Markov Models Michael Collins 1 Tagging Problems In many NLP problems, we would like to model pairs of sequences. Part-of-speech (POS) tagging is perhaps the earliest, and most famous,

### Gaussian Processes in Machine Learning

Gaussian Processes in Machine Learning Carl Edward Rasmussen Max Planck Institute for Biological Cybernetics, 72076 Tübingen, Germany carl@tuebingen.mpg.de WWW home page: http://www.tuebingen.mpg.de/ carl

### Stability of the LMS Adaptive Filter by Means of a State Equation

Stability of the LMS Adaptive Filter by Means of a State Equation Vítor H. Nascimento and Ali H. Sayed Electrical Engineering Department University of California Los Angeles, CA 90095 Abstract This work

### Chapter 3: The Multiple Linear Regression Model

Chapter 3: The Multiple Linear Regression Model Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans November 23, 2013 Christophe Hurlin (University of Orléans) Advanced Econometrics

### Linear concepts and hidden variables: An empirical study

Linear concepts and hidden variables An empirical study Adam J. Grove NEC Research Institute 4 Independence Way Princeton NJ 854 grove@research.nj.nec.com Dan Roth Department of Computer Science University

### Analysis of the Largest Normalized Residual Test Robustness for Measurements Gross Errors Processing in the WLS State Estimator

Analysis of the Largest Normalized Residual Test Robustness for s Gross Errors Processing in the WLS State Estimator Breno CARVALHO Electrical Engineering Department, University of São Paulo São Carlos,

### APPLICATION OF HIDDEN MARKOV CHAINS IN QUALITY CONTROL

INTERNATIONAL JOURNAL OF ELECTRONICS; MECHANICAL and MECHATRONICS ENGINEERING Vol.2 Num.4 pp.(353-360) APPLICATION OF HIDDEN MARKOV CHAINS IN QUALITY CONTROL Hanife DEMIRALP 1, Ehsan MOGHIMIHADJI 2 1 Department

### Classifying Galaxies using a data-driven approach

Classifying Galaxies using a data-driven approach Supervisor : Prof. David van Dyk Department of Mathematics Imperial College London London, April 2015 Outline The Classification Problem 1 The Classification

### Chapter 4: Artificial Neural Networks

Chapter 4: Artificial Neural Networks CS 536: Machine Learning Littman (Wu, TA) Administration icml-03: instructional Conference on Machine Learning http://www.cs.rutgers.edu/~mlittman/courses/ml03/icml03/

### Exploiting A Constellation of Narrowband RF Sensors to Detect and Track Moving Targets

Exploiting A Constellation of Narrowband RF Sensors to Detect and Track Moving Targets Chris Kreucher a, J. Webster Stayman b, Ben Shapo a, and Mark Stuff c a Integrity Applications Incorporated 900 Victors

### Introduction to Algorithmic Trading Strategies Lecture 2

Introduction to Algorithmic Trading Strategies Lecture 2 Hidden Markov Trading Model Haksun Li haksun.li@numericalmethod.com www.numericalmethod.com Outline Carry trade Momentum Valuation CAPM Markov chain

### LINEAR ALGEBRA: NUMERICAL METHODS. Version: August 12,

LINEAR ALGEBRA: NUMERICAL METHODS. Version: August 12, 2000 45 4 Iterative methods 4.1 What a two year old child can do Suppose we want to find a number x such that cos x = x (in radians). This is a nonlinear