Particle Filtering

Emin Orhan (eorhan@bcs.rochester.edu)

August 11, 2012


Introduction: Particle filtering is a general Monte Carlo (sampling) method for performing inference in state-space models, where the state of a system evolves in time and information about the state is obtained via noisy measurements made at each time step. In a general discrete-time state-space model, the state of the system evolves according to:

$$x_k = f_k(x_{k-1}, v_{k-1}) \qquad (1)$$

where $x_k$ is a vector representing the state of the system at time $k$, $v_{k-1}$ is the state noise vector, and $f_k$ is a possibly non-linear and time-dependent function describing the evolution of the state vector. The state vector $x_k$ is assumed to be latent or unobservable. Information about $x_k$ is obtained only through noisy measurements of it, $z_k$, which are governed by the equation:

$$z_k = h_k(x_k, n_k) \qquad (2)$$

where $h_k$ is a possibly non-linear and time-dependent function describing the measurement process and $n_k$ is the measurement noise vector.

Optimal filtering: The filtering problem involves the estimation of the state vector at time $k$, given all the measurements up to and including time $k$, which we denote by $z_{1:k}$. In a Bayesian setting, this problem can be formalized as the computation of the distribution $p(x_k \mid z_{1:k})$, which can be done recursively in two steps. In the prediction step, $p(x_k \mid z_{1:k-1})$ is computed from the filtering distribution $p(x_{k-1} \mid z_{1:k-1})$ at time $k-1$:

$$p(x_k \mid z_{1:k-1}) = \int p(x_k \mid x_{k-1}) \, p(x_{k-1} \mid z_{1:k-1}) \, dx_{k-1} \qquad (3)$$

where $p(x_{k-1} \mid z_{1:k-1})$ is assumed to be known due to recursion and $p(x_k \mid x_{k-1})$ is given by Equation 1. The distribution $p(x_k \mid z_{1:k-1})$ can be thought of as a prior over $x_k$ before receiving the most recent measurement $z_k$. In the update step, this prior is updated with the new measurement $z_k$ using Bayes' rule to obtain the posterior over $x_k$:

$$p(x_k \mid z_{1:k}) \propto p(z_k \mid x_k) \, p(x_k \mid z_{1:k-1}) \qquad (4)$$

In general, the computations in the prediction and update steps (Equations 3-4) cannot be carried out analytically, hence the need for approximate methods such as Monte Carlo sampling. In some restricted cases, however, the computations in Equations 3-4 can be carried out analytically, as will be shown next.

The Kalman filter: If the filtering distribution at time $k-1$, $p(x_{k-1} \mid z_{1:k-1})$, is Gaussian, the filtering distribution at the next time step, $p(x_k \mid z_{1:k})$, can be shown to be Gaussian as well if the following two conditions are satisfied: (1) the state evolution and measurement functions, $f_k$ and $h_k$, are linear; (2) the state noise and the measurement noise, $v_k$ and $n_k$, are Gaussian. In this case, the state and measurement equations reduce to the following forms:

$$x_k = F_k x_{k-1} + v_{k-1} \qquad (5)$$
$$z_k = H_k x_k + n_k \qquad (6)$$

where $v_{k-1}$ and $n_k$ are Gaussian random variables, and $F_k$ and $H_k$ are the state evolution and measurement matrices, which are assumed to be known. The analytically tractable computation of the prediction and update equations in this linear Gaussian case leads to the well-known Kalman filter algorithm. It is a very useful and intellectually satisfying exercise to derive the prediction and update equations for the linear Gaussian system given in Equations 5-6.
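For reference, that derivation yields the standard Kalman recursions. The following is a minimal sketch of one predict/update cycle for the system in Equations 5-6, assuming time-invariant matrices $F$ and $H$ and known noise covariances $Q$ and $R$ (the variable names here are illustrative, not from the text):

```python
import numpy as np

def kalman_step(m, P, z, F, H, Q, R):
    """One predict/update cycle for the linear Gaussian model (Eqs. 5-6).

    m, P : mean and covariance of p(x_{k-1} | z_{1:k-1})
    z    : new measurement z_k
    Returns the mean and covariance of p(x_k | z_{1:k}).
    """
    # Prediction step (Eq. 3): p(x_k | z_{1:k-1}) is Gaussian with:
    m_pred = F @ m
    P_pred = F @ P @ F.T + Q
    # Update step (Eq. 4): condition on the new measurement z_k.
    S = H @ P_pred @ H.T + R              # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
    m_new = m_pred + K @ (z - H @ m_pred)
    P_new = P_pred - K @ H @ P_pred
    return m_new, P_new
```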

Sequential importance sampling (SIS): When the prediction and update steps of optimal filtering are not analytically tractable, one has to resort to approximate methods, such as Monte Carlo sampling. Sequential importance sampling (SIS) is the most basic Monte Carlo method used for this purpose. In deriving the SIS algorithm, it is useful to consider the full posterior distribution at time $k-1$, $p(x_{0:k-1} \mid z_{1:k-1})$, rather than the filtering distribution, $p(x_{k-1} \mid z_{1:k-1})$, which is just the marginal of the full posterior distribution with respect to $x_{k-1}$. The idea in SIS is to approximate the posterior distribution at time $k-1$, $p(x_{0:k-1} \mid z_{1:k-1})$, with a weighted set of samples $\{x_{0:k-1}^i, w_{k-1}^i\}_{i=1}^N$, also called particles, and recursively update these particles to obtain an approximation to the posterior distribution at the next time step, $p(x_{0:k} \mid z_{1:k})$.

SIS is based on importance sampling. In importance sampling, one approximates a target distribution $p(x)$ using samples drawn from a proposal distribution $q(x)$. Importance sampling is generally used when it is difficult to sample directly from the target distribution itself, but much easier to sample from the proposal distribution. To compensate for the discrepancy between the target and proposal distributions, one has to weight each sample $x^i$ by $w^i \propto \pi(x^i)/q(x^i)$, where $\pi(x)$ is a function that is proportional to $p(x)$ (i.e., $p(x) \propto \pi(x)$) and that we know how to evaluate. Applied to our posterior distribution at time $k-1$, importance sampling yields:

$$p(x_{0:k-1} \mid z_{1:k-1}) \approx \sum_{i=1}^{N} w_{k-1}^i \, \delta_{x_{0:k-1}^i}(x_{0:k-1}) \qquad (7)$$

where $\delta_{x_{0:k-1}^i}$ is a delta function centered at $x_{0:k-1}^i$. The crucial idea in SIS is to update the particles $x_{0:k-1}^i$ and their weights $w_{k-1}^i$ such that they approximate the posterior distribution at the next time step, $p(x_{0:k} \mid z_{1:k})$. To do this, we first assume that the proposal distribution at time $k$ can be factorized as follows:

$$q(x_{0:k} \mid z_{1:k}) = q(x_k \mid x_{0:k-1}, z_{1:k}) \, q(x_{0:k-1} \mid z_{1:k-1}) \qquad (8)$$

so that we can simply augment each particle $x_{0:k-1}^i$ at time $k-1$ with a new state $x_k^i$ at time $k$ sampled from $q(x_k \mid x_{0:k-1}^i, z_{1:k})$. To update the weights $w_{k-1}^i$, we note that according to importance sampling, the weights of the particles at time $k$ should be as follows:

$$w_k^i \propto \frac{p(x_{0:k}^i \mid z_{1:k})}{q(x_{0:k}^i \mid z_{1:k})} \qquad (9)$$

We will not give the full derivation here, but it turns out that it is possible to express Equation 9 recursively in terms of $w_{k-1}^i$. As usual, it is a very instructive exercise to derive this weight update equation yourself. The final result turns out to be:

$$w_k^i \propto w_{k-1}^i \, \frac{p(z_k \mid x_k^i) \, p(x_k^i \mid x_{k-1}^i)}{q(x_k^i \mid x_{0:k-1}^i, z_{1:k})} \qquad (10)$$

The weights $\{w_k^i\}_{i=1}^N$ are then normalized to sum to 1.
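Before continuing, it may help to see plain (non-sequential) importance sampling in code. The following sketch estimates $E[x]$ under an unnormalized bimodal target $\pi$ using a Gaussian proposal; the target, proposal, and sample size are invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Unnormalized target pi(x) \propto p(x): a bimodal density we can evaluate
# pointwise but would rather not sample from directly.
def log_pi(x):
    return np.logaddexp(-0.5 * (x - 2.0) ** 2, -0.5 * (x + 2.0) ** 2)

# Proposal q(x): a broad Gaussian covering the target's support.
xs = rng.normal(loc=0.0, scale=3.0, size=10_000)
log_q = -0.5 * (xs / 3.0) ** 2 - np.log(3.0 * np.sqrt(2 * np.pi))

# w^i \propto pi(x^i) / q(x^i), normalized to sum to one (in log space
# for numerical stability).
log_w = log_pi(xs) - log_q
w = np.exp(log_w - log_w.max())
w /= w.sum()

# Weighted samples approximate expectations under p; here E[x] is ~0
# by the symmetry of the target.
print("E[x] estimate:", np.sum(w * xs))
```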

Figure 1: Demonstration of the SIS filtering algorithm on a linear Gaussian model. The red line shows the state, $x_k$, of the system at each time step $k$; the blue circles show the measurements, $z_k$, at the corresponding time steps. The gray dots show the particles generated by the SIS algorithm at each time step, and the weights of the particles are represented by the gray levels of the dots, with darker colors corresponding to larger weights. In this example, we used $N = 50$ particles.

If we also assume that $q(x_k \mid x_{0:k-1}, z_{1:k}) = q(x_k \mid x_{k-1}, z_k)$, so that the proposal at the next time step depends only on the most recent state and the most recent measurement, then we only need to store $x_{k-1}^i$ and generate the particle at the next time step from the distribution $q(x_k \mid x_{k-1}^i, z_k)$. Thus, in this case, the update equations simplify to:

$$x_k^i \sim q(x_k \mid x_{k-1}^i, z_k) \qquad (11)$$
$$w_k^i \propto w_{k-1}^i \, \frac{p(z_k \mid x_k^i) \, p(x_k^i \mid x_{k-1}^i)}{q(x_k^i \mid x_{k-1}^i, z_k)} \qquad (12)$$

Figure 1 shows the application of the SIS algorithm to a toy linear Gaussian model. Note that the weights of the particles, represented by the gray levels of the dots in this figure, are initially uniform, but as $k$ increases, they gradually come to be dominated by a smaller and smaller number of particles. This illustrates the degeneracy problem, which we take up next.

Degeneracy problem: In practice, iteration of the update equations in 11-12 leads to a degeneracy problem, where only a few of the particles have significant weight and all the other particles have very small weights. Degeneracy is typically measured by an estimate of the effective sample size:

$$N_{\mathrm{eff}} = \frac{1}{\sum_{i=1}^{N} (w_k^i)^2} \qquad (13)$$

where a smaller $N_{\mathrm{eff}}$ means a larger variance for the weights, hence more degeneracy.
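The following is a minimal sketch of one SIS step for a one-dimensional Gaussian random-walk model, with the transition prior used as the proposal so that the transition and proposal terms in Equation 12 cancel; the model and its parameters are assumptions for illustration, not taken from the figures:

```python
import numpy as np

rng = np.random.default_rng(1)

def gauss_logpdf(x, mean, var):
    return -0.5 * ((x - mean) ** 2 / var + np.log(2 * np.pi * var))

def sis_step(particles, log_w, z, q_var=1.0, obs_var=1.0):
    """One SIS update (Eqs. 11-12) with q(x_k | x_{k-1}, z_k) = p(x_k | x_{k-1}).

    With this proposal choice the weight update reduces to multiplying
    each weight by the likelihood p(z_k | x_k^i).
    """
    # Eq. 11: propagate each particle through the proposal.
    particles = particles + rng.normal(0.0, np.sqrt(q_var), size=particles.shape)
    # Eq. 12 (in log space), followed by normalization.
    log_w = log_w + gauss_logpdf(z, particles, obs_var)
    log_w -= np.logaddexp.reduce(log_w)
    return particles, log_w

def n_eff(log_w):
    """Effective sample size (Eq. 13) from normalized log-weights."""
    w = np.exp(log_w)
    return 1.0 / np.sum(w ** 2)
```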

Figure 2: (A) The weights of all 50 particles (x-axis) at each time step (y-axis). Hotter colors represent larger weights. (B) The effective sample size $N_{\mathrm{eff}}$ as a function of time step.

Figure 2 illustrates the degeneracy problem in a toy example. In this example, an SIS filter with $N = 50$ particles was applied to a linear Gaussian state-space model for 50 time steps. Figure 2A shows the weights of all particles (x-axis) at each time step (y-axis); Figure 2B plots the effective sample size $N_{\mathrm{eff}}$ as a function of time step. As $k$ increases, $N_{\mathrm{eff}}$ drops very quickly from an initial value of 50 to values smaller than 5, indicating that for large $k$ only a handful of particles have significant weights.

Resampling: One common way to deal with degeneracy is resampling. In resampling, one draws (with replacement) a new set of $N$ particles from the discrete approximation to the filtering distribution $p(x_k \mid z_{1:k})$ (or to the full posterior distribution, if one is interested in computing the full posterior) provided by the weighted particles:

$$p(x_k \mid z_{1:k}) \approx \sum_{i=1}^{N} w_k^i \, \delta_{x_k^i}(x_k) \qquad (14)$$

Resampling is performed whenever the effective sample size $N_{\mathrm{eff}}$ drops below a certain threshold. Note that since resampling is done with replacement, a particle with a large weight is likely to be drawn multiple times, and conversely, particles with very small weights are not likely to be drawn at all. Also note that the weights of the new particles will all be equal to $1/N$. Thus, resampling effectively deals with the degeneracy problem by getting rid of the particles with very small weights.

This, however, introduces a new problem, called the sample impoverishment problem. Because particles with large weights are likely to be drawn multiple times during resampling, whereas particles with small weights are not likely to be drawn at all, the diversity of the particles tends to decrease after a resampling step. In the extreme case, all particles might collapse into a single particle. This negatively impacts the quality of the approximation in Equation 14, as one is then trying to represent a whole distribution with a smaller number of distinct samples (in the limit of a single particle, one would be representing the full distribution with a single point estimate). We will mention below one possible way to deal with this sample impoverishment problem, but we first briefly review a quite popular particle filtering algorithm that uses resampling.
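A sketch of the resampling step just described, shown here as systematic resampling: one stratified uniform draw per slot, which has lower variance than independent (multinomial) draws but implements the same draw-with-replacement idea:

```python
import numpy as np

def systematic_resample(particles, w, rng):
    """Draw N particles with replacement in proportion to the weights w
    (the discrete approximation in Eq. 14) and reset all weights to 1/N.
    Heavy particles get duplicated, light ones tend to vanish -- which is
    exactly the source of the sample impoverishment problem."""
    N = len(w)
    positions = (rng.random() + np.arange(N)) / N
    cum = np.cumsum(w)
    cum[-1] = 1.0  # guard against floating-point round-off
    indices = np.searchsorted(cum, positions)
    return particles[indices], np.full(N, 1.0 / N)
```

Multinomial resampling via `rng.choice(N, size=N, p=w)` would be an equally valid, if higher-variance, alternative.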

Sampling importance resampling (SIR) algorithm: Most particle filtering algorithms are variants of the basic SIS algorithm considered above. The SIR algorithm can be seen as a variant of SIS where the proposal distribution $q(x_k \mid x_{k-1}^i, z_k)$ is taken to be the state transition distribution $p(x_k \mid x_{k-1}^i)$ and resampling is applied at every iteration. Thus, in the SIR algorithm, the update equations for the particles (Equations 11-12 in the SIS algorithm) reduce to:

$$x_k^i \sim p(x_k \mid x_{k-1}^i) \qquad (15)$$
$$w_k^i \propto p(z_k \mid x_k^i) \qquad (16)$$

These updates are then followed by a resampling step at each iteration. Note that the $w_{k-1}^i$ term disappears from the weight update equation in 16, because after resampling at time $k-1$ all weights $w_{k-1}^i$ become equal to $1/N$. The main advantage of the SIR algorithm is that it is extremely simple to implement, since it only requires sampling from the distribution $p(x_k \mid x_{k-1}^i)$ and evaluating $p(z_k \mid x_k^i)$ (plus resampling at every iteration). Its disadvantages are (i) the independence of the proposal distribution $p(x_k \mid x_{k-1}^i)$ from the observations $z_k$, which means that the states are updated without directly taking into account the information provided by the observations; and (ii) the sample impoverishment problem, which can be severe due to resampling at every iteration.

Figure 3A shows the application of an SIR filter to a simple linear Gaussian system. To illustrate the effects of resampling, Figure 3B shows the results for an SIS filter applied to the same system. The SIS filter used the same proposal distribution as the SIR filter, but did not involve resampling. Note that in Figure 3B the weight distribution is dominated by a single particle at later time steps, illustrating the degeneracy problem, whereas the particles in Figure 3A all have the same weight, $1/N$, due to resampling at each time step in the SIR algorithm. Although resampling deals with the degeneracy problem, it leads to the sample impoverishment problem, which can be observed in Figure 3 by noting that particle diversity is reduced in Figure 3A compared with Figure 3B. We will now present another variant of the SIS algorithm, called the regularized particle filter, that also uses resampling but, unlike the SIR algorithm, effectively deals with the sample impoverishment problem introduced by resampling.

Regularized particle filter: The sample impoverishment problem during resampling arises because we try to approximate a continuous distribution with a discrete one (Equation 14): with a discrete distribution, there is a strictly positive probability that two samples drawn from the distribution will be identical, whereas with a continuous distribution no two samples drawn from the distribution will be identical. In the regularized particle filter, one approximates the filtering distribution in Equation 14 (or the full posterior distribution) with a kernel density estimate based on the particles, rather than directly with the particles themselves. Thus, in resampling, Equation 14 is replaced by:

$$p(x_k \mid z_{1:k}) \approx \sum_{i=1}^{N} w_k^i \, K(x_k - x_k^i) \qquad (17)$$

where $K(x_k - x_k^i)$ is a kernel function centered at $x_k^i$. For the case of equal weights, it turns out to be possible to analytically determine the shape and the bandwidth of the optimal kernel function, in the sense of minimizing a certain distance measure between the true filtering distribution and the kernel density estimate obtained via the approximation in Equation 17. The optimal kernel in this case is known as the Epanechnikov kernel (see Arulampalam, Maskell, Gordon & Clapp (2002) for details). The complete regularized particle filter algorithm then consists of the state and weight update equations of the SIS filter (Equations 11-12), combined with resampling of $N$ new particles from Equation 17 whenever the effective sample size $N_{\mathrm{eff}}$ falls below a certain threshold $N_T$.
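A sketch of the regularized resampling step, assuming (for simplicity) a Gaussian kernel with a rule-of-thumb bandwidth in place of the optimal Epanechnikov kernel discussed above; the kernel jitter restores particle diversity after resampling:

```python
import numpy as np

def regularized_resample(particles, w, rng):
    """Resample from the kernel density estimate in Eq. 17.

    Equivalent to ordinary resampling followed by adding kernel noise,
    so no two resampled particles are identical. A 1-D Gaussian kernel
    with Silverman's rule-of-thumb bandwidth stands in for the
    Epanechnikov kernel here, purely for simplicity.
    """
    N = len(w)
    idx = rng.choice(N, size=N, p=w)      # discrete resampling (Eq. 14)
    h = 1.06 * particles.std() * N ** (-1 / 5)  # rule-of-thumb bandwidth
    return particles[idx] + h * rng.normal(size=N), np.full(N, 1.0 / N)
```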

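Finally, putting the pieces together, here is a compact sketch of the full SIR (bootstrap) filter of Equations 15-16 on a toy one-dimensional random-walk model like the one shown in the figures; the model and its parameters are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 50, 50                    # particles and time steps, as in the figures
q_var, r_var = 1.0, 1.0          # state and measurement noise variances

# Simulate a 1-D random-walk state and noisy measurements (Eqs. 5-6 with F=H=1).
x = np.cumsum(rng.normal(0, np.sqrt(q_var), T))
z = x + rng.normal(0, np.sqrt(r_var), T)

particles = rng.normal(0, 1, N)  # initial particle cloud
estimates = []
for k in range(T):
    # Eq. 15: propagate through the state transition distribution.
    particles = particles + rng.normal(0, np.sqrt(q_var), N)
    # Eq. 16: weight by the likelihood p(z_k | x_k^i), then normalize.
    w = np.exp(-0.5 * (z[k] - particles) ** 2 / r_var)
    w /= w.sum()
    estimates.append(np.sum(w * particles))
    # Resample at every iteration (multinomial, for brevity); all weights
    # implicitly reset to 1/N afterwards.
    particles = particles[rng.choice(N, size=N, p=w)]

print("RMSE:", np.sqrt(np.mean((np.array(estimates) - x) ** 2)))
```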
Figure 3: Comparison of SIR with SIS. (A) An SIR filter applied to a linear Gaussian system. The red line shows the state, $x_k$, of the system at each time step $k$; the blue circles show the measurements, $z_k$, at each time step. The gray dots show the particles generated by the SIR algorithm at each time step, and the weights of the particles are represented by the gray levels of the dots, with darker colors corresponding to larger weights. (B) An SIS filter applied to the same system. The SIS filter used the same proposal distribution as the SIR filter, but did not involve resampling. In this example, we used $N = 50$ particles in both cases.

References

[1] Arulampalam, M.S., Maskell, S., Gordon, N. & Clapp, T. (2002). A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing, 50(2):174-188.