Lecture 3: Bayesian Optimal Filtering Equations and Kalman Filter


Department of Biomedical Engineering and Computational Science, Aalto University. February 10, 2011.

Contents
1 Probabilistic State Space Models
2 Bayesian Optimal Filter
3 Kalman Filter
4 Examples
5 Summary and Demonstration

Probabilistic State Space Models: General Model

General probabilistic state space model:

  measurement model: y_k ~ p(y_k | x_k)
  dynamic model:     x_k ~ p(x_k | x_{k-1})

x_k = (x_{k1}, ..., x_{kn}) is the state and y_k = (y_{k1}, ..., y_{km}) is the measurement. The model has the form of a hidden Markov model (HMM):

  observed: y_1   y_2   y_3   y_4
  hidden:   x_1 → x_2 → x_3 → x_4 → ...

Probabilistic State Space Models: Example

Example (Gaussian random walk). The Gaussian random walk model can be written as

  x_k = x_{k-1} + w_{k-1},   w_{k-1} ~ N(0, q)
  y_k = x_k + e_k,           e_k ~ N(0, r),

where x_k is the hidden state and y_k is the measurement. In terms of probability densities the model can be written as

  p(x_k | x_{k-1}) = (1 / sqrt(2πq)) exp( -(x_k - x_{k-1})^2 / (2q) )
  p(y_k | x_k)     = (1 / sqrt(2πr)) exp( -(y_k - x_k)^2 / (2r) ),

which is a discrete-time state space model.

Probabilistic State Space Models: Example (cont.)

Example (Gaussian random walk, cont.): [Figure: simulated realization of the Gaussian random walk, showing the measurements and the hidden signal over 100 time steps.]
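A minimal simulation sketch of this random walk model (in Python rather than Matlab; the noise variances, horizon, and initial state x_0 = 0 are illustrative assumptions, not values from the lecture):

    import numpy as np

    # Gaussian random walk:
    #   x_k = x_{k-1} + w_{k-1},  w_{k-1} ~ N(0, q)
    #   y_k = x_k + e_k,          e_k ~ N(0, r)
    # q, r, T and x_0 = 0 are illustrative choices.
    rng = np.random.default_rng(0)
    q, r, T = 0.1, 1.0, 100

    x = np.zeros(T)   # hidden signal x_1, ..., x_T
    y = np.zeros(T)   # measurements  y_1, ..., y_T
    x_prev = 0.0      # assumed initial state x_0
    for k in range(T):
        x[k] = x_prev + rng.normal(0.0, np.sqrt(q))   # dynamic model
        y[k] = x[k] + rng.normal(0.0, np.sqrt(r))     # measurement model
        x_prev = x[k]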

Probabilistic State Space Models: Further Examples

Linear Gauss-Markov model:
  x_k = A_{k-1} x_{k-1} + q_k
  y_k = H_k x_k + r_k.

Gaussian driven non-linear model:
  x_k = f(x_{k-1}, q_k)
  y_k = h(x_k, r_k).

Hierarchical and/or non-Gaussian models:
  q_k ~ Dirichlet(q_k | α)
  x_k = f(x_{k-1}, q_k)
  σ_k^2 ~ InvGamma(σ_k^2 | σ_{k-1}^2, γ)
  r_k ~ N(0, σ_k^2 I)
  y_k = h(x_k, r_k).

Probabilistic State Space Models: Markov and Independence Assumptions

The dynamic model p(x_k | x_{k-1}) is Markovian:
1 The future x_k is independent of the past given the present (here the present is x_{k-1}):
    p(x_k | x_{1:k-1}, y_{1:k-1}) = p(x_k | x_{k-1}).
2 The past x_{k-1} is independent of the future given the present (here the present is x_k):
    p(x_{k-1} | x_{k:T}, y_{k:T}) = p(x_{k-1} | x_k).

The measurements y_k are conditionally independent given x_k:
  p(y_k | x_{1:k}, y_{1:k-1}) = p(y_k | x_k).

Bayesian Optimal Filter: Principle

The Bayesian optimal filter computes the distribution

  p(x_k | y_{1:k})

given the following:
1 Prior distribution p(x_0).
2 State space model:
    x_k ~ p(x_k | x_{k-1})
    y_k ~ p(y_k | x_k).
3 Measurement sequence y_{1:k} = y_1, ..., y_k.

Computation is based on a recursion rule that incorporates the new measurement y_k into the posterior:

  p(x_{k-1} | y_{1:k-1}) → p(x_k | y_{1:k}).

Bayesian Optimal Filter: Derivation of Prediction Step

Assume that we know the posterior distribution of the previous time step: p(x_{k-1} | y_{1:k-1}).

The joint distribution of x_k, x_{k-1} given y_{1:k-1} can be computed as (recall the Markov property):

  p(x_k, x_{k-1} | y_{1:k-1}) = p(x_k | x_{k-1}, y_{1:k-1}) p(x_{k-1} | y_{1:k-1})
                              = p(x_k | x_{k-1}) p(x_{k-1} | y_{1:k-1}).

Integrating over x_{k-1} gives the Chapman-Kolmogorov equation

  p(x_k | y_{1:k-1}) = ∫ p(x_k | x_{k-1}) p(x_{k-1} | y_{1:k-1}) dx_{k-1}.

This is the prediction step of the optimal filter.

Bayesian Optimal Filter: Derivation of Update Step

Now we have:
1 Prior distribution from the Chapman-Kolmogorov equation: p(x_k | y_{1:k-1}).
2 Measurement likelihood from the state space model: p(y_k | x_k).

The posterior distribution can be computed by the Bayes rule (recall the conditional independence of measurements):

  p(x_k | y_{1:k}) = (1/Z_k) p(y_k | x_k, y_{1:k-1}) p(x_k | y_{1:k-1})
                   = (1/Z_k) p(y_k | x_k) p(x_k | y_{1:k-1}).

This is the update step of the optimal filter.

Bayesian Optimal Filter: Formal Equations

Optimal filter

Initialization: The recursion starts from the prior distribution p(x_0).

Prediction: by the Chapman-Kolmogorov equation,

  p(x_k | y_{1:k-1}) = ∫ p(x_k | x_{k-1}) p(x_{k-1} | y_{1:k-1}) dx_{k-1}.

Update: by the Bayes rule,

  p(x_k | y_{1:k}) = (1/Z_k) p(y_k | x_k) p(x_k | y_{1:k-1}).

The normalization constant Z_k = p(y_k | y_{1:k-1}) is given as

  Z_k = ∫ p(y_k | x_k) p(x_k | y_{1:k-1}) dx_k.
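For a scalar state these two integrals can be evaluated numerically on a grid; the sketch below is a minimal illustration for the Gaussian random walk model (the grid limits, spacing, prior N(0, 1), and the helper gauss_pdf are my own choices, not part of the lecture):

    import numpy as np

    def gauss_pdf(x, mean, var):
        # Univariate Gaussian density N(x | mean, var)
        return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

    # Discretized state grid; limits and spacing are arbitrary choices.
    grid = np.linspace(-10.0, 10.0, 401)
    dx = grid[1] - grid[0]
    q, r = 0.1, 1.0                         # random walk noise variances (assumed)
    posterior = gauss_pdf(grid, 0.0, 1.0)   # assumed prior p(x_0) = N(0, 1)

    def bayes_filter_step(posterior, y_k):
        # Prediction: Chapman-Kolmogorov integral, approximated by a Riemann sum
        transition = gauss_pdf(grid[:, None], grid[None, :], q)  # p(x_k | x_{k-1})
        predicted = transition @ posterior * dx
        # Update: Bayes' rule with the measurement likelihood p(y_k | x_k)
        unnormalized = gauss_pdf(y_k, grid, r) * predicted
        Z_k = np.sum(unnormalized) * dx     # Z_k = p(y_k | y_{1:k-1})
        return unnormalized / Z_k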

Bayesian Optimal Filter: Graphical Explanation

[Figure: three panels illustrating the filtering recursion, showing the previous and current densities, the dynamics, the prior, the likelihood, and the posterior.]

On the prediction step the distribution of the previous step is propagated through the dynamics. The prediction gives the prior distribution, alongside the likelihood of the measurement. The posterior distribution is obtained by combining the prior and the likelihood via the Bayes rule.

Kalman Filter: Model

Gaussian driven linear model, i.e., Gauss-Markov model:

  x_k = A_{k-1} x_{k-1} + q_k
  y_k = H_k x_k + r_k,

where
  q_k ~ N(0, Q_k) is white process noise,
  r_k ~ N(0, R_k) is white measurement noise,
  A_{k-1} is the transition matrix of the dynamic model,
  H_k is the measurement model matrix.

In probabilistic terms the model is

  p(x_k | x_{k-1}) = N(x_k | A_{k-1} x_{k-1}, Q_k)
  p(y_k | x_k) = N(y_k | H_k x_k, R_k).

Kalman Filter: Derivation Preliminaries

Gaussian probability density:

  N(x | m, P) = 1 / ((2π)^{n/2} |P|^{1/2}) exp( -(1/2) (x - m)^T P^{-1} (x - m) ).

Let x and y have the Gaussian densities

  p(x) = N(x | m, P),
  p(y | x) = N(y | H x, R).

Then the joint and marginal distributions are

  (x, y) ~ N( [m; H m], [P, P H^T; H P, H P H^T + R] )
  y ~ N(H m, H P H^T + R).

Kalman Filter: Derivation Preliminaries (cont.)

If the random variables x and y have the joint Gaussian probability density

  (x, y) ~ N( [a; b], [A, C; C^T, B] ),

then the marginal and conditional densities of x and y are given as follows:

  x ~ N(a, A)
  y ~ N(b, B)
  x | y ~ N(a + C B^{-1} (y - b), A - C B^{-1} C^T)
  y | x ~ N(b + C^T A^{-1} (x - a), B - C^T A^{-1} C).
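These conditioning rules map directly to a few lines of linear algebra; a small sketch (the function name and the use of a linear solve instead of an explicit inverse are my own choices):

    import numpy as np

    def gaussian_condition(a, b, A, B, C, y):
        # Conditional distribution of x given y for the joint Gaussian
        #   (x, y) ~ N([a; b], [A, C; C^T, B]).
        # Returns the mean a + C B^{-1} (y - b) and covariance A - C B^{-1} C^T.
        B_inv_Ct = np.linalg.solve(B, C.T)   # B^{-1} C^T (B symmetric)
        mean = a + B_inv_Ct.T @ (y - b)
        cov = A - C @ B_inv_Ct
        return mean, cov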

Kalman Filter: Derivation of Prediction Step

Assume that the posterior distribution of the previous step is Gaussian,

  p(x_{k-1} | y_{1:k-1}) = N(x_{k-1} | m_{k-1}, P_{k-1}).

The Chapman-Kolmogorov equation now gives

  p(x_k | y_{1:k-1}) = ∫ p(x_k | x_{k-1}) p(x_{k-1} | y_{1:k-1}) dx_{k-1}
                     = ∫ N(x_k | A_{k-1} x_{k-1}, Q_k) N(x_{k-1} | m_{k-1}, P_{k-1}) dx_{k-1}.

Using the Gaussian distribution computation rules from the previous slides, we get the prediction step

  p(x_k | y_{1:k-1}) = N(x_k | A_{k-1} m_{k-1}, A_{k-1} P_{k-1} A_{k-1}^T + Q_k).

Kalman Filter: Derivation of Update Step

The joint distribution of y_k and x_k is

  p(x_k, y_k | y_{1:k-1}) = p(y_k | x_k) p(x_k | y_{1:k-1})
                          = N( [x_k; y_k] | m', P' ),

where

  m' = [ m_k^-; H_k m_k^- ]
  P' = [ P_k^-,      P_k^- H_k^T;
         H_k P_k^-,  H_k P_k^- H_k^T + R_k ].

Kalman Filter: Derivation of Update Step (cont.)

The conditional distribution of x_k given y_k is then given as

  p(x_k | y_k, y_{1:k-1}) = p(x_k | y_{1:k}) = N(x_k | m_k, P_k),

where

  S_k = H_k P_k^- H_k^T + R_k
  K_k = P_k^- H_k^T S_k^{-1}
  m_k = m_k^- + K_k [y_k - H_k m_k^-]
  P_k = P_k^- - K_k S_k K_k^T.

Kalman Filter: Equations

Kalman Filter

Initialization: x_0 ~ N(m_0, P_0).

Prediction step:

  m_k^- = A_{k-1} m_{k-1}
  P_k^- = A_{k-1} P_{k-1} A_{k-1}^T + Q_{k-1}.

Update step:

  v_k = y_k - H_k m_k^-
  S_k = H_k P_k^- H_k^T + R_k
  K_k = P_k^- H_k^T S_k^{-1}
  m_k = m_k^- + K_k v_k
  P_k = P_k^- - K_k S_k K_k^T.
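As a reference, a minimal Python sketch of these equations (NumPy-based; the function names are mine, and np.linalg.inv is used for brevity although a linear solve would be numerically preferable):

    import numpy as np

    def kf_predict(m, P, A, Q):
        # Prediction step: returns m_k^- and P_k^-.
        m_pred = A @ m
        P_pred = A @ P @ A.T + Q
        return m_pred, P_pred

    def kf_update(m_pred, P_pred, y, H, R):
        # Update step: returns m_k and P_k.
        v = y - H @ m_pred                    # innovation v_k
        S = H @ P_pred @ H.T + R              # innovation covariance S_k
        K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain K_k
        m = m_pred + K @ v
        P = P_pred - K @ S @ K.T
        return m, P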

Kalman Filter: Properties

The Kalman filter can be applied only to linear Gaussian models; for non-linearities we need e.g. the EKF or the UKF.

If several conditionally independent measurements are obtained at a single time step, the update step is simply performed for each of them separately.

If the measurement noise covariance is diagonal (as it usually is), the measurements can be processed one at a time and no matrix inversion is needed at all.

The covariance equation is independent of the measurements, so the gain sequence could be computed and stored offline.

If the model is time-invariant, the gain converges to a constant, K_k → K, and the filter becomes stationary:

  m_k = (A - K H A) m_{k-1} + K y_k.
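Because the covariance recursion does not depend on the data, the stationary gain of a time-invariant model can be found offline by iterating the prediction and update covariance equations until convergence; a hedged sketch (the tolerance, iteration cap, and function name are my own choices):

    import numpy as np

    def stationary_gain(A, H, Q, R, P0, tol=1e-9, max_iter=10_000):
        # Iterate the Kalman covariance recursion for a time-invariant model
        # until the gain K_k stops changing; returns the stationary gain K.
        P = P0
        K_prev = None
        for _ in range(max_iter):
            P_pred = A @ P @ A.T + Q
            S = H @ P_pred @ H.T + R
            K = P_pred @ H.T @ np.linalg.inv(S)
            P = P_pred - K @ S @ K.T
            if K_prev is not None and np.max(np.abs(K - K_prev)) < tol:
                break
            K_prev = K
        return K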

Kalman Filter: Random Walk Example

Example (Kalman filter for Gaussian random walk). The filtering density is Gaussian,

  p(x_{k-1} | y_{1:k-1}) = N(x_{k-1} | m_{k-1}, P_{k-1}).

The Kalman filter prediction and update equations are

  m_k^- = m_{k-1}
  P_k^- = P_{k-1} + q
  m_k = m_k^- + (P_k^- / (P_k^- + r)) (y_k - m_k^-)
  P_k = P_k^- - (P_k^-)^2 / (P_k^- + r).

Kalman Filter: Random Walk Example (cont.)

Example (Kalman filter for Gaussian random walk, cont.): [Figure: measurements, hidden signal, filter estimate, and 95% quantiles over 100 time steps.]
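For the random walk the filter reduces to a few scalar operations; a minimal sketch that can be run on the measurements simulated earlier (the initial mean m_0 = 0 and variance P_0 = 1 are assumed, not taken from the lecture):

    import numpy as np

    def random_walk_kf(y, q, r, m0=0.0, P0=1.0):
        # Scalar Kalman filter for x_k = x_{k-1} + w_{k-1}, y_k = x_k + e_k.
        m, P = m0, P0
        means, variances = [], []
        for yk in y:
            # Prediction step: m_k^- = m_{k-1}, P_k^- = P_{k-1} + q
            m_pred, P_pred = m, P + q
            # Update step
            S = P_pred + r
            K = P_pred / S
            m = m_pred + K * (yk - m_pred)
            P = P_pred - K * S * K      # equals P_pred - P_pred**2 / S
            means.append(m)
            variances.append(P)
        return np.array(means), np.array(variances)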

Kalman Filter: Car Tracking Example [1/4]

The dynamic model of the car tracking model from the first lecture can be written in discrete form as follows:

  [ x_k; y_k; ẋ_k; ẏ_k ] = A [ x_{k-1}; y_{k-1}; ẋ_{k-1}; ẏ_{k-1} ] + q_{k-1},

with transition matrix

      [ 1  0  Δt 0  ]
  A = [ 0  1  0  Δt ]
      [ 0  0  1  0  ]
      [ 0  0  0  1  ],

where q_{k-1} is zero mean with covariance matrix

      [ q1^c Δt^3/3   0             q1^c Δt^2/2   0            ]
  Q = [ 0             q2^c Δt^3/3   0             q2^c Δt^2/2  ]
      [ q1^c Δt^2/2   0             q1^c Δt       0            ]
      [ 0             q2^c Δt^2/2   0             q2^c Δt      ].

Kalman Filter: Car Tracking Example [2/4]

The measurement model can be written in the form

  y_k = H [ x_k; y_k; ẋ_k; ẏ_k ] + e_k,

where

  H = [ 1  0  0  0 ]
      [ 0  1  0  0 ]

and e_k has the covariance

  R = [ σ^2  0   ]
      [ 0    σ^2 ].

Kalman Filter: Car Tracking Example [3/4]

The Kalman filter prediction equations, with A and Q as given on the previous slides:

  m_k^- = A m_{k-1}
  P_k^- = A P_{k-1} A^T + Q.

Kalman Filter: Car Tracking Example [4/4]

The Kalman filter update equations, with H and R as given on the previous slides:

  S_k = H P_k^- H^T + R
  K_k = P_k^- H^T S_k^{-1}
  m_k = m_k^- + K_k (y_k - H m_k^-)
  P_k = P_k^- - K_k S_k K_k^T.
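Putting the pieces together, a self-contained Python sketch of the car tracking filter (Δt, the spectral densities q1^c and q2^c, the measurement noise σ, and the initial m_0, P_0 are illustrative guesses, not the lecture's values):

    import numpy as np

    # Illustrative parameter choices (not from the lecture).
    dt, q1c, q2c, sigma = 0.1, 1.0, 1.0, 0.5

    A = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1,  0],
                  [0, 0, 0,  1]], dtype=float)

    Q = np.array([[q1c*dt**3/3, 0,           q1c*dt**2/2, 0          ],
                  [0,           q2c*dt**3/3, 0,           q2c*dt**2/2],
                  [q1c*dt**2/2, 0,           q1c*dt,      0          ],
                  [0,           q2c*dt**2/2, 0,           q2c*dt     ]])

    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)

    R = sigma**2 * np.eye(2)

    # Simulate a trajectory and noisy position measurements.
    rng = np.random.default_rng(0)
    x_true = np.zeros(4)
    measurements = []
    for _ in range(100):
        x_true = A @ x_true + rng.multivariate_normal(np.zeros(4), Q)
        measurements.append(H @ x_true + rng.multivariate_normal(np.zeros(2), R))

    # Run the Kalman filter: one prediction and one update per measurement.
    m, P = np.zeros(4), np.eye(4)      # assumed initial mean and covariance
    estimates = []
    for y in measurements:
        m, P = A @ m, A @ P @ A.T + Q              # prediction step
        S = H @ P @ H.T + R                        # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)             # Kalman gain
        m = m + K @ (y - H @ m)                    # update step
        P = P - K @ S @ K.T
        estimates.append(m.copy())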

Summary

Probabilistic state space models are generalizations of hidden Markov models. Special cases of such models are e.g. linear Gaussian models and non-linear filtering models.

The Bayesian optimal filtering equations form the formal solution to the general optimal filtering problem. The optimal filtering equations consist of prediction and update steps.

The Kalman filter is the closed-form filtering solution for linear Gaussian models.

Matlab Demo: Kalman Filter Implementation [Kalman filter for car tracking model]