Time Series Modelling and Kalman Filters

1 / 24 Time Series Modelling and Kalman Filters
Chris Williams, School of Informatics, University of Edinburgh
November 2010

2 / 24 Outline
- Stochastic processes
- AR, MA and ARMA models
- The Fourier view
- Parameter estimation for ARMA models
- Linear-Gaussian HMMs (Kalman filtering)

Reading: Handout on Time Series Modelling: AR, MA, ARMA and All That
Reading: Bishop 13.3 (but not 13.3.2, 13.3.3)

3 / 24 Example time series
- FTSE 100
- Meteorology: temperature, pressure, ...
- Seismology
- Electrocardiogram (ECG)
- ...

4 / 24 Stochastic Processes

A stochastic process is a family of random variables $X(t)$, $t \in T$, indexed by a parameter $t$ in an index set $T$. We will consider discrete-time stochastic processes where $T = \mathbb{Z}$ (the integers).

A time series is said to be strictly stationary if the joint distribution of $X(t_1), \ldots, X(t_n)$ is the same as the joint distribution of $X(t_1 + \tau), \ldots, X(t_n + \tau)$ for all $t_1, \ldots, t_n, \tau$.

A time series is said to be weakly stationary if its mean is constant and its autocovariance function depends only on the lag, i.e.
$$E[X(t)] = \mu, \qquad \mathrm{Cov}[X(t), X(t+\tau)] = \gamma(\tau)$$

A Gaussian process is a family of random variables, any finite number of which have a joint Gaussian distribution. The ARMA models we will study are stationary Gaussian processes.

5 / 24 Autoregressive (AR) Models

Example: AR(1),
$$x_t = \alpha x_{t-1} + w_t, \qquad w_t \sim N(0, \sigma^2)$$
By repeated substitution we get
$$x_t = w_t + \alpha w_{t-1} + \alpha^2 w_{t-2} + \ldots$$
Hence $E[X(t)] = 0$, and if $|\alpha| < 1$ the process is stationary with
$$\mathrm{Var}[X(t)] = (1 + \alpha^2 + \alpha^4 + \ldots)\,\sigma^2 = \frac{\sigma^2}{1 - \alpha^2}$$
Similarly
$$\mathrm{Cov}[X(t), X(t-k)] = \alpha^k \,\mathrm{Var}[X(t-k)] = \frac{\alpha^k \sigma^2}{1 - \alpha^2}$$
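These moments are easy to check by simulation. A minimal sketch (the values of $\alpha$, $\sigma$ and the sample size are illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(1) process and check the stationary variance sigma^2 / (1 - alpha^2).
alpha, sigma, n = 0.5, 1.0, 100_000
w = rng.normal(scale=sigma, size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = alpha * x[t - 1] + w[t]

print(np.var(x))                 # ~1.333 = sigma^2 / (1 - alpha^2)
print(np.mean(x[:-1] * x[1:]))   # lag-1 autocovariance, ~alpha * 1.333
```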

6 / 24 [Figure: sample paths of the AR(1) process, panel labels α = 0.5]

7 / 24 The general case is an AR(p) process,
$$x_t = \sum_{i=1}^{p} \alpha_i x_{t-i} + w_t$$
Notice how $x_t$ is obtained by a (linear) regression from $x_{t-1}, \ldots, x_{t-p}$, hence an autoregressive process.

Introduce the backward shift operator $B$, so that $Bx_t = x_{t-1}$, $B^2 x_t = x_{t-2}$, etc. Then the AR(p) model can be written as
$$\phi(B)x_t = w_t, \qquad \text{where } \phi(B) = 1 - \alpha_1 B - \ldots - \alpha_p B^p$$
The condition for stationarity is that all the roots of $\phi(B)$ lie outside the unit circle.
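The root condition is straightforward to test numerically. A minimal sketch (the helper name and example coefficients are my own):

```python
import numpy as np

def is_stationary(alphas):
    """Check AR(p) stationarity: all roots of phi(B) outside the unit circle.

    phi(B) = 1 - alpha_1 B - ... - alpha_p B^p; np.roots expects the
    highest-degree coefficient first.
    """
    coeffs = np.r_[-np.asarray(alphas, float)[::-1], 1.0]
    return bool(np.all(np.abs(np.roots(coeffs)) > 1))

print(is_stationary([0.5]))   # True:  AR(1) with |alpha| < 1
print(is_stationary([1.5]))   # False: a root lies inside the unit circle
```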

8 / 24 Yule-Walker Equations

Multiplying the AR(p) equation by $x_{t-k}$,
$$x_t = \sum_{i=1}^{p} \alpha_i x_{t-i} + w_t \;\Rightarrow\; x_t x_{t-k} = \sum_{i=1}^{p} \alpha_i x_{t-i} x_{t-k} + w_t x_{t-k}$$
Taking expectations (and exploiting stationarity) we obtain
$$\gamma_k = \sum_{i=1}^{p} \alpha_i \gamma_{k-i}, \qquad k = 1, 2, \ldots$$
Use $p$ simultaneous equations to obtain the $\gamma$'s from the $\alpha$'s. For inference, one can solve a linear system to obtain the $\alpha$'s given estimates of the $\gamma$'s.

Example: AR(1) process, $\gamma_k = \alpha^k \gamma_0$.
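A minimal sketch of the inference direction, solving the Yule-Walker system for the $\alpha$'s from sample autocovariances (the function name and estimator details are my own; note $\gamma_{-i} = \gamma_i$ by stationarity, which makes the system matrix Toeplitz):

```python
import numpy as np
from scipy.linalg import toeplitz

def yule_walker(x, p):
    """Estimate AR(p) coefficients by solving the Yule-Walker equations."""
    x = np.asarray(x, float) - np.mean(x)
    n = len(x)
    # Sample autocovariances gamma_0, ..., gamma_p
    gamma = np.array([x[:n - k] @ x[k:] / n for k in range(p + 1)])
    # gamma_k = sum_i alpha_i gamma_{k-i} for k = 1..p
    return np.linalg.solve(toeplitz(gamma[:p]), gamma[1:])
```

For an AR(1) series this reduces to $\hat{\alpha} = \hat{\gamma}_1 / \hat{\gamma}_0$, consistent with $\gamma_k = \alpha^k \gamma_0$.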

9 / 24 [Figure: graphical model illustrating an AR(2) process, with sample paths for AR(2) with $\alpha_1 = 0.2$, $\alpha_2 = 0.1$ and AR(2) with $\alpha_1 = 1.0$, $\alpha_2 = 0.5$]

10 / 24 Vector AR processes
$$\mathbf{x}_t = \sum_{i=1}^{p} A_i \mathbf{x}_{t-i} + G\mathbf{w}_t$$
where the $A_i$'s and $G$ are square matrices. We can in general consider modelling multivariate (as opposed to univariate) time series.

An AR(2) process can be written as a vector AR(1) process:
$$\begin{pmatrix} x_t \\ x_{t-1} \end{pmatrix} = \begin{pmatrix} \alpha_1 & \alpha_2 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x_{t-1} \\ x_{t-2} \end{pmatrix} + \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} w_t \\ w_{t-1} \end{pmatrix}$$
In general an AR(p) process can be written as a vector AR(1) process with a $p$-dimensional state vector (cf. ODEs).
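A sketch of this construction (the helper name is mine): the companion matrix below packs the AR coefficients into the first row, and its eigenvalues lying inside the unit circle is equivalent to the roots of $\phi(B)$ lying outside it.

```python
import numpy as np

def companion(alphas):
    """Companion matrix turning an AR(p) process into a vector AR(1).

    The first row holds the AR coefficients; the subdiagonal identity
    block shifts the state vector (x_{t-1}, ..., x_{t-p}) along by one.
    """
    p = len(alphas)
    A = np.zeros((p, p))
    A[0, :] = alphas
    A[1:, :-1] = np.eye(p - 1)
    return A

# AR(2) with alpha_1 = 0.2, alpha_2 = 0.1: stationary, since all
# eigenvalues of the companion matrix lie inside the unit circle.
A = companion([0.2, 0.1])
print(np.abs(np.linalg.eigvals(A)) < 1)   # [ True  True ]
```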

11 / 24 Moving Average (MA) processes
$$x_t = \sum_{j=0}^{q} \beta_j w_{t-j} = \theta(B)w_t \qquad \text{(a linear filter)}$$
with scaling so that $\beta_0 = 1$ and $\theta(B) = 1 + \sum_{j=1}^{q} \beta_j B^j$.

Example: MA(1) process. [Figure: the noise sequence $w$ and the resulting series $x$]

12 / 24 We have $E[X(t)] = 0$, and
$$\mathrm{Var}[X(t)] = (1 + \beta_1^2 + \ldots + \beta_q^2)\,\sigma^2$$
$$\mathrm{Cov}[X(t), X(t-k)] = E\Big[\sum_{j=0}^{q} \beta_j w_{t-j} \sum_{i=0}^{q} \beta_i w_{t-k-i}\Big] = \begin{cases} \sigma^2 \sum_{j=0}^{q-k} \beta_{j+k}\beta_j & \text{for } k = 0, 1, \ldots, q \\ 0 & \text{for } k > q \end{cases}$$
Note that the covariance cuts off for $k > q$.
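A minimal sketch confirming the cut-off on a simulated series (the MA(2) coefficients are hypothetical choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# MA(2) example: x_t = w_t + 0.6 w_{t-1} + 0.3 w_{t-2}
beta = np.array([1.0, 0.6, 0.3])
q, sigma2, n = len(beta) - 1, 1.0, 200_000

w = rng.normal(scale=np.sqrt(sigma2), size=n + q)
x = np.convolve(w, beta, mode="valid")   # x_t = sum_j beta_j w_{t-j}

# Theoretical autocovariance: gamma_k = sigma^2 sum_j beta_{j+k} beta_j,
# and exactly zero for k > q.
for k in range(q + 2):
    theory = sigma2 * beta[k:] @ beta[:len(beta) - k] if k <= q else 0.0
    sample = np.mean(x[:n - k] * x[k:])
    print(f"k={k}: theory={theory:.3f}, sample={sample:.3f}")
```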

13 / 24 ARMA(p,q) processes
$$x_t = \sum_{i=1}^{p} \alpha_i x_{t-i} + \sum_{j=0}^{q} \beta_j w_{t-j}, \qquad \phi(B)x_t = \theta(B)w_t$$
Writing an AR(p) process as an MA($\infty$) process:
$$\phi(B)x_t = w_t \;\Rightarrow\; x_t = (1 - \alpha_1 B - \ldots - \alpha_p B^p)^{-1} w_t = (1 + \beta_1 B + \beta_2 B^2 + \ldots)w_t$$
Similarly an MA(q) process can be written as an AR($\infty$) process. The utility of ARMA(p,q) is its potential parsimony.
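The inverse operator can be expanded by a simple recursion. A sketch (the function name is mine, and the MA($\infty$) weights are written $\psi$ here to avoid clashing with the $\beta$'s above):

```python
import numpy as np

def ar_to_ma(alphas, m):
    """First m MA(infinity) weights psi of an AR(p) process.

    Solve phi(B) psi(B) = 1 term by term: psi_0 = 1 and
    psi_k = sum_{i=1}^{min(k, p)} alpha_i psi_{k-i}.
    """
    psi = np.zeros(m)
    psi[0] = 1.0
    for k in range(1, m):
        for i, a in enumerate(alphas, start=1):
            if i <= k:
                psi[k] += a * psi[k - i]
    return psi

# AR(1) with alpha = 0.5: psi_k = alpha^k
print(ar_to_ma([0.5], 5))   # [1.0, 0.5, 0.25, 0.125, 0.0625]
```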

14 / 24 The Fourier View

ARMA models are linear time-invariant systems. Hence sinusoids are their eigenfunctions (Fourier analysis). This means it is natural to consider the power spectrum of the ARMA process. The power spectrum $S(k)$ can be determined from the $\{\alpha\}, \{\beta\}$ coefficients; roughly speaking, $S(k)$ is the amount of power allocated on average to the eigenfunction $e^{2\pi i k t}$ (see the sketch below).

This is a useful way to understand some properties of ARMA processes, but we will not pursue it further here. If you want to know more, see e.g. Chatfield (1989) chapter 7 or Diggle (1990) chapter 4.
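A sketch using the standard ARMA spectral-density formula $S(f) = \sigma^2\,|\theta(e^{-2\pi i f})|^2 / |\phi(e^{-2\pi i f})|^2$; the formula itself is not spelled out on the slides, so treat it as an assumption here:

```python
import numpy as np

def arma_spectrum(alphas, betas, sigma2, freqs):
    """ARMA power spectrum evaluated at the given frequencies in [0, 0.5]."""
    z = np.exp(-2j * np.pi * np.asarray(freqs))
    theta = 1 + sum(b * z**(j + 1) for j, b in enumerate(betas))
    phi = 1 - sum(a * z**(i + 1) for i, a in enumerate(alphas))
    return sigma2 * np.abs(theta)**2 / np.abs(phi)**2

# AR(1) with alpha = 0.5: power concentrated at low frequencies
print(arma_spectrum([0.5], [], 1.0, [0.0, 0.25, 0.5]))   # [4.0, 0.8, 0.444...]
```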

15 / 24 Parameter Estimation

Let the vector of observations be $\mathbf{x} = (x(t_1), \ldots, x(t_n))^T$. Estimate and subtract the constant offset $\hat{\mu}$ if it is non-zero.

ARMA models driven by Gaussian noise are Gaussian processes. Thus the likelihood $L(\mathbf{x}; \alpha, \beta)$ is a multivariate Gaussian, and we can optimize it with respect to the parameters (e.g. by gradient ascent).

AR(p) models,
$$x_t = \sum_{i=1}^{p} \alpha_i x_{t-i} + w_t$$
can be viewed as the linear regression of $x_t$ on the $p$ previous time steps, so $\alpha$ and $\sigma^2$ can be estimated using linear regression. This viewpoint also enables the fitting of nonlinear AR models.
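A minimal sketch of the regression view (the function name and estimator details are my own):

```python
import numpy as np

def fit_ar_ls(x, p):
    """Least-squares fit of an AR(p) model.

    Regress x_t on (x_{t-1}, ..., x_{t-p}); the residual variance
    estimates sigma^2.
    """
    x = np.asarray(x, float) - np.mean(x)
    # Design matrix: row for time t holds (x_{t-1}, ..., x_{t-p})
    X = np.column_stack([x[p - i:len(x) - i] for i in range(1, p + 1)])
    y = x[p:]
    alpha, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma2 = np.mean((y - X @ alpha) ** 2)
    return alpha, sigma2
```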

16 / 24 Model Order Selection, References

For an MA(q) process there should be a cut-off in the autocorrelation function for lags greater than $q$. For general ARMA models this is a model order selection problem, discussed in an upcoming lecture.

Some useful books:
- The Analysis of Time Series: An Introduction. C. Chatfield, Chapman and Hall, 4th edition, 1989.
- Time Series: A Biostatistical Introduction. P. J. Diggle, Clarendon Press, 1990.

17 / 24 Linear-Gaussian HMMs
- HMM with continuous state space and observations
- The filtering problem is known as Kalman filtering

18 / 24 [Figure: graphical model with latent states $z_1, z_2, z_3, \ldots, z_N$ linked by the dynamics matrix $A$, each emitting an observation $x_1, \ldots, x_N$ through $C$]

Dynamical model:
$$z_{n+1} = Az_n + w_{n+1}$$
where $w_{n+1} \sim N(0, \Gamma)$ is Gaussian noise, i.e. $p(z_{n+1} \mid z_n) = N(Az_n, \Gamma)$.

19 / 24 Observation model:
$$x_n = Cz_n + v_n$$
where $v_n \sim N(0, \Sigma)$ is Gaussian noise, i.e. $p(x_n \mid z_n) = N(Cz_n, \Sigma)$.

Initialization: $p(z_1) = N(\mu_0, V_0)$.

20 / 24 Inference Problem: filtering

As the whole model is Gaussian, we only need to compute means and variances:
$$p(z_n \mid x_1, \ldots, x_n) = N(\mu_n, V_n)$$
$$\mu_n = E[z_n \mid x_1, \ldots, x_n]$$
$$V_n = E[(z_n - \mu_n)(z_n - \mu_n)^T \mid x_1, \ldots, x_n]$$
The recursive update is split into two parts:
- Time update: $p(z_n \mid x_1, \ldots, x_n) \to p(z_{n+1} \mid x_1, \ldots, x_n)$
- Measurement update: $p(z_{n+1} \mid x_1, \ldots, x_n) \to p(z_{n+1} \mid x_1, \ldots, x_n, x_{n+1})$

21 / 24 Time update: since $z_{n+1} = Az_n + w_{n+1}$,
$$E[z_{n+1} \mid x_1, \ldots, x_n] = A\mu_n$$
$$\mathrm{cov}(z_{n+1} \mid x_1, \ldots, x_n) \stackrel{\mathrm{def}}{=} P_n = AV_nA^T + \Gamma$$
Measurement update (like the posterior in Factor Analysis):
$$\mu_{n+1} = A\mu_n + K_{n+1}(x_{n+1} - CA\mu_n)$$
$$V_{n+1} = (I - K_{n+1}C)P_n$$
where
$$K_{n+1} = P_nC^T(CP_nC^T + \Sigma)^{-1}$$
$K_{n+1}$ is known as the Kalman gain matrix.
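The two updates translate directly into code. A minimal sketch of the recursion (the function name, and the convention of applying a measurement update to $x_1$ under the prior before any time update, are my own choices):

```python
import numpy as np

def kalman_filter(xs, A, C, Gamma, Sigma, mu0, V0):
    """Kalman filter: alternate time and measurement updates over xs."""
    mu, V = np.asarray(mu0, float), np.asarray(V0, float)
    I = np.eye(len(mu))
    means, covs = [], []
    for n, x in enumerate(xs):
        if n > 0:  # time update: P_n = A V_n A^T + Gamma
            mu = A @ mu
            V = A @ V @ A.T + Gamma
        # measurement update with Kalman gain K
        K = V @ C.T @ np.linalg.inv(C @ V @ C.T + Sigma)
        mu = mu + K @ (x - C @ mu)
        V = (I - K @ C) @ V
        means.append(mu)
        covs.append(V)
    return means, covs
```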

22 / 24 Simple example:
$$z_{n+1} = z_n + w_{n+1}, \qquad w_n \sim N(0, 1)$$
$$x_n = z_n + v_n, \qquad v_n \sim N(0, 1)$$
$$p(z_1) = N(0, \sigma^2)$$
In the limit $\sigma^2 \to \infty$ we find
$$\mu_3 = \frac{5x_3 + 2x_2 + x_1}{8}$$
Notice how later data has more weight. Compare $z_{n+1} = z_n$ (so that $w_n$ has zero variance); then
$$\mu_3 = \frac{x_3 + x_2 + x_1}{3}$$
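The weights can be checked numerically with the `kalman_filter` sketch above, using a large prior variance to stand in for $\sigma^2 \to \infty$ (the specific observation values are hypothetical):

```python
import numpy as np

# Random walk dynamics, unit noise variances, one-dimensional state.
A = C = Gamma = Sigma = np.eye(1)
xs = [np.array([1.0]), np.array([0.0]), np.array([0.0])]  # x1=1, x2=0, x3=0
means, _ = kalman_filter(xs, A, C, Gamma, Sigma,
                         mu0=np.zeros(1), V0=1e12 * np.eye(1))
print(means[-1])   # ~0.125 = 1/8, the weight mu_3 places on x_1
```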

23 / 24 Applications

"Much as a coffee filter serves to keep undesirable grounds out of your morning mug, the Kalman filter is designed to strip unwanted noise out of a stream of data." Barry Cipra, SIAM News 26(5), 1993

- Navigational and guidance systems
- Radar tracking
- Sonar ranging
- Satellite orbit determination

24 / 24 Extensions

Dealing with non-linearity:
- The Extended Kalman Filter (EKF): if $x_n = f(z_n) + v_n$ where $f$ is a non-linear function, we can linearize $f$, e.g. around $E[z_n \mid x_1, \ldots, x_{n-1}]$. This works for weak non-linearities (a minimal sketch follows below).
- For very non-linear problems use sampling methods (known as particle filters). Example: the work of Blake and Isard on tracking, see http://www.robots.ox.ac.uk/~vdg/dynamics.html
- It is possible to train KFs using a forward-backward algorithm.
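A minimal sketch of the EKF measurement update, linearizing $f$ around the predicted mean; the function names and arguments (`f`, its Jacobian `F_jac`) are assumptions for illustration, not from the slides:

```python
import numpy as np

def ekf_measurement_update(mu_pred, P_pred, x, f, F_jac, Sigma):
    """EKF measurement update for x_n = f(z_n) + v_n, v_n ~ N(0, Sigma)."""
    H = F_jac(mu_pred)                       # local linearization of f
    S = H @ P_pred @ H.T + Sigma             # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    mu = mu_pred + K @ (x - f(mu_pred))      # corrected mean
    P = (np.eye(len(mu_pred)) - K @ H) @ P_pred
    return mu, P
```

With a linear $f(z) = Cz$ and constant Jacobian $C$, this reduces to the measurement update on slide 21.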