Bayesian analysis of non-Gaussian Long-Range Dependent processes

Tim Graves¹, Christian Franzke², Nick Watkins², and Bobby Gramacy³
¹ Statistical Laboratory, University of Cambridge, Cambridge, UK
² British Antarctic Survey, Cambridge, UK
³ Booth School of Business, The University of Chicago, Chicago, USA
May 18, 2012

Outline
1 Introduction
2 Exact Bayesian analysis for Gaussian case
3 Approximate Bayesian analysis for general case


Long Range Dependence

Definition. A Long Range Dependent process is a stationary process for which $\sum_{k=-\infty}^{\infty} \rho(k) = \infty$.

"... [T]he stationary long memory processes form a layer among the stationary processes that is near the boundary with non-stationary processes, or, alternatively, as the layer separating the non-stationary processes from the usual stationary processes." [Samorodnitsky, 2006]

Definition. A Long Range Dependent process is a stationary process for which $\rho(k) \sim c k^{2d-1}$, $0 < d < \frac{1}{2}$.

Definition. A Long Range Dependent process is a stationary process for which $X_t = \sum_{k=0}^{\infty} \psi_k \epsilon_{t-k}$ with $\psi_k \sim c k^{d-1}$, $0 < d < \frac{1}{2}$.

ARFIMA processes

Definition. A process $\{X_t\}$ is an ARFIMA(p, d, q) process if it is the solution to
$$\Phi(B)(1 - B)^d X_t = \Theta(B)\varepsilon_t,$$
where $\Phi(z) = 1 + \sum_{j=1}^{p} \phi_j z^j$, $\Theta(z) = 1 + \sum_{j=1}^{q} \theta_j z^j$, and the innovations $\{\varepsilon_t\}$ are iid with mean 0 and variance $\sigma^2 < \infty$. We say that $\{X_t\}$ is an ARFIMA(p, d, q) process with mean $\mu$ if $\{X_t - \mu\}$ is an ARFIMA(p, d, q) process.
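To make the definition concrete, here is a minimal simulation sketch for the pure ARFIMA(0, d, 0) case (our illustration, not the authors' code; it assumes numpy). It truncates the MA(∞) representation $X_t = \sum_{k \ge 0} \psi_k \varepsilon_{t-k}$, whose weights satisfy $\psi_0 = 1$ and $\psi_k = \psi_{k-1}(k - 1 + d)/k$, obtained by expanding $(1 - B)^{-d}$.

```python
import numpy as np

def ma_coefficients(d: float, n_lags: int) -> np.ndarray:
    """MA(inf) weights of (1 - B)^{-d}: psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + d) / k."""
    psi = np.ones(n_lags)
    for k in range(1, n_lags):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    return psi

def simulate_arfima0d0(d: float, n: int, burn: int = 5000, seed: int = 0) -> np.ndarray:
    """Simulate (1 - B)^d X_t = eps_t with iid Gaussian innovations."""
    rng = np.random.default_rng(seed)
    psi = ma_coefficients(d, n + burn)
    eps = rng.standard_normal(n + burn)
    x = np.convolve(eps, psi)[: n + burn]   # x_t = sum_{k<=t} psi_k eps_{t-k}
    return x[burn:]                         # drop burn-in to mask the truncation

x = simulate_arfima0d0(d=0.25, n=1000)      # cf. the first example below
```

The generous burn-in matters: a truncated weight sequence understates exactly the long-range correlations the model is about.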

ARFIMA parameters
µ: location parameter
σ: scale parameter
d: long memory parameter (long memory process iff 0 < d < 0.5)
φ: p-dimensional short memory parameter
θ: q-dimensional short memory parameter

Which parameters are of interest? When considering long memory processes, we are usually primarily interested in the parameter d (and possibly µ). The parameters σ, φ, θ (and even p, q) are essentially nuisance parameters.

Exact Bayesian analysis for Gaussian case

Outline of problem
Assume a Gaussian distribution for the innovations.
Bayesian approach: use flat priors for µ, log(σ), and d... but any set of (independent) priors can be used if desired.
Even assuming Gaussianity, the likelihood for d is very complex, so an analytic posterior is impossible to find.
We must resort to MCMC methods to obtain samples from the posterior.
We don't want to assume the form of the short memory (i.e. p, q), so we must use Reversible-Jump (RJ) MCMC [Green, 1995]; a fixed-dimension sketch follows this list.
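As a hedged illustration of the fixed-dimension core of such a sampler (our sketch, not the authors' implementation), here is a random-walk Metropolis scheme for (µ, log σ, d) under flat priors. It assumes a function loglik(mu, sigma, d, x) returning the exact Gaussian log-likelihood, e.g. as sketched in the next section; the reversible-jump moves over (p, q) are omitted, so p = q = 0 is fixed.

```python
import numpy as np

def metropolis(x, loglik, n_iter=10_000, step=(0.05, 0.05, 0.02), seed=1):
    """Random-walk Metropolis over theta = (mu, log sigma, d), flat priors."""
    rng = np.random.default_rng(seed)
    theta = np.array([x.mean(), np.log(x.std()), 0.25])  # crude starting point
    lp = loglik(theta[0], np.exp(theta[1]), theta[2], x)
    draws = np.empty((n_iter, 3))
    for i in range(n_iter):
        prop = theta + rng.normal(0.0, step)             # symmetric proposal
        if 0.0 < prop[2] < 0.5:                          # prior support for d
            lp_prop = loglik(prop[0], np.exp(prop[1]), prop[2], x)
            if np.log(rng.uniform()) < lp_prop - lp:     # MH accept/reject
                theta, lp = prop.copy(), lp_prop
        draws[i] = theta
    return draws
```

As the next slide notes, the proposal scales need careful tuning in practice, and correlated parameters should be updated in blocks.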

Outline of method
Re-parameterisation of the model to enforce stationarity constraints on Φ and Θ.
Efficient calculation of the Gaussian likelihood (the long memory correlation structure prevents use of standard quick methods); see the sketch after this list.
The necessary use of the Metropolis-Hastings algorithm requires careful selection of proposal distributions.
Correlation between parameters (e.g. φ and d) requires blocking.
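The likelihood step might be implemented as follows: a sketch (our reconstruction under stated assumptions, not the authors' code) of the exact Gaussian log-likelihood for ARFIMA(0, d, 0), computed in O(n²) by the Durbin-Levinson recursion, since the slowly decaying long memory autocovariances rule out the usual fast approximations. It uses the standard closed form γ(0) = σ²Γ(1 − 2d)/Γ(1 − d)² and γ(k) = γ(k − 1)(k − 1 + d)/(k − d).

```python
import numpy as np
from scipy.special import gammaln

def acvf_arfima0d0(d, sigma, n):
    """Autocovariances gamma(0), ..., gamma(n-1) of (1 - B)^d X_t = eps_t."""
    g = np.empty(n)
    g[0] = sigma**2 * np.exp(gammaln(1 - 2 * d) - 2 * gammaln(1 - d))
    for k in range(1, n):
        g[k] = g[k - 1] * (k - 1 + d) / (k - d)
    return g

def loglik(mu, sigma, d, x):
    """Exact Gaussian log-likelihood via Durbin-Levinson one-step prediction."""
    z = np.asarray(x, dtype=float) - mu
    n = len(z)
    g = acvf_arfima0d0(d, sigma, n)
    phi = np.zeros(n)                    # phi[j-1] holds phi_{t,j}
    v = g[0]                             # one-step prediction error variance
    ll = -0.5 * (np.log(2 * np.pi * v) + z[0] ** 2 / v)
    for t in range(1, n):
        prev = phi[:t - 1].copy()        # phi_{t-1, 1..t-1}
        k = (g[t] - prev @ g[t - 1:0:-1]) / v
        phi[:t - 1] = prev - k * prev[::-1]
        phi[t - 1] = k
        v *= 1 - k ** 2
        pred = phi[:t] @ z[t - 1::-1]    # best linear predictor of z[t]
        ll += -0.5 * (np.log(2 * np.pi * v) + (z[t] - pred) ** 2 / v)
    return ll
```

Extending this beyond (0, d, 0) requires the full ARFIMA autocovariance, which is where most of the computational effort in the exact approach goes.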

Example: Pure Gaussian Long Range Dependence
[Figure: simulated sample path of length 1000 from $(1 - B)^{0.25} X_t = \varepsilon_t$.]
[Figure: density estimate of the posterior π(d), on 0.0 < d < 0.5.]
Similarly good results for µ and σ.
The posterior model probability for the (0, d, 0) model was 70%.

Example: Corrupted Gaussian Long Range Dependence
[Figure: simulated sample path of length 1000 from $(1 + 0.75B)(1 - B)^{0.25} X_t = (1 + 0.5B)\varepsilon_t$.]
[Figure: density estimate of the posterior π(d), on 0.0 < d < 0.5.]
The posterior model probability for the (1, d, 1) model was 77%.
The posterior model probability for the (0, d, 0) model was 0%.

Dependence of posterior variance on n
[Figure: log-log plot of the posterior standard deviation σ_d against sample size n = 2^7, ..., 2^14, with an $n^{-1/2}$ reference line.]
Empirically, $\sigma_d \propto n^{-1/2}$.

Approximate Bayesian analysis for general case

Assumptions and general method
Drop the Gaussianity assumption and replace it with a more general distribution (e.g. α-stable).
Seek joint inference about d and α.
Initially (for simplicity) we assume no short memory, i.e. we assume a (0, d, 0) model.
Infinite variance means that the auto-covariance approach is no longer sound.
The lack of a closed form for the α-stable density implies a lack of closed form for the likelihood.
A simulation sketch follows this list.
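For intuition about data in this regime, a quick simulation sketch (our illustration, reusing ma_coefficients from the earlier Gaussian sketch and scipy's levy_stable). Note the stable ARFIMA(0, d, 0) is well defined only for d < 1 − 1/α, which the example respects.

```python
import numpy as np
from scipy.stats import levy_stable

def simulate_stable_arfima(d, alpha, beta, n, burn=5000, seed=0):
    """Simulate (1 - B)^d X_t = eps_t with iid alpha-stable innovations."""
    rng = np.random.default_rng(seed)
    psi = ma_coefficients(d, n + burn)       # from the earlier Gaussian sketch
    eps = levy_stable.rvs(alpha, beta, size=n + burn, random_state=rng)
    return np.convolve(eps, psi)[: n + burn][burn:]

x = simulate_stable_arfima(d=0.15, alpha=1.5, beta=0.0, n=1000)
```

The occasional huge innovations are what make the sample paths in the examples below so much spikier than the Gaussian ones.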

Solution
Approximate the long memory process as a very high order AR process.
Construct the likelihood sequentially and evaluate it using specialised efficient methods:
$$f(x_1, \ldots, x_t \mid H) = f(x_t \mid x_{t-1}, \ldots, x_1, H)\, f(x_{t-1}, \ldots, x_1 \mid H),$$
where H is the finite recent history of the process, $x_0, x_{-1}, \ldots, x_{-n}$.
Use auxiliary variables to integrate out the (unknown) history H.
In practice, setting $H = (\bar{x}, \ldots, \bar{x})$ suffices, providing an enormous computational saving; an AR-approximation sketch follows.
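A minimal sketch of the AR approximation itself (ours, under stated assumptions, not the authors' implementation): truncate $(1 - B)^d = \sum_k \pi_k B^k$ at a high order P, recover approximate innovations $\varepsilon_t = \sum_{k \le P} \pi_k X_{t-k}$, and score them under the α-stable density. scipy.stats.levy_stable evaluates that density numerically (there is no closed form), so this is slow but workable; the specialised efficient methods and the auxiliary-variable treatment of H are not reproduced here, and the history is crudely fixed at the sample mean.

```python
import numpy as np
from scipy.stats import levy_stable

def ar_coefficients(d, P):
    """AR(inf) weights of (1 - B)^d: pi_0 = 1, pi_k = pi_{k-1} * (k - 1 - d) / k."""
    pi = np.ones(P + 1)
    for k in range(1, P + 1):
        pi[k] = pi[k - 1] * (k - 1 - d) / k
    return pi

def loglik_stable(d, alpha, beta, scale, x, P=512):
    """Approximate log-likelihood of x under (1 - B)^d X_t = eps_t,
    eps_t ~ stable(alpha, beta), with the pre-sample history H set to x-bar."""
    pi = ar_coefficients(d, P)
    xpad = np.concatenate([np.full(P, x.mean()), x])   # crude stand-in for H
    eps = np.convolve(xpad, pi, mode="valid")          # eps_t for t = 1, ..., n
    return levy_stable.logpdf(eps, alpha, beta, loc=0.0, scale=scale).sum()
```

This plugs into a sampler like the earlier Metropolis sketch, suitably extended so the chain's state is (d, α, β, scale).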

Example: Pure symmetric α-stable long memory
[Figure: simulated sample path of length 1000 from $(1 - B)^{0.15} X_t = \varepsilon_t$ with α = 1.5.]
[Figure: density estimate of the posterior π(d), supported on roughly 0.13–0.18.]
[Figure: density estimate of the posterior π(α), supported on roughly 1.3–1.6.]
[Figure: scatter plot of posterior samples of α against d.]
Good estimation of all parameters.
The posteriors of d and α are independent.

Dependence of posterior variance on n
[Figure: log-log plot of σ_d against sample size n = 2^7, ..., 2^14 for the α-stable model.]
Again, $\sigma_d \propto n^{-1/2}$.

Example: Pure asymmetric α-stable long memory
[Figure: simulated sample path of length 1000 from $(1 - B)^{0.1} X_t = \varepsilon_t$ with α = 1.5, β = 0.5.]
[Figure: density estimate of the posterior π(β), supported on roughly 0.2–0.8.]
Good estimation of all other parameters.

References
Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82(4), 711–732.
Samorodnitsky, G. (2006). Long range dependence. Foundations and Trends in Stochastic Systems, 1(3), 163–257.