Monte Carlo-based statistical methods (MASM11/FMS091)
Jimmy Olsson
Centre for Mathematical Sciences, Lund University, Sweden
Lecture 6: Sequential Monte Carlo methods II
February 3, 2012
Changes in HA1

Problem 4(c) in HA1 has been modified: you do NOT have to provide any confidence interval for the variance $\mathbb{V}(P(V_1) + P(V_2))$ and the standard deviation $\mathbb{D}(P(V_1) + P(V_2))$. Provide only point estimates of these quantities.
Plan of today's lecture

1. Last time: Sequential MC problems
2. Sequential importance sampling (SIS)
Last time: Sequential MC problems

In the sequential MC framework, we aim at sequentially estimating sequences $(\tau_n)_{n \ge 0}$ of expectations
$$\tau_n = \mathbb{E}_{f_n}[\phi(X_{0:n})] = \int_{\mathsf{X}^n} \phi(x_{0:n}) f_n(x_{0:n}) \, \mathrm{d}x_{0:n} \qquad (*)$$
over spaces $\mathsf{X}^n$ of increasing dimension, where the densities $(f_n)$ are known up to normalizing constants only, i.e., for every $n \ge 0$,
$$f_n(x_{0:n}) = \frac{z_n(x_{0:n})}{c_n},$$
where $c_n$ is an unknown constant.
Last time: Markov chains

Some applications involved the notion of Markov chains: a Markov chain on $\mathsf{X} \subseteq \mathbb{R}^d$ is a family of random variables (= stochastic process) $(X_k)_{k \ge 0}$ taking values in $\mathsf{X}$ such that
$$\mathbb{P}(X_{k+1} \in A \mid X_0, X_1, \ldots, X_k) = \mathbb{P}(X_{k+1} \in A \mid X_k).$$
The density $q$ of the distribution of $X_{k+1}$ given $X_k = x$ is called the transition density of $(X_k)$. Consequently,
$$\mathbb{P}(X_{k+1} \in A \mid X_k = x_k) = \int_A q(x_{k+1} \mid x_k) \, \mathrm{d}x_{k+1}.$$
As a first example we considered an AR(1) process: $X_0 = 0$, $X_{k+1} = \alpha X_k + \epsilon_{k+1}$, where $\alpha$ is a constant and $(\epsilon_k)$ are i.i.d. variables.
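A minimal MATLAB sketch of simulating the AR(1) chain above; the Gaussian i.i.d. noise $\epsilon_k \sim N(0, \sigma^2)$ and the parameter values are illustrative assumptions, not fixed by the slide.

% Simulate n steps of X_{k+1} = alpha*X_k + eps_{k+1}, started at X_0 = 0,
% assuming Gaussian noise (an illustrative assumption).
alpha = 0.8; sigma = 1; n = 100;   % hypothetical parameter values
X = zeros(n+1, 1);                 % X(1) holds X_0 = 0
for k = 1:n
    X(k+1) = alpha*X(k) + sigma*randn;
end
plot(0:n, X);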
Last time: Markov chains (cont.)

The following theorem provides the joint density $f_n(x_0, x_1, \ldots, x_n)$ of $X_0, X_1, \ldots, X_n$.

Theorem. Let $(X_k)$ be Markov with $X_0 \sim \chi$. Then for $n > 0$,
$$f_n(x_0, x_1, \ldots, x_n) = \chi(x_0) \prod_{k=0}^{n-1} q(x_{k+1} \mid x_k).$$

Corollary (The Chapman–Kolmogorov equation). Let $(X_k)$ be Markov. Then for $n > 1$,
$$f_n(x_n \mid x_0) = \int \cdots \int \left( \prod_{k=0}^{n-1} q(x_{k+1} \mid x_k) \right) \mathrm{d}x_1 \cdots \mathrm{d}x_{n-1}.$$
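To make the theorem concrete: continuing the AR(1) sketch above, the joint log-density of a simulated trajectory is a sum of log transition densities; since $X_0 = 0$ is deterministic there, the $\chi$-factor is omitted.

% log f_n(X_0,...,X_n) = log chi(X_0) + sum_k log q(X_{k+1} | X_k),
% with q(x'|x) = N(x'; alpha*x, sigma^2); chi is a point mass at 0 and is dropped.
logf = sum(log(normpdf(X(2:end), alpha*X(1:end-1), sigma)));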
Last time: Rare event analysis (REA) for Markov chains

Let $(X_k)$ be a Markov chain. Assume that we want to compute, for $n = 0, 1, 2, \ldots$,
$$\tau_n = \mathbb{E}(\phi(X_{0:n}) \mid X_{0:n} \in A) = \int_A \phi(x_{0:n}) \frac{f_n(x_{0:n})}{\mathbb{P}(X_{0:n} \in A)} \, \mathrm{d}x_{0:n} = \int_A \phi(x_{0:n}) \frac{\chi(x_0) \prod_{k=0}^{n-1} q(x_{k+1} \mid x_k)}{\mathbb{P}(X_{0:n} \in A)} \, \mathrm{d}x_{0:n},$$
where $A$ is a possibly rare event and $\mathbb{P}(X_{0:n} \in A)$ is generally unknown. We thus face a sequential MC problem $(*)$ with
$$\begin{cases} z_n(x_{0:n}) = \chi(x_0) \prod_{k=0}^{n-1} q(x_{k+1} \mid x_k), \\ c_n = \mathbb{P}(X_{0:n} \in A). \end{cases}$$
Last time: Estimation in general HMMs

Graphically, the hidden Markov chain $\cdots \to X_{k-1} \to X_k \to X_{k+1} \to \cdots$ generates one observation $Y_k$ per hidden state, according to
$$Y_k \mid X_k = x_k \sim g(y_k \mid x_k) \quad \text{(observation density)},$$
$$X_{k+1} \mid X_k = x_k \sim q(x_{k+1} \mid x_k) \quad \text{(transition density)},$$
$$X_0 \sim \chi(x_0) \quad \text{(initial distribution)}.$$
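A minimal sketch of simulating a trajectory $(X_{0:n}, Y_{0:n})$ from such an HMM; the concrete model below (Gaussian AR(1) hidden chain, additive Gaussian observation noise) is an illustrative assumption, not part of the slide.

% Toy HMM: X_0 ~ N(0,1), X_{k+1} | X_k ~ N(alpha*X_k, sigma^2)  (transition q),
%          Y_k | X_k ~ N(X_k, tau^2)                            (observation g).
alpha = 0.8; sigma = 1; tau = 0.5; n = 100;   % hypothetical parameters
X = zeros(n+1, 1); Y = zeros(n+1, 1);
X(1) = randn;                          % draw X_0 from chi
Y(1) = X(1) + tau*randn;               % observe Y_0
for k = 1:n
    X(k+1) = alpha*X(k) + sigma*randn; % propagate the hidden chain
    Y(k+1) = X(k+1) + tau*randn;       % generate the observation
end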
Last time: Estimation in general HMMs

In an HMM, the smoothing distribution $f_n(x_{0:n} \mid y_{0:n})$ is the conditional distribution of the hidden states $X_{0:n}$ given $Y_{0:n} = y_{0:n}$.

Theorem (Smoothing distribution).
$$f_n(x_{0:n} \mid y_{0:n}) = \frac{\chi(x_0) g(y_0 \mid x_0) \prod_{k=1}^{n} g(y_k \mid x_k) q(x_k \mid x_{k-1})}{L_n(y_{0:n})},$$
where
$$L_n(y_{0:n}) = \text{density of the observations } y_{0:n} = \int \cdots \int \chi(x_0) g(y_0 \mid x_0) \prod_{k=1}^{n} g(y_k \mid x_k) q(x_k \mid x_{k-1}) \, \mathrm{d}x_0 \cdots \mathrm{d}x_n.$$
Last time: Estimation in general HMMs

Assume that we want to compute, online for $n = 0, 1, 2, \ldots$,
$$\tau_n = \mathbb{E}(\phi(X_{0:n}) \mid Y_{0:n} = y_{0:n}) = \int \cdots \int \phi(x_{0:n}) f_n(x_{0:n} \mid y_{0:n}) \, \mathrm{d}x_0 \cdots \mathrm{d}x_n = \int \cdots \int \phi(x_{0:n}) \frac{\chi(x_0) g(y_0 \mid x_0) \prod_{k=1}^{n} g(y_k \mid x_k) q(x_k \mid x_{k-1})}{L_n(y_{0:n})} \, \mathrm{d}x_0 \cdots \mathrm{d}x_n,$$
where $L_n(y_{0:n})$ (an unwieldy high-dimensional integral) is generally unknown. We thus face a sequential MC problem $(*)$ with
$$\begin{cases} z_n(x_{0:n}) = \chi(x_0) g(y_0 \mid x_0) \prod_{k=1}^{n} g(y_k \mid x_k) q(x_k \mid x_{k-1}), \\ c_n = L_n(y_{0:n}). \end{cases}$$
Sequential importance sampling (SIS)

It is natural to aim at solving the problem using standard self-normalized IS. However, the generated samples $(X_{0:n}^i, \omega_n(X_{0:n}^i))$ should be such that
- given $(X_{0:n}^i, \omega_n(X_{0:n}^i))$, the next sample $(X_{0:n+1}^i, \omega_{n+1}(X_{0:n+1}^i))$ is easily generated, with a complexity that does not increase with $n$ (online sampling);
- the approximation remains stable as $n$ increases.

We call each draw $X_{0:n}^i = (X_0^i, \ldots, X_n^i)$ a particle. Moreover, we denote the importance weights by $\omega_n^i = \omega_n(X_{0:n}^i)$.
SIS

We proceed recursively. Assume that we have generated particles $(X_{0:n}^i)$ from $g_n(x_{0:n})$ so that
$$\sum_{i=1}^{N} \frac{\omega_n^i}{\sum_{\ell=1}^{N} \omega_n^\ell} \, \phi(X_{0:n}^i) \approx \mathbb{E}_{f_n}[\phi(X_{0:n})],$$
where, as usual, $\omega_n^i = \omega_n(X_{0:n}^i) = z_n(X_{0:n}^i)/g_n(X_{0:n}^i)$.

Key trick: choose an instrumental distribution satisfying
$$g_{n+1}(x_{0:n+1}) = g_{n+1}(x_{n+1} \mid x_{0:n}) \, g_n(x_{0:n}).$$
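In code, the self-normalized approximation above is a one-liner; a sketch, assuming the particles are stored row-wise in an N-by-(n+1) matrix part, the unnormalized weights in an N-by-1 vector w, and taking the hypothetical test function $\phi(x_{0:n}) = x_n$:

phi_vals = part(:, end);            % hypothetical phi(x_{0:n}) = x_n
tau_hat = sum(w.*phi_vals)/sum(w);  % sum_i omega_n^i/(sum_l omega_n^l)*phi(X_{0:n}^i)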
SIS (cont.)

Consequently, $X_{0:n+1}^i$ and $\omega_{n+1}^i$ can be generated by keeping the previous $X_{0:n}^i$, simulating $X_{n+1}^i \sim g_{n+1}(x_{n+1} \mid X_{0:n}^i)$, setting $X_{0:n+1}^i = (X_{0:n}^i, X_{n+1}^i)$, and computing
$$\omega_{n+1}^i = \frac{z_{n+1}(X_{0:n+1}^i)}{g_{n+1}(X_{0:n+1}^i)} = \frac{z_{n+1}(X_{0:n+1}^i)}{z_n(X_{0:n}^i)\, g_{n+1}(X_{n+1}^i \mid X_{0:n}^i)} \cdot \frac{z_n(X_{0:n}^i)}{g_n(X_{0:n}^i)} = \frac{z_{n+1}(X_{0:n+1}^i)}{z_n(X_{0:n}^i)\, g_{n+1}(X_{n+1}^i \mid X_{0:n}^i)} \, \omega_n^i.$$
SIS (cont.)

Voilà, the sample $(X_{0:n+1}^i, \omega_{n+1}^i)$ can now be used to approximate $\mathbb{E}_{f_{n+1}}[\phi(X_{0:n+1})]$! So, by running the SIS algorithm, we have updated the approximation
$$\sum_{i=1}^{N} \frac{\omega_n^i}{\sum_{\ell=1}^{N} \omega_n^\ell} \, \phi(X_{0:n}^i) \approx \mathbb{E}_{f_n}[\phi(X_{0:n})]$$
to the approximation
$$\sum_{i=1}^{N} \frac{\omega_{n+1}^i}{\sum_{\ell=1}^{N} \omega_{n+1}^\ell} \, \phi(X_{0:n+1}^i) \approx \mathbb{E}_{f_{n+1}}[\phi(X_{0:n+1})]$$
by only adding a component $X_{n+1}^i$ to $X_{0:n}^i$ and sequentially updating the weights.
SIS: Pseudo code

for i = 1, ..., N do
    draw $X_0^i \sim g_0$
    set $\omega_0^i = z_0(X_0^i) / g_0(X_0^i)$
end for
return $(X_0^i, \omega_0^i)$
for k = 0, 1, 2, ... do
    for i = 1, ..., N do
        draw $X_{k+1}^i \sim g_{k+1}(x_{k+1} \mid X_{0:k}^i)$
        set $X_{0:k+1}^i = (X_{0:k}^i, X_{k+1}^i)$
        set $\omega_{k+1}^i = \dfrac{z_{k+1}(X_{0:k+1}^i)}{z_k(X_{0:k}^i)\, g_{k+1}(X_{k+1}^i \mid X_{0:k}^i)} \, \omega_k^i$
    end for
    return $(X_{0:k+1}^i, \omega_{k+1}^i)$
end for
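A generic MATLAB rendering of the pseudo code; the function handles below (g0_rnd, z0_dens, g0_dens, g_rnd, w_ratio) are placeholders for the problem-specific ingredients and are not fixed by the slides.

% N particles, n time steps.
%   g0_rnd(N)           N-by-1 draws from g_0
%   z0_dens, g0_dens    z_0 and g_0, evaluated elementwise
%   g_rnd(part)         N-by-1 draws, row i from g_{k+1}(. | part(i,:))
%   w_ratio(part, new)  N-by-1 values of z_{k+1}/(z_k * g_{k+1})
part = g0_rnd(N);
w = z0_dens(part)./g0_dens(part);
for k = 0:(n-1)
    new = g_rnd(part);              % propagate each particle
    w = w.*w_ratio(part, new);      % sequential weight update
    part = [part, new];             % append the new component
end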
Example: REA reconsidered

We consider again the example of REA for Markov chains ($\mathsf{X} = \mathbb{R}$):
$$\tau_n = \mathbb{E}(\phi(X_{0:n}) \mid a \le X_\ell,\ \ell = 0, \ldots, n) = \int_{(a,\infty)^{n+1}} \phi(x_{0:n}) \underbrace{\frac{\chi(x_0) \prod_{k=0}^{n-1} q(x_{k+1} \mid x_k)}{\mathbb{P}(a \le X_\ell,\ \forall \ell)}}_{= z_n(x_{0:n})/c_n} \, \mathrm{d}x_{0:n}.$$
Choose $g_{k+1}(x_{k+1} \mid x_{0:k})$ to be the conditional density of $X_{k+1}$ given $X_k$ and $X_{k+1} \ge a$:
$$g_{k+1}(x_{k+1} \mid x_{0:k}) = \{\text{cf. HA1, Problem 1}\} = \frac{q(x_{k+1} \mid x_k)}{\int_a^\infty q(z \mid x_k) \, \mathrm{d}z}.$$
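For a Gaussian AR(1) transition density $q(\cdot \mid x) = N(\alpha x, \sigma^2)$, a draw from this truncated density can be obtained by inverting the conditional CDF; a sketch under that assumption (the Gaussian form matches the Matlab example below; whether this is exactly the construction of HA1, Problem 1 is not stated here):

% Draw from q(.|x) restricted to (a, infinity): simulate U ~ Unif(F(a), 1)
% and return F^{-1}(U), where F is the CDF of N(alpha*x, sigma^2).
trunk_td_rnd = @(x) norminv(normcdf(a, alpha*x, sigma) ...
    + rand(size(x)).*(1 - normcdf(a, alpha*x, sigma)), alpha*x, sigma);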
Example: REA

This implies that
$$g_n(x_{0:n}) = \frac{\chi(x_0)}{\int_a^\infty \chi(x_0) \, \mathrm{d}x_0} \prod_{k=0}^{n-1} \frac{q(x_{k+1} \mid x_k)}{\int_a^\infty q(z \mid x_k) \, \mathrm{d}z}.$$
In addition, the weights are updated as
$$\omega_{k+1}^i = \frac{z_{k+1}(X_{0:k+1}^i)}{z_k(X_{0:k}^i)\, g_{k+1}(X_{k+1}^i \mid X_{0:k}^i)} \, \omega_k^i = \frac{\prod_{\ell=0}^{k} q(X_{\ell+1}^i \mid X_\ell^i)}{\prod_{\ell=0}^{k-1} q(X_{\ell+1}^i \mid X_\ell^i) \cdot q(X_{k+1}^i \mid X_k^i)\big/\int_a^\infty q(z \mid X_k^i)\,\mathrm{d}z} \, \omega_k^i = \int_a^\infty q(z \mid X_k^i) \, \mathrm{d}z \cdot \omega_k^i.$$
Example: REA; Matlab implementation for AR(1) process with Gaussian noise

% design of instrumental distribution:
int = @(x) 1 - normcdf(a, alpha*x, sigma);
trunk_td_rnd = @(x) ...  % use HA1, Problem 1, to draw from
                         % the truncated transition density;

% SIS (N particles, starting position a):
part = a*ones(N, 1);     % initialization in a
w = ones(N, 1);
for k = 1:(n-1),         % main loop
    part_mut = trunk_td_rnd(part);  % propagate particles
    w = w.*int(part);               % weight update: int_a^inf q(z | X_k) dz
    part = part_mut;
end
c = mean(w);             % estimated probability
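The same run also yields point estimates of the conditional expectations $\tau_n$; as a sketch, for the hypothetical choice of $\phi$ equal to the terminal state (which is exactly what part holds when the loop ends), one could append:

tau_hat = sum(w.*part)/sum(w);   % self-normalized estimate of the conditional
                                 % expectation of the terminal state given that
                                 % the whole trajectory stays above a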
REA: Importance weight distribution

Serious drawback of SIS: the importance weights degenerate!

[Figure: histograms of the importance weights (base-10 logarithm) at n = 1, n = 10, and n = 100; as n grows, the weights spread over many orders of magnitude.]
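Histograms like the ones above can be produced directly from a SIS run; a minimal sketch, assuming the unnormalized importance weights at the current time step are stored in the vector w as in the code above:

hist(log10(w/sum(w)), 30);                         % 30 bins
xlabel('Importance weights (base 10 logarithm)');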
What's next?

Weight degeneration is a universal problem with the SIS method and is due to the fact that the particle weights are formed through successive multiplications. This drawback prevented the SIS method from being practically useful for several decades. Next week we will discuss an efficient solution to this problem: SIS with resampling.