Inference on Phase-type Models via MCMC


Inference on Phase-type Models via MCMC with application to networks of repairable redundant systems Louis JM Aslett and Simon P Wilson Trinity College Dublin 28th June 2012

Toy Example: Redundant Repairable Components

Two components, C1 and C2:

State 1: both C1 and C2 work
State 2: C1 failed, C2 working
State 3: C1 working, C2 failed
State 4: system failed

[Figure: a sample path of a general stochastic process Y(t) moving through states 1-4 until "System Failed" at state 4.]

Continuous-time Markov Chain Model

State 1: both C1 and C2 work; State 2: C1 failed, C2 working; State 3: C1 working, C2 failed; State 4: system failed. Each component fails at rate λ_f and is repaired at rate λ_r.

π = (1, 0, 0, 0)

T = [ -2λ_f     λ_f            λ_f           0    ]
    [  λ_r     -(λ_r + λ_f)    0             λ_f  ]
    [  λ_r      0             -(λ_r + λ_f)   λ_f  ]
    [  0        0              0             0    ]

Inferential Setting

Cano & Rios (2006) provide conjugate posterior calculations in the context of analysing repairable systems when the stochastic process leading to absorption is observed.

Data: for each system failure time, one has:
- starting state
- length of time in each state
- number of transitions between each state
- ultimate system failure time

Reduced information scenario: Bladt et al. (2003) provide a Bayesian MCMC algorithm, while Asmussen et al. (1996) provide a frequentist EM algorithm.

Definition of Phase-type Distributions

An absorbing continuous-time Markov chain is one in which there is a state that, once entered, is never left. That is, the (n+1)-state intensity matrix can be written

T = [ S  s ]
    [ 0  0 ]

where S is n × n, s is n × 1 and 0 is 1 × n, with s = -Se.

Then, a Phase-type distribution (PHT) is defined to be the distribution of the time to entering the absorbing state:

Y ~ PHT(π, S)  ⟹  F_Y(y) = 1 - π^T exp{yS} e,   f_Y(y) = π^T exp{yS} s
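Both formulae reduce to a matrix exponential, so they are easy to evaluate numerically. A minimal sketch using the toy example's sub-intensity matrix, assuming illustrative rates λ_f = 1.8 and λ_r = 9.5:

```python
import numpy as np
from scipy.linalg import expm

# Toy example rates (assumed for illustration).
lam_f, lam_r = 1.8, 9.5

# Sub-intensity matrix S over the transient states 1-3 and exit vector s = -Se.
S = np.array([[-2 * lam_f,            lam_f,            lam_f],
              [      lam_r, -(lam_r + lam_f),              0.0],
              [      lam_r,              0.0, -(lam_r + lam_f)]])
s = -S @ np.ones(3)
pi = np.array([1.0, 0.0, 0.0])     # start with both components working

def pht_pdf(y):
    """f_Y(y) = pi^T exp(yS) s"""
    return pi @ expm(y * S) @ s

def pht_cdf(y):
    """F_Y(y) = 1 - pi^T exp(yS) e"""
    return 1.0 - pi @ expm(y * S) @ np.ones(3)
```

Here `pht_cdf(0.0)` is 0 and the distribution places all its mass on the time to absorption.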

Relating to the Toy Example

With states as before (1: both C1 and C2 work; 2: C1 failed, C2 working; 3: C1 working, C2 failed; 4: system failed), the toy example's intensity matrix partitions as T = [S s; 0 0] with

S = [ -2λ_f     λ_f            λ_f          ]
    [  λ_r     -(λ_r + λ_f)    0            ]
    [  λ_r      0             -(λ_r + λ_f)  ]

s = (0, λ_f, λ_f)^T

so that f_Y(y) = π^T exp{yS} s and F_Y(y) = 1 - π^T exp{yS} e.

Bladt et al: Gibbs Sampling from Posterior

Strategy is a Gibbs MCMC algorithm which achieves the goal of simulating from p(π, S | y) by sampling from p(π, S, paths | y) through the iterative process:

p(π, S | paths, y)
p(paths | π, S, y)

Bladt et al: Metropolis-Hastings Simulation of Process

In summary:
- A chain can be simulated from p(path | π, S, Y_i ≥ y_i) trivially by rejection sampling.
- A Metropolis-Hastings acceptance ratio (a ratio of exit rates) exists such that truncating the chain to time y_i (at which point it absorbs) yields a draw from p(path | Y_i = y_i).

The layered scheme:
Metropolis-Hastings: p(path | π, S, Y = y)
Rejection Sampling: p(path | π, S, Y ≥ y)
CTMC Sampling: p(path | π, S)
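The bottom two layers can be sketched directly: simulate the absorbing CTMC forward, and wrap it in rejection for the Y ≥ y condition. A minimal sketch (function names illustrative, toy-example rates assumed):

```python
import numpy as np

def simulate_ctmc_path(pi, S, rng):
    """Simulate one path of an absorbing CTMC with initial distribution pi
    and sub-intensity matrix S; returns visited states and absorption time."""
    n = len(pi)
    s = -S @ np.ones(n)                      # exit rates into absorption
    state = rng.choice(n, p=pi)              # starting state ~ pi
    t, states = 0.0, [state]
    while True:
        rate = -S[state, state]
        t += rng.exponential(1.0 / rate)     # holding time in current state
        # next state j with prob S[state, j]/rate; absorb with prob s[state]/rate
        probs = np.append(S[state].clip(min=0.0), s[state]) / rate
        nxt = rng.choice(n + 1, p=probs)
        if nxt == n:                         # absorbing state entered
            return states, t
        state = nxt
        states.append(state)

def rejection_sample_path(pi, S, y, rng):
    """Draw from p(path | pi, S, Y >= y): resimulate until absorption time >= y."""
    while True:
        states, t = simulate_ctmc_path(pi, S, rng)
        if t >= y:
            return states, t

# Toy example rates (assumed): lam_f = 1.8, lam_r = 9.5.
lam_f, lam_r = 1.8, 9.5
S = np.array([[-2 * lam_f,            lam_f,            lam_f],
              [      lam_r, -(lam_r + lam_f),              0.0],
              [      lam_r,              0.0, -(lam_r + lam_f)]])
pi = np.array([1.0, 0.0, 0.0])
rng = np.random.default_rng(1)
states, t_abs = simulate_ctmc_path(pi, S, rng)
```

The rejection layer simply resimulates until absorption occurs after y; this is exactly the step that can stall when P(Y ≥ y) is tiny.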

Motivation for Modifications

1. Certain state transitions make no physical sense (e.g. 2 → 3 in the earlier example).
2. When part of a larger system, it is highly likely there will be censored observations.
3. Where there is no reason to believe distributional differences between parameters, they should (in an idealised modelling sense) be constrained to be equal. This is as much to assist with reducing parameter dimensionality.
4. Computation time!

Toy Example Results

100 uncensored observations simulated from PHT with

S = [ -3.6    1.8    1.8  ]
    [  9.5  -11.3    0    ]
    [  9.5    0    -11.3  ]

i.e. λ_f = 1.8, λ_r = 9.5.

[Figure: posterior densities of the failure-rate parameters (S_12, S_13, s_2, s_3, λ_f), over parameter values roughly 0.5 to 2.0.]

[Figure: posterior densities of the repair-rate parameters (S_21, S_31, λ_r), over parameter values roughly 8 to 15. Reliability is less sensitive to λ_r (Daneshkhah & Bedford 2008).]

[Figure: posterior densities of the structurally zero parameters (s_1, S_23, S_32), concentrated between 0.0 and 0.5.]

The Big Issue

Intractable computation time for many applications!

1. Longer chains, and MCMC jumps to states for which observations are far in the tails, can stall the rejection sampling step of the MH algorithm. For example, with P(Y_i ≥ y_i | π, S) = 10^-6, the rejection sampling draw from p(paths | π, S, y) needs E(RS iter.) = 10^6 iterations (95% CI = [2537, 3688877]).

2. In states from which absorption is impossible, it is wasteful to resample the whole chain just because the state at time y_i is unsuitable for truncation. [Diagram: a simulated path whose state at y_i cannot absorb, making truncation invalid.] For example:

State   Meaning               P(state)
1       both PS working       0.9986
2       1 failed, 2 working   0.0007
3       1 working, 2 failed   0.0007

⟹ E(MH iter) = 429, 95% CI = [36, 5267]

3. The time for the MH algorithm to reach stationarity can grow rapidly. [Figure: total variation distance from stationarity over 500 iterations.]

Solution: Exact Conditional Sampling

Replace the three-layer scheme (CTMC sampling from p(path | π, S); rejection sampling for p(path | π, S, Y ≥ y); Metropolis-Hastings for p(path | π, S, Y = y)) with exact sampling from the fully conditioned process. Unconditioned CTMC simulation proceeds:

i) Starting state ~ discrete: Y{0} ~ π
ii) Advance time ~ Exponential: δ ~ Exp(-S_{Y{t},Y{t}})
iii) Select next state ~ discrete: Y{t + δ} ~ { S_{Y{t},j} / -S_{Y{t},Y{t}} }
iv) If not absorbed, set t = t + δ and loop to ii)

Each of these distributions changes under the conditioning; e.g. the starting state mass function becomes

P(Y{0} = i | π, S) = π_i
P(Y{0} = i | π, S, Y = y) = π_i e_i^T exp{Sy} s / ( π^T exp{Sy} s )
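The modified step i) is cheap to compute. A minimal sketch of the conditioned starting-state mass function (toy-example rates assumed):

```python
import numpy as np
from scipy.linalg import expm

# Toy example rates (assumed): lam_f = 1.8, lam_r = 9.5.
lam_f, lam_r = 1.8, 9.5
S = np.array([[-2 * lam_f,            lam_f,            lam_f],
              [      lam_r, -(lam_r + lam_f),              0.0],
              [      lam_r,              0.0, -(lam_r + lam_f)]])
s = -S @ np.ones(3)
pi = np.array([1.0, 0.0, 0.0])

def conditioned_start_pmf(pi, S, s, y):
    """P(Y{0} = i | pi, S, Y = y) = pi_i (e_i^T exp(Sy) s) / (pi^T exp(Sy) s)."""
    w = expm(S * y) @ s        # i-th entry is e_i^T exp(Sy) s
    p = pi * w
    return p / p.sum()
```

With π = (1, 0, 0) the conditioning leaves the starting state degenerate at state 1; with a diffuse π it reweights towards states from which absorption at exactly y is likely.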

Tail Depth Performance Improvement

[Figure: log time (secs) against upper tail probability 10^-x for x = 1, ..., 6; ECS remains roughly constant while MH grows by orders of magnitude with tail depth.]

Overall Performance Improvement

This shows the new method keeping pace in nice problems and significantly outperforming otherwise.

[Table: per-draw timings for two test intensity matrices. With none of problems i-iii present, MH and ECS are comparable (microseconds per draw). With all of problems i-iii present, MH takes 0.2 hours and 9.4 hours where ECS takes 0.06 secs and 0.05 secs respectively: roughly 2,300,000× faster on average in the hard problem.]

Networks/Systems of Components Quick overview now of current research focus: inference for networks/systems comprising nodes/components which may be of Phase-type.

Networks/Systems of Components

[Diagram: a network whose component lifetime distributions are unknown; only the overall failure time T = t is observed.]

Again, a reduced information setting: only the overall network failure time, so-called masked system lifetime data.

Direct Masked System Lifetime Inference

Even a very simple setting is quite hard to tackle directly. Take X1 and X2 in parallel, in series with X3, with X_i iid Weibull(scale = α, shape = β), and system lifetimes t = {t_1, ..., t_n}:

F_T(t) = 1 - (1 - F_X1(t) F_X2(t))(1 - F_X3(t))

⟹ L(α, β; t) = ∏_{i=1}^{n} β t_i^{-1} (t_i/α)^β [ 4 exp{-2(t_i/α)^β} - 3 exp{-3(t_i/α)^β} ]

so p(α, β | t) ∝ L(α, β; t) p(α, β) is awkward.
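The system CDF above can be sanity-checked by simulation, since this structure fails at T = min(X3, max(X1, X2)). A sketch with illustrative (assumed) Weibull parameters α = 2, β = 1.5:

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, beta = 2.0, 1.5   # illustrative (assumed) Weibull scale and shape

def F_T(t):
    """System lifetime CDF for X1, X2 in parallel, in series with X3."""
    F = 1.0 - np.exp(-(t / alpha) ** beta)   # common component CDF
    return 1.0 - (1.0 - F * F) * (1.0 - F)

# Monte Carlo check: the system fails at T = min(X3, max(X1, X2)).
x = alpha * rng.weibull(beta, size=(100_000, 3))
T = np.minimum(x[:, 2], np.maximum(x[:, 0], x[:, 1]))
emp = (T <= 1.5).mean()
```

The empirical CDF of the simulated system lifetimes matches F_T closely, confirming the structure function.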

MCMC Solution (Independent Case)

Proposed solution in the tradition of Tanner & Wong (1987), since inference is easy in the presence of (augmented) component lifetimes. Thus, for X_i iid F_X(·; ψ), sample from the natural completion of the posterior distribution:

p(ψ, x_1, ..., x_n | t)

by blocked Gibbs sampling using the conditional distributions:

p(x_1, ..., x_n | ψ, t)
p(ψ | x_1, ..., x_n, t)

where x_i = {x_{i1}, ..., x_{im}} are the m component failure times for the i-th of n systems (x_{ij} = t_i for some j).
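A minimal sketch of this blocked Gibbs data-augmentation scheme, for the simplest possible case: a two-component series system with iid Exponential(ψ) components and a conjugate Gamma prior. All numerical values here are assumed for illustration, not from the talk:

```python
import numpy as np

rng = np.random.default_rng(3)

# Two-component series system, iid Exponential(psi) components,
# conjugate Gamma(a0, b0) prior on psi. All values assumed.
a0, b0 = 1.0, 1.0
true_psi, n = 0.7, 200
t = np.minimum(rng.exponential(1 / true_psi, n),
               rng.exponential(1 / true_psi, n))   # observed system lifetimes

psi, draws = 1.0, []
for _ in range(2000):
    # p(x | psi, t): in a series system the first failure is at t_i; the
    # surviving component's lifetime is t_i + Exponential(psi) by memorylessness.
    x_other = t + rng.exponential(1 / psi, n)
    # p(psi | x, t): conjugate Gamma update given all 2n component lifetimes.
    psi = rng.gamma(a0 + 2 * n, 1.0 / (b0 + t.sum() + x_other.sum()))
    draws.append(psi)
post = float(np.mean(draws[500:]))
```

The posterior mean of ψ settles near the true rate; for Phase-type components, the Exponential updates here would be replaced by the earlier MCMC algorithm.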

p(ψ | x_1, ..., x_n, t) is now simple: Bayesian inference for lifetime distributions is well understood and, in the Phase-type case, the algorithm above slots in here. The problem has shifted to sampling p(x_1, ..., x_n | ψ, t).

Propose using Samaniego's system signature, s_j = P(T = X_{j:m}), e.g.

p(x_{i1}, ..., x_{im} | ψ, t) = ∑_{j=1}^{m} p(x_{i1}, ..., x_{im} | ψ, X_{j:m} = t_i) P(T = X_{j:m} | ψ, t_i)

where

P(T = X_{j:m} | ψ, t_i) ∝ s_j · j · (m choose j) · F_X(t_i)^{j-1} · (1 - F_X(t_i))^{m-j}

Algorithm to sample p(x_1, ..., x_n | ψ, t)

For each system i = 1, ..., n:

1. Sample j ∈ {1, ..., m} from the discrete probability distribution defined by the conditioned system signature, P(T = X_{j:m} | ψ, t_i). This samples the order statistic indicating that the j-th component failure caused system failure.
2. Sample:
   - j - 1 values, x_{i1}, ..., x_{i(j-1)}, from F_{X | X < t_i}(·; ψ), the distribution of the component lifetime conditional on failure before t_i;
   - m - j values, x_{i(j+1)}, ..., x_{im}, from F_{X | X > t_i}(·; ψ), the distribution of the component lifetime conditional on failure after t_i;
   and set x_{ij} = t_i.

Each iteration provides x_i.
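A sketch of steps 1-2 for iid Exponential(λ) components, sampling below t_i via the inverse CDF of the truncated exponential and above t_i via memorylessness. The signature (1/3, 2/3, 0) used in the demo call is that of the earlier parallel(X1, X2)-in-series-with-X3 system; all other values are assumed for illustration:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(0)

def sample_component_lifetimes(t_i, m, sig, lam, rng):
    """One draw of the m component lifetimes of a system that failed at t_i,
    assuming iid Exponential(lam) components and system signature sig."""
    F = 1.0 - np.exp(-lam * t_i)                     # F_X(t_i)
    # Step 1: conditioned signature P(T = X_{j:m} | psi, t_i), up to a constant.
    w = np.array([sig[j - 1] * j * comb(m, j) * F ** (j - 1) * (1 - F) ** (m - j)
                  for j in range(1, m + 1)])
    j = rng.choice(np.arange(1, m + 1), p=w / w.sum())
    # Step 2: j-1 lifetimes below t_i by inverting the truncated exponential
    # CDF; m-j lifetimes above t_i by memorylessness; and x_{ij} = t_i.
    below = -np.log(1.0 - rng.uniform(size=j - 1) * F) / lam
    above = t_i + rng.exponential(1.0 / lam, size=m - j)
    x = np.concatenate([below, [t_i], above])
    rng.shuffle(x)      # component labels are exchangeable here
    return x

# Demo: signature (1/3, 2/3, 0) of the parallel(X1, X2)-series-X3 example.
x = sample_component_lifetimes(2.0, 3, [1/3, 2/3, 0.0], 0.4, rng)
```

One call yields one augmented vector x_i; repeating over i = 1, ..., n gives the p(x_1, ..., x_n | ψ, t) update of the Gibbs sweep.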

All Systems of 4 Components, Exponential (λ = 0.4)

[Figure: posterior network rate estimates, roughly 0.25 to 0.60, for each of 20 networks of 4 components with true rate λ = 0.4.]

Exchangeable Failure Rate Parameters

It is straightforward to allow the more general setting of exchangeable failure rate parameters between networks:

[Graphical model: ψ_i → X_ij → T_i, with plates over components j = 1, ..., m and networks i = 1, ..., n.]

However, exchangeability within a network would break the signature-based sampling of node failure times, so this is probably as general as this particular approach can go.

All Systems of 4 Components, Ψ ~ Gam(α = 9, β = 2)

[Figure: posterior population mean of the rate distribution, roughly 4.0 to 5.5, for each of 20 networks.]

All Systems of 4 Components, Ψ ~ Gam(α = 9, β = 2)

[Figure: posterior population variance of the rate distribution, roughly 1 to 7, for each of 20 networks.]

Extend to Topological Inference

It is now possible to add topology detection to the inferential process. [Diagram: a network of unknown topology with observed failure time T = t.] It is simple to sample from p(m | ψ, t) as an additional Gibbs update, moving between network topologies. With care, reversible jump between component numbers is also feasible.

True Topology with Signature No. 2 (40 observations)

[Figure: posterior probability, up to about 0.25, of each candidate network topology.]

Asmussen, S., Nerman, O. & Olsson, M. (1996), Fitting phase-type distributions via the EM algorithm, Scand. J. Statist. 23(4), 419-441.

Bladt, M., Gonzalez, A. & Lauritzen, S. L. (2003), The estimation of phase-type related functionals using Markov chain Monte Carlo methods, Scand. Actuar. J. 2003(4), 280-300.

Cano, J. & Rios, D. (2006), Reliability forecasting in complex hardware/software systems, Proceedings of the First International Conference on Availability, Reliability and Security (ARES 06).

Daneshkhah, A. & Bedford, T. (2008), Sensitivity analysis of a reliability system using Gaussian processes, in T. Bedford, ed., Advances in Mathematical Modeling for Reliability, IOS Press, pp. 46-62.

Tanner, M. A. & Wong, W. H. (1987), The calculation of posterior distributions by data augmentation, Journal of the American Statistical Association 82(398), 528-540.