Bayesian Functional Data Analysis for Computer Model Validation
Fei Liu, Institute of Statistics and Decision Sciences, Duke University
Transition Workshop on Development, Assessment and Utilization of Complex Computer Models, SAMSI, May 14th, 2007
Joint work with Bayarri, Berger, Paulo, Sacks, ...
Outline
- Computer model with functional output
- The methodology
  - Basis representation
  - Bayesian analysis
- The case study example
  - The road load data
  - The analysis
  - Results with Gaussian bias
  - Results with Cauchy bias
- Dynamic Linear models
- Summary and on-going work
Computer model validation
Question: does the computer model adequately represent reality?
A solution
SAVE (Bayarri et al., 2005) treats time t as another input.
- Prior: $y^M(\cdot,\cdot) \sim \mathrm{GP}\left(\Psi(\cdot,\cdot)\,\theta_L,\ \lambda_M^{-1}\, c_M((\cdot,\cdot),(\cdot,\cdot))\right)$.
- Fast surrogate: $y^M(z,t) \mid \text{Data} \sim N\left(m(z,t),\ V(z,t)\right)$.
Limitations
Treating t as another input
- is only applicable to low-dimensional problems;
- requires a smoothness assumption.
New methodology is needed to deal with:
- complex computer models;
- high-dimensional, irregular functional output;
- uncertain (field) inputs.
Notation
$y^F_r(v_i, \delta_{ij}; t) = y^M(v_i, \delta_{ij}, u^*; t) + y^B_{u^*}(v_i, \delta_{ij}; t) + e_{ijr}(t)$
- Inputs to the code: $v_i$ (configuration), $\delta_{ij}$ (unknown inputs), $u^*$ (calibration parameters).
- $y^M$: simulator. $y^B_{u^*}$: bias function. $y^R$: reality. $y^F_r$: field runs. $e_{ijr}(t)$: measurement errors.
Basis expansion
Represent the functions as:
$y^M(z_j; t) = \sum_{k=1}^{W} w^M_k(z_j)\,\psi_k(t)$, $j = 1, \ldots, n_M$;
$y^B_{u^*}(v_i, \delta_{ij}; t) = \sum_{k=1}^{W} w^B_k(v_i, \delta_{ij})\,\psi_k(t)$, $i = 1, \ldots, m$, $j = 1, \ldots, n^f_i$;
$y^F_r(v_i, \delta_{ij}; t) = \sum_{k=1}^{W} w^F_{kr}(v_i, \delta_{ij})\,\psi_k(t)$, $r = 1, \ldots, r_{ij}$.
Thresholding
To reduce computational expense, truncate each expansion to the $W_0$ largest coefficients:
$y^M(z_j; t) \approx \sum_{k=1}^{W_0} w^M_k(z_j)\,\psi_k(t)$, $j = 1, \ldots, n_M$;
$y^B_{u^*}(v_i, \delta_{ij}; t) \approx \sum_{k=1}^{W_0} w^B_k(v_i, \delta_{ij})\,\psi_k(t)$, $i = 1, \ldots, m$, $j = 1, \ldots, n^f_i$;
$y^F_r(v_i, \delta_{ij}; t) \approx \sum_{k=1}^{W_0} w^F_{kr}(v_i, \delta_{ij})\,\psi_k(t)$, $r = 1, \ldots, r_{ij}$.
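The basis representation with hard thresholding can be sketched in a few lines. This is a minimal illustration, not the analysis from the talk: the hand-rolled Haar transform, the test signal, and the choice $W_0 = 32$ are all assumptions for demonstration.

```python
import numpy as np

def haar_dwt(x):
    """One full Haar wavelet decomposition of a length-2^J signal.
    Returns the coefficient vector (coarsest scale first)."""
    coeffs = []
    a = np.asarray(x, float)
    while a.size > 1:
        s = (a[0::2] + a[1::2]) / np.sqrt(2.0)   # approximations
        d = (a[0::2] - a[1::2]) / np.sqrt(2.0)   # details
        coeffs.append(d)
        a = s
    coeffs.append(a)
    return np.concatenate(coeffs[::-1])

def hard_threshold(w, keep):
    """Keep the `keep` largest-magnitude coefficients, zero the rest."""
    out = np.zeros_like(w)
    idx = np.argsort(np.abs(w))[-keep:]
    out[idx] = w[idx]
    return out

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 256)
y = np.sin(6 * np.pi * t) + 0.05 * rng.standard_normal(256)
w = haar_dwt(y)                    # full coefficient vector (W = 256)
w0 = hard_threshold(w, keep=32)    # truncated vector (W_0 = 32 nonzero)
print(np.count_nonzero(w0))        # 32
```

Because the Haar transform is orthonormal, thresholding discards only the energy carried by the dropped coefficients, which is what makes the truncation to $W_0$ terms cheap and controllable.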
Emulators
For each $k = 1, \ldots, W_0$,
$w^M_k(\cdot) \sim \mathrm{GP}\left(\mu^M_k,\ \sigma^{2M}_k \mathrm{Corr}_k(\cdot, \cdot)\right)$.
$\mathrm{Corr}_k(\cdot, \cdot)$: power-exponential family,
$\mathrm{Corr}_k(z, z') = \exp\left(-\sum_{d=1}^{p} \beta^{(M_k)}_d |z_d - z'_d|^{\alpha^{(M_k)}_d}\right)$.
Modular approach:
$w^M_k(z) \mid \text{Data}, \hat\mu^M_k, \hat\sigma^{2M}_k, \hat\alpha^M_k, \hat\beta^M_k \sim N\left(m^M_k(z),\ V^M_k(z)\right)$.
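A plug-in GP emulator of this form can be sketched as below. This is a minimal sketch under assumptions: the helper names, the fixed correlation parameters, and the small nugget are illustrative, whereas in the modular approach they would be replaced by the estimates $\hat\mu^M_k, \hat\sigma^{2M}_k, \hat\alpha^M_k, \hat\beta^M_k$.

```python
import numpy as np

def powexp_corr(Z1, Z2, beta, alpha):
    """Power-exponential correlation: exp(-sum_d beta_d |z_d - z'_d|^alpha_d)."""
    D = np.abs(Z1[:, None, :] - Z2[None, :, :])
    return np.exp(-np.sum(beta * D ** alpha, axis=2))

def gp_predict(Ztrain, w, Znew, mu, sigma2, beta, alpha, nugget=1e-8):
    """Predictive mean and variance for one coefficient surface,
    with plug-in (fixed) hyperparameters."""
    R = powexp_corr(Ztrain, Ztrain, beta, alpha) + nugget * np.eye(len(Ztrain))
    r = powexp_corr(Znew, Ztrain, beta, alpha)
    mean = mu + r @ np.linalg.solve(R, w - mu)
    var = sigma2 * (1.0 - np.sum(r * np.linalg.solve(R, r.T).T, axis=1))
    return mean, np.maximum(var, 0.0)

# demo: emulate a smooth 1-d coefficient surface from 10 "model runs"
ztr = np.linspace(0.0, 1.0, 10)[:, None]
wtr = np.sin(2 * np.pi * ztr[:, 0])
mean, var = gp_predict(ztr, wtr, ztr, mu=0.0, sigma2=1.0,
                       beta=np.array([25.0]), alpha=np.array([2.0]))
# at the design points the emulator interpolates the runs (variance ~ 0)
```

The key property used downstream is that the surrogate is exact (up to the nugget) at the design points and quantifies its own uncertainty away from them.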
Bayesian analysis
For each $k = 1, \ldots, W_0$,
$w^F_{kr}(v_i, \delta_{ij}) = w^R_k(v_i, \delta_{ij}) + \epsilon_{ijkr}$, $\epsilon_{ijkr} \sim N(0, \sigma^{2F}_k)$,
$w^R_k(v_i, \delta_{ij}) = w^M_k(v_i, \delta_{ij}, u^*) + w^B_k(v_i, \delta_{ij})$,
$w^B_k(v_i, \delta_{ij}) = b_k(v_i) + \epsilon^b_{ijk}$.
Data:
- $\bar w^F_{ijk} = \frac{1}{r_{ij}} \sum_{r=1}^{r_{ij}} w^F_{kr}(v_i, \delta_{ij})$;
- $S^2_{ijk} = \sum_{r=1}^{r_{ij}} \left(w^F_{kr}(v_i, \delta_{ij}) - \bar w^F_{ijk}\right)^2$;
- $m_k(\cdot)$, $V_k(\cdot)$.
The bias:
$b_k(v_i) \sim \mathrm{GP}\left(\mu^B_k,\ \sigma^{2B}_k \mathrm{Corr}^B_k(\cdot, \cdot)\right)$, $\epsilon^b_{ijk} \sim N(0, \sigma^{2B_\epsilon}_k)$,
with correlation
$\mathrm{Corr}^B_k(v, v') = \exp\left(-\sum_{d=1}^{p_v} \beta^{(B_k)}_d |v_d - v'_d|^{\alpha^{(B_k)}_d}\right)$.
Unknowns:
- inputs: $u^*$ and $\delta_{ij}$;
- bias: $\{b_k(\cdot)\}$, $w^B_k(v_i, \delta_{ij})$;
- model: $w^M_k(v_i, \delta_{ij}, u^*)$;
- reality: $\{w^R_k(v_i, \delta_{ij})\}$;
- hyper-parameters: $\sigma^{2B}_k$, $\sigma^{2B_\epsilon}_k$, $\sigma^{2F}_k$, $\mu^B_k$.
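The conditional updates in this hierarchy are conjugate normal computations. As a sketch (not the talk's actual sampler), consider one coefficient with the emulator marginalized into the likelihood, a zero-mean normal prior on the bias, and $r$ field replicates; the function name and the simplifying step of folding $V_M$ into the noise are assumptions for illustration.

```python
import numpy as np

def bias_posterior(wbar_F, r, sigma2_F, m_M, V_M, tau2):
    """Posterior for one bias coefficient w^B under
       wbar^F | w^B ~ N(m_M + w^B, V_M + sigma2_F / r),   w^B ~ N(0, tau2),
    i.e. a precision-weighted combination of prior and data."""
    s2 = V_M + sigma2_F / r            # total likelihood variance
    post_var = 1.0 / (1.0 / tau2 + 1.0 / s2)
    post_mean = post_var * (wbar_F - m_M) / s2
    return post_mean, post_var

# with a diffuse prior the bias estimate is just the field-model discrepancy
m, v = bias_posterior(wbar_F=2.0, r=7, sigma2_F=0.7, m_M=1.5, V_M=0.1, tau2=1e6)
```

The same precision-weighting pattern appears in every Gibbs step: a tight bias prior shrinks the discrepancy toward zero, while a diffuse one lets the field data speak.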
The road load data (1)
Time history data at $T = 90843$ time points. Inputs include:
- one configuration $v = x_{\text{nom}}$;
- seven characteristics $x = (x_1, x_2, \ldots, x_7)$, $x = x_{\text{nom}} + \delta$;
- two calibration parameters $u = (u_1, u_2)$.
The road load data (2)
- 64 design points in 9-d space for computer model runs: $\{y^M(z, t),\ z \in D^M,\ t = 1, \ldots, T\}$, $z = (x, u)$.
- 7 field replicates: $\{y^F_r(x^*, t),\ r = 1, \ldots, 7\}$, $x^* = x_{\text{nom}} + \delta$.
The road load data (3)
[Figure: time history of the load output y.]
Wavelet representation
Represent the time history data by wavelets:
$y^M(z_j; t) = \sum_{k=1}^{W} w^M_k(z_j)\,\psi_k(t)$, $j = 1, \ldots, 64$;
$y^F_r(x^*; t) = \sum_{k=1}^{W} w^F_{kr}(x^*)\,\psi_k(t)$, $r = 1, \ldots, 7$.
Apply hard thresholding to reduce the computational expense: keep $W_0 = 289$ nonzero coefficients.
Emulators
For each wavelet coefficient,
$w^M_k(\cdot) \sim \mathrm{GP}\left(\mu_k,\ \frac{1}{\lambda^M_k} \mathrm{Corr}_k(\cdot, \cdot)\right)$.
$\mathrm{Corr}_k(\cdot, \cdot)$: power-exponential family,
$\mathrm{Corr}_k(z, z') = \exp\left(-\sum_{d=1}^{p} \beta^{(k)}_d |z_d - z'_d|^{\alpha^{(k)}_d}\right)$.
Response surface:
$w^M_k(z) \mid \text{Data}, \hat\mu_k, \hat\lambda^M_k, \hat\alpha_k, \hat\beta_k \sim N\left(m_k(z),\ V_k(z)\right)$.
Bayesian analysis
For each wavelet coefficient,
$w^R_i(x^*) = w^M_i(x^*, u^*) + b_i(x^*)$, $i = 1, \ldots, W$,
$w^F_{ir}(x^*) = w^R_i(x^*) + \epsilon_{ir}$, $r = 1, \ldots, f$.
The bias:
$b_i(x^*) \sim N\left(0, \tau^2_j\right)$ or $b_i(x^*) \sim C\left(0, \tau^2_j\right)$, where $j$ is the wavelet resolution level.
The error: $\epsilon_{ir} \sim N\left(0, \sigma^2_i\right)$.
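The practical difference between the two bias priors is tail weight, and a standard way to handle the Cauchy case in a sampler is its normal scale-mixture representation, $b \mid \lambda \sim N(0, \lambda)$ with $\lambda \sim \text{Inv-Gamma}(1/2, \tau^2/2)$ (equivalently $b = \tau z / \sqrt{g}$, $g \sim \chi^2_1$). The sketch below compares draws from the two priors; the scale $\tau$ and sample size are illustrative, and this is a demonstration device, not the talk's sampler.

```python
import numpy as np

rng = np.random.default_rng(1)
tau = 0.5
n = 100_000

# Gaussian bias coefficients: b_i ~ N(0, tau^2)
b_gauss = rng.normal(0.0, tau, n)

# Cauchy bias coefficients via the scale-mixture representation:
# b_i = tau * z / sqrt(g), z ~ N(0,1), g ~ chi^2_1
g = rng.chisquare(1, n)
b_cauchy = tau * rng.normal(0.0, 1.0, n) / np.sqrt(g)

# the Cauchy prior keeps far heavier tails, letting occasional large
# wavelet-coefficient biases escape the shrinkage a Gaussian prior imposes
print(np.quantile(np.abs(b_gauss), 0.99), np.quantile(np.abs(b_cauchy), 0.99))
```

Conditioning on the mixing variance restores conjugacy, so the Cauchy prior costs only one extra Gibbs step per coefficient.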
Prior distributions
The Input/Uncertainty Map:

Parameter           Type           Variability   Uncertainty
Damping 1           Calibration    Uncertain     15%
Damping 2           Calibration    Uncertain     15%
Stiffness 1         Manufacturing  Uncertain     10%
Stiffness 2         Manufacturing  Uncertain     10%
Front-rebound 1     Manufacturing  Uncertain     7%
Front-rebound 2     Manufacturing  Uncertain     8%
Sprung-Mass         Manufacturing  Uncertain     5%
Unsprung-Mass       Manufacturing  Uncertain     12%
Body-Pitch-Inertia  Manufacturing  Uncertain     8%

- Calibration: uniform over the specified range.
- Manufacturing: truncated normals over the ranges.
- $\pi(\sigma^2_i) \propto 1/\sigma^2_i$, $\pi(\tau^2_j \mid \{\sigma^2_i\}) \propto \left(\tau^2_j + \frac{1}{7}\sigma^2_j\right)^{-1}$.
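Sampling a truncated-normal manufacturing prior is simple by rejection; a minimal sketch, where the nominal value, standard deviation, and range are illustrative assumptions rather than the study's actual settings:

```python
import numpy as np

def truncated_normal(rng, mean, sd, low, high, size):
    """Draw from N(mean, sd^2) restricted to [low, high] by rejection.
    Fine when the bulk of the normal mass lies inside the range."""
    out = np.empty(size)
    filled = 0
    while filled < size:
        draw = rng.normal(mean, sd, size)
        draw = draw[(draw >= low) & (draw <= high)]
        take = min(draw.size, size - filled)
        out[filled:filled + take] = draw[:take]
        filled += take
    return out

rng = np.random.default_rng(2)
# e.g. a manufacturing input with nominal 1.0 and +/-10% range,
# sd chosen (illustratively) so the range spans about two sd each side
x = truncated_normal(rng, mean=1.0, sd=0.05, low=0.9, high=1.1, size=5000)
```

For ranges far out in the tail, an inverse-CDF or specialized sampler would be preferable to plain rejection.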
Bias function (under Gaussian bias)
$b^h(t) = \sum_{i=1}^{W} b^h_i \psi_i(t)$, $h = 1, \ldots, N$
[Figure: estimated bias function, Load vs. Time.]
Reality (under Gaussian bias)
$y^{R_h}(u^h, x^h, t) = \sum_{i=1}^{W} \left(w^{M_h}_i(u^h, x^h) + b^h_i\right)\psi_i(t)$, $h = 1, \ldots, N$
[Figure: estimated reality, Load vs. Time.]
Bias function (under Cauchy bias)
[Figure: bias function with 90% MCMC tolerance bounds, Load vs. Time.]
Reality (under Cauchy bias)
[Figure: estimated reality, Load vs. Time.]
Dynamic Linear models
Model the computer model run at $z^*$ by a multivariate TVAR (West and Harrison, 1997):
$y^M(z^*, t) = \sum_{j=1}^{p} \phi_{t,j}\, y^M(z^*, t - j) + \epsilon^M_t(z^*)$, $\epsilon^M_t(\cdot) \sim \mathrm{GP}\left(0, \sigma^2_t \mathrm{Corr}(\cdot, \cdot)\right)$.
- Interpolator with forecasting capability.
- Computation is done by the forward filtering, backward sampling (FFBS) algorithm.
- Predict at the untried input $z^* = 0.5$.
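The forward-filtering half of FFBS for a TVAR(p) can be sketched as a Kalman recursion on the coefficient vector $\phi_t$, treated as a random walk. This is a minimal sketch under assumptions: the observation variance v, evolution variance w, and the simulated AR(1) series are illustrative; the backward sampling pass that completes FFBS is omitted.

```python
import numpy as np

def tvar_forward_filter(y, p, v, w):
    """Forward (Kalman) filter for a TVAR(p) model:
       y_t   = sum_j phi_{t,j} y_{t-j} + eps_t,  eps_t   ~ N(0, v),
       phi_t = phi_{t-1} + omega_t,              omega_t ~ N(0, w I).
       Returns filtered means m_t and covariances C_t of phi_t."""
    n = len(y)
    m = np.zeros((n, p))
    C = np.zeros((n, p, p))
    mt, Ct = np.zeros(p), np.eye(p)
    for t in range(p, n):
        F = y[t - p:t][::-1]            # regressor (y_{t-1}, ..., y_{t-p})
        Rt = Ct + w * np.eye(p)         # evolution (random-walk) step
        f = F @ mt                      # one-step-ahead forecast
        q = F @ Rt @ F + v              # forecast variance
        A = Rt @ F / q                  # adaptive gain
        mt = mt + A * (y[t] - f)        # posterior mean update
        Ct = Rt - np.outer(A, A) * q    # posterior covariance update
        m[t], C[t] = mt, Ct
    return m, C

# demo: a constant-coefficient AR(1) with phi = 0.8
rng = np.random.default_rng(3)
n = 2000
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.8 * y[t - 1] + rng.standard_normal()
m, C = tvar_forward_filter(y, p=1, v=1.0, w=1e-6)
# the filtered coefficient settles near the true phi = 0.8
```

Backward sampling would then draw $\phi_n \sim N(m_n, C_n)$ and recurse back through time, giving joint posterior samples of the whole coefficient path.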
Result
Summary
- We developed a Bayesian functional data analysis approach to computer model validation.
- The approach automatically takes model discrepancy and model calibration into account.
- Unknown (field) inputs can be incorporated.
- Bias functions are modeled using hierarchical structures when needed.
On-going work: Time-dependent parameters
Time-dependent parameters (Reichert, 2006):
$y^F_t = y^M_t(z^F + \phi^z_t) + \phi^F_t + \epsilon^F_t$, $\phi^z_t \sim$ stochastic process,
$y^M_t(z^F + \phi^z_t) \approx y^M_t(z^F) + \nabla y^M_t(z^F)\,\phi^z_t$.
Build emulators for $y^M_t(z^F)$ and $\nabla y^M_t(z^F)$.
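The linearization step above can be sketched numerically: expand a simulator to first order around the nominal input $z^F$, with the gradient obtained by central differences. The function name, step size, and toy simulator are assumptions for illustration; in practice the value and gradient would each be emulated rather than differenced.

```python
import numpy as np

def linearized_emulator(f, zF, h=1e-5):
    """First-order Taylor surrogate of a simulator around nominal input zF:
       f(zF + delta) ~= f(zF) + J(zF) @ delta,
    with the Jacobian J estimated by central differences."""
    z = np.asarray(zF, float)
    f0 = np.asarray(f(z), float)
    J = np.zeros((f0.size, z.size))
    for d in range(z.size):
        e = np.zeros_like(z)
        e[d] = h
        J[:, d] = (np.asarray(f(z + e), float) - np.asarray(f(z - e), float)) / (2 * h)
    return lambda delta: f0 + J @ np.asarray(delta, float)

# toy simulator with two outputs, linearized around zF = (1, 2)
f = lambda z: np.array([z[0] ** 2, z[0] * z[1]])
g = linearized_emulator(f, [1.0, 2.0])
approx = g([0.01, -0.02])   # cheap surrogate for a small perturbation
```

The surrogate is accurate only for small $\phi^z_t$, which is exactly the regime the stochastic-process prior on the perturbation is meant to enforce.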
On-going work: Spatio-Temporal Outputs
Figure: Regional Air Quality forecast (RAQAST) over the U.S. (provided by Serge et al.)
The model:
$y^F(s, \delta; t) = y^M(s, \delta, u^*; t) + b(s, \delta; t) + \epsilon^F_t$.
- $s$: location (configuration).
- $u$: calibration parameters.
- $\delta$: meteorology.
- $t$: time.
Spatial interpolation:
$y^M(s, \delta, u; t) = \sum_{k \in K} a_k(s, \delta, u)\,\psi_k(t)$.
Forecasting:
$b(s, \delta; t) = \sum_{j=1}^{p} \phi_{t,j}\, b(s, \delta; t - j) + \epsilon^b_t(s, \delta)$.
Thank you!