
Bayesian Techniques for Parameter Estimation
"He has Van Gogh's ear for music." (Billy Wilder)

Statistical Inference
Goal: The goal of statistical inference is to draw conclusions about a phenomenon based on observed data.
Frequentist: Observations made in the past are analyzed with a specified model, and the result is regarded as confidence about the state of the real world. Probabilities are defined as the frequencies with which an event occurs if the experiment is repeated many times.
Parameter Estimation:
o Relies on estimators derived from different data sets and a specific sampling distribution.
o Parameters may be unknown but are fixed.
Bayesian: The interpretation of probability is subjective and can be updated with new data.
Parameter Estimation: Parameters are described by a probability density.

Bayesian Inference Framework:
Prior Distribution: Quantifies prior knowledge of the parameter values.
Likelihood: Probability of observing the data given a certain set of parameter values.
Posterior Distribution: Conditional probability distribution of the unknown parameters given the observed data.
Joint PDF: Quantifies all combinations of the parameters and observations.
Bayes Relation: Specifies the posterior in terms of the likelihood, prior, and a normalization constant.
Problem: Evaluation of the normalization constant typically requires high-dimensional integration.
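The relation itself is not reproduced in this transcription; written in common notation (parameters $q$, data $D$, prior $\pi_0$, symbols chosen here rather than taken from the slide), it is
\[
\pi(q \mid D) \;=\; \frac{\pi(D \mid q)\,\pi_0(q)}{\int_{\mathbb{R}^p} \pi(D \mid q)\,\pi_0(q)\,dq},
\]
where the denominator is the normalization constant; evaluating this $p$-dimensional integral is what motivates the sampling techniques developed below.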

Bayesian Inference
Uninformative Prior: No a priori information about the parameters.
Informative Prior: Use conjugate priors, so that the prior and posterior come from the same family of distributions.
Evaluation Strategies:
Analytic integration --- rare.
Classical quadrature; e.g., p = 2.
Monte Carlo quadrature techniques.
Markov chains.

Example: Bayesian Inference

Bayesian Inference Example: Note the posteriors for 1 head, 0 tails; 5 heads, 9 tails; and 49 heads, 51 tails.

Bayesian Inference Example: Now consider 5 heads, 5 tails and 50 heads, 50 tails. Note: A poorly chosen informative prior incorrectly influences the results for a long time.
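The plots and the prior used on these slides are not reproduced in the transcription. The conjugate Beta-Bernoulli update that underlies a coin-flip example of this kind is standard; a minimal sketch, with Beta prior parameters that are illustrative assumptions rather than the slides' values:

```python
import numpy as np
from scipy import stats

def coin_posterior(heads, tails, a=1.0, b=1.0):
    """Conjugate update: Beta(a, b) prior -> Beta(a + heads, b + tails) posterior."""
    return stats.beta(a + heads, b + tails)

# Data sets matching the slides, with a flat Beta(1, 1) prior
for heads, tails in [(1, 0), (5, 9), (49, 51)]:
    post = coin_posterior(heads, tails)
    print(f"{heads}H/{tails}T: posterior mean = {post.mean():.3f}")

# A deliberately poor informative prior (hypothetical choice, strongly favoring heads):
# its pull on the posterior fades only as the data accumulate.
for heads, tails in [(5, 5), (50, 50)]:
    post = coin_posterior(heads, tails, a=20.0, b=2.0)
    print(f"{heads}H/{tails}T: posterior mean = {post.mean():.3f}")
```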

Parameter Estimation Problem Likelihood: Assumption: Note:
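The equations on this slide are not captured in the transcription; a formulation consistent with the later slides (iid Gaussian measurement errors and a sum-of-squares misfit, with symbols chosen here rather than taken from the slide) is
\[
y_i = f(t_i; q) + \varepsilon_i, \qquad \varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2), \quad i = 1, \dots, n,
\]
\[
\pi(D \mid q) \;=\; \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\!\left(-\frac{SS_q}{2\sigma^2}\right),
\qquad SS_q \;=\; \sum_{i=1}^{n} \bigl[y_i - f(t_i; q)\bigr]^2 .
\]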

Parameter Estimation: Example. Consider the spring model.

Parameter Estimation: Example. Ordinary Least Squares: Here. Sampling Distribution:
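The model and the estimator formulas are not reproduced in the transcription. As an illustration only, the following sketch fits a hypothetical damped-spring response by ordinary least squares; the functional form, initial conditions, and parameter values are assumptions, not the slides':

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical spring (damped oscillator) response; the slides' actual model
# is not reproduced in this transcription.
def spring_model(t, c, k, z0=2.0, m=1.0):
    """Displacement of m z'' + c z' + k z = 0 with z(0) = z0, z'(0) = 0 (underdamped case)."""
    omega = np.sqrt(k / m - (c / (2 * m)) ** 2)
    return z0 * np.exp(-c * t / (2 * m)) * (np.cos(omega * t) + c / (2 * m * omega) * np.sin(omega * t))

# Synthetic data with iid Gaussian noise
rng = np.random.default_rng(0)
t = np.linspace(0, 5, 100)
c_true, k_true, sigma = 0.5, 20.0, 0.1
data = spring_model(t, c_true, k_true) + sigma * rng.normal(size=t.size)

# Ordinary least squares: minimize the sum of squared residuals SS_q over q = (c, k)
res = least_squares(lambda q: spring_model(t, q[0], q[1]) - data, x0=[1.0, 15.0])
print("OLS estimate (c, k):", res.x)
```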

Parameter Estimation: Example. Bayesian Inference: The likelihood is: Posterior Distribution: Strategy: Create a Markov chain using random sampling so that the created chain has the posterior distribution as its limiting (stationary) distribution.

Markov Chains
Definition: Note: A Markov chain is characterized by three components: a state space, an initial distribution, and a transition kernel.
State Space:
Initial Distribution: (Mass)
Transition Probability: (Markov Kernel)

Markov Chains Example: Chapman-Kolmogorov Equations:
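The equations are not reproduced in the transcription; for a homogeneous chain with transition matrix $P$, the Chapman-Kolmogorov equations in standard form are
\[
p^{(m+n)}_{ij} \;=\; \sum_{k} p^{(m)}_{ik}\, p^{(n)}_{kj},
\qquad\text{equivalently}\qquad
P^{(m+n)} \;=\; P^{(m)}\, P^{(n)},
\]
so the $n$-step transition matrix is the matrix power $P^n$.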

Markov Chains: Limiting Distribution. Example: Raleigh weather -- tomorrow's weather conditioned on today's weather, with states rain and sun. Question: Definition: This is the limiting distribution (invariant measure).

Markov Chains: Limiting Distribution. Example: Raleigh weather -- solve for the limiting probabilities of rain and sun.
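The transition probabilities and the system being solved are not shown in the transcription; a minimal sketch with hypothetical rain/sun probabilities, solving pi P = pi together with sum(pi) = 1:

```python
import numpy as np

# Hypothetical two-state (rain, sun) transition matrix; rows sum to one.
# The slide's actual probabilities are not reproduced in this transcription.
P = np.array([[0.5, 0.5],    # today rain -> (rain, sun)
              [0.1, 0.9]])   # today sun  -> (rain, sun)

# Stationary distribution: solve pi P = pi together with sum(pi) = 1,
# i.e. the overdetermined (but consistent) system [P^T - I; 1 1] pi = [0, 0, 1]^T.
A = np.vstack([P.T - np.eye(2), np.ones(2)])
b = np.array([0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print("stationary distribution (rain, sun):", pi)

# Sanity check: rows of P^n approach the same distribution for large n.
print("row of P^50:", np.linalg.matrix_power(P, 50)[0])
```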

Irreducible Markov Chains. Reducible Markov chain: Note: The limiting distribution is not unique if the chain is reducible. Irreducible:

Periodic Markov Chains. Example: Periodicity: A Markov chain is periodic if parts of the state space are visited at regular intervals. The period of state i is defined as k = gcd{ n >= 1 : p_ii^(n) > 0 }, and the chain is aperiodic if k = 1 for every state.

Example: Periodic Markov Chains

Stationary Distribution. Theorem: A finite, homogeneous Markov chain that is irreducible and aperiodic has a unique stationary distribution, and the chain will converge to it in the sense of distributions from any initial distribution. Recurrence (Persistence): Example: State 3 is transient. Ergodicity: A state is termed ergodic if it is aperiodic and recurrent. If all states of an irreducible Markov chain are ergodic, the chain is said to be ergodic.

Matrix Theory Definition: Lemma: Example:

Matrix Theory Theorem (Perron-Frobenius): Corollary 1: Proposition:
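The statements on this slide are not reproduced in the transcription; the form of the Perron-Frobenius result typically invoked at this point, for the transition matrix $P$ of a finite, irreducible, aperiodic chain, is
\[
\lambda_1 = 1 \text{ is a simple eigenvalue of } P, \qquad |\lambda_j| < 1 \text{ for } j \ge 2,
\qquad\text{so}\qquad
P^n \;\longrightarrow\; \mathbf{1}\,\pi^{T} \quad (n \to \infty),
\]
where $\pi$ is the unique left eigenvector associated with $\lambda = 1$, chosen with positive entries summing to one; it is the stationary distribution.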

Stationary Distribution Corollary: Proof: Convergence: Express

Stationary Distribution

Detailed Balance Conditions. Reversible Chains: A Markov chain determined by the transition matrix P is reversible if there is a distribution that satisfies the detailed balance conditions. Proof: We need to show that such a distribution is stationary. Example:
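The conditions and the step of the proof are not reproduced in the transcription; in standard form, detailed balance reads
\[
\pi_i\, p_{ij} \;=\; \pi_j\, p_{ji} \quad \text{for all } i, j,
\]
and summing over $i$ gives
\[
\sum_i \pi_i\, p_{ij} \;=\; \pi_j \sum_i p_{ji} \;=\; \pi_j,
\qquad\text{i.e.}\qquad \pi P = \pi,
\]
so any distribution satisfying detailed balance is stationary.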

Markov Chain Monte Carlo Methods. Strategy: Markov chain simulation is used when it is impossible, or computationally prohibitive, to sample directly from the posterior distribution. Note: In Markov chain theory, we are given a Markov chain P and we construct its equilibrium distribution. In MCMC theory, we are given a distribution and we want to construct a Markov chain that is reversible with respect to it.

Markov Chain Monte Carlo Methods. General Strategy: Intuition: Recall that

Markov Chain Monte Carlo Methods. Intuition: Note: A narrower proposal distribution yields a higher probability of acceptance.

Metropolis Algorithm: [Metropolis and Ulam, 1949]

Metropolis-Hastings Algorithm: Examples: Note: Considered one of the top 10 algorithms of the 20th century.
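The formulas for the algorithm are not reproduced in the transcription; in standard form, given the current state $q^{k-1}$, a candidate $q^*$ drawn from the proposal $J(q^* \mid q^{k-1})$ is accepted with probability
\[
\alpha\bigl(q^* \mid q^{k-1}\bigr) \;=\; \min\!\left(1,\;
\frac{\pi(q^* \mid D)\, J(q^{k-1} \mid q^*)}{\pi(q^{k-1} \mid D)\, J(q^* \mid q^{k-1})}\right),
\]
and otherwise $q^k = q^{k-1}$. The unknown normalization constant cancels in the ratio, which is what makes the method practical; for a symmetric proposal the $J$ terms also cancel and the Metropolis algorithm is recovered.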

Proposal Distribution: Significantly affects mixing.
Too wide: Too many points are rejected and the chain stays still for long periods.
Too narrow: The acceptance ratio is high, but the algorithm is slow to explore the parameter space.
Ideally, it should have a shape similar to the posterior (target) distribution.
Problem: Anisotropic posterior with an isotropic proposal; efficiency is nonuniform across parameters.
Result: Recovers the efficiency of the univariate case.

Proposal Distribution and Acceptance Probability
Proposal Distribution: Two basic approaches:
o Choose a fixed proposal function: Independent Metropolis.
o Random walk (local Metropolis): two (of several) choices:
Acceptance Probability:

Random Walk Metropolis Algorithm for Parameter Estimation

Random Walk Metropolis Algorithm for Parameter Estimation
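The algorithm listing on these two slides is not reproduced in the transcription. A minimal random-walk Metropolis sketch for a generic nonlinear model with Gaussian errors is given below; the model, step sizes, and parameter names are illustrative assumptions, not the slides':

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data from a hypothetical model y = q1 * exp(-q2 * t) + noise;
# this stands in for the slides' spring model, which is not reproduced here.
t = np.linspace(0, 4, 60)
q_true, sigma = np.array([2.0, 0.8]), 0.05
data = q_true[0] * np.exp(-q_true[1] * t) + sigma * rng.normal(size=t.size)

def log_post(q):
    """Log of the unnormalized posterior: Gaussian likelihood, flat prior on q > 0."""
    if np.any(q <= 0):
        return -np.inf
    ss = np.sum((q[0] * np.exp(-q[1] * t) - data) ** 2)   # sum-of-squares misfit SS_q
    return -ss / (2 * sigma ** 2)

def random_walk_metropolis(log_post, q0, step, n_iter=20_000):
    """Random-walk Metropolis: symmetric Gaussian proposal centered at the current state."""
    q = np.asarray(q0, dtype=float)
    lp = log_post(q)
    chain = np.empty((n_iter, q.size))
    n_accept = 0
    for k in range(n_iter):
        q_star = q + step * rng.normal(size=q.size)       # propose
        lp_star = log_post(q_star)
        if np.log(rng.uniform()) < lp_star - lp:          # accept with prob min(1, ratio)
            q, lp = q_star, lp_star
            n_accept += 1
        chain[k] = q                                      # repeated value if rejected
    return chain, n_accept / n_iter

chain, acc = random_walk_metropolis(log_post, q0=[1.0, 1.0], step=[0.02, 0.02])
print("acceptance rate:", acc)
print("posterior means after burn-in:", chain[5000:].mean(axis=0))
```

In practice the proposal step sizes would be tuned, or adapted as in the Haario et al. reference on the final slide, to reach a reasonable acceptance rate.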

Markov Chain Monte Carlo: Example. Consider the spring model.

Markov Chain Monte Carlo: Example. Single parameter c.

Markov Chain Monte Carlo: Example. Consider the spring model. Case i:

Markov Chain Monte Carlo: Example Case i: Note:

Markov Chain Monte Carlo: Example Case i:

Markov Chain Monte Carlo: Example. SMA-driven bending actuator -- talk with John Crews. Model: Estimated Parameters:

Markov Chain Monte Carlo: Example. SMA-driven bending actuator -- talk with John Crews.

Markov Chain Monte Carlo: Example. SMA-driven bending actuator -- talk with John Crews.

Markov Chain Monte Carlo: Example. SMA-driven bending actuator -- talk with John Crews.

Transition Kernel and Detailed Balance Condition Transition Kernel: Recall that Detailed Balance Condition:

Transition Kernel and Detailed Balance Condition Detailed Balance Condition: Here Note: Transition Kernel: Definition
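The definitions and the verification are not reproduced in the transcription; in standard form, the Metropolis-Hastings transition kernel is
\[
K(q, q^*) \;=\; \alpha(q^* \mid q)\, J(q^* \mid q)
\;+\; \Bigl(1 - \int \alpha(\tilde q \mid q)\, J(\tilde q \mid q)\, d\tilde q\Bigr)\, \delta_{q}(q^*),
\]
and detailed balance with the posterior follows because
\[
\pi(q \mid D)\, J(q^* \mid q)\, \alpha(q^* \mid q)
\;=\; \min\bigl(\pi(q \mid D)\, J(q^* \mid q),\; \pi(q^* \mid D)\, J(q \mid q^*)\bigr)
\]
is symmetric in $q$ and $q^*$; hence the posterior is the stationary distribution of the chain.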

Sampling Error Variance. Strategy: Treat the error variance as a parameter to be estimated. Recall: The assumption that the errors are normally distributed yields: Goal: Determine the posterior distribution. Strategy: Choose the prior so that the posterior is from the same family --- termed a conjugate prior. For a normal distribution with unknown variance, the conjugate prior is the inverse-gamma distribution, which is equivalent to a scaled inverse chi-squared distribution.
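The distributions on this slide are not reproduced in the transcription; under the Gaussian error assumption above, the standard conjugate update for the error variance is
\[
\sigma^2 \sim \text{Inv-Gamma}(\alpha_0, \beta_0)
\quad\Longrightarrow\quad
\sigma^2 \mid q, D \;\sim\; \text{Inv-Gamma}\!\Bigl(\alpha_0 + \tfrac{n}{2},\; \beta_0 + \tfrac{SS_q}{2}\Bigr),
\]
so the error variance can be drawn directly at each iteration of the chain (a Gibbs step) rather than sampled with a Metropolis proposal.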

Definition: Sampling Error Variance Strategy: Note: Note:

Sampling Error Variance. Example: Consider the spring model.

Related Topics
Note: This is an active research area and there are a number of related topics:
Burn-in and convergence
Adaptive algorithms
Population Monte Carlo methods
Sequential Monte Carlo methods and particle filters
Gaussian mixture models
Development of metamodels, surrogates, and emulators to improve implementation speeds
References:
A. Solonen, Monte Carlo Methods in Parameter Estimation of Nonlinear Models, Master's Thesis, 2006.
H. Haario, E. Saksman and J. Tamminen, An adaptive Metropolis algorithm, Bernoulli, 7(2), pp. 223-242, 2001.
C. Andrieu and J. Thoms, A tutorial on adaptive MCMC, Statistics and Computing, 18, pp. 343-373, 2008.
M. Vihola, Robust adaptive Metropolis algorithm with coerced acceptance rate, arXiv:1011.4381v2.