Estimation of non-stationary GEV model parameters

Size: px
Start display at page:

Download "Estimation of non-stationary GEV model parameters"

Transcription

1 Estimation of non-stationary GEV model parameters S. El-Adlouni, T. Ouarda & X. Zhang, R. Roy & B. Bobée Extreme Value Analysis August 2005 Statistical Hydrology Chair (INRS-ETE) 1

2 Outline Problem definition Objectives General Extreme Value Distribution Non-stationary GEV model Parameter estimation Simulation based comparison of estimation methods Case study Conclusions 2

3 Position of the problem In frequency analysis, data must generally be independent and identically distributed (i.i.d) which implies that they must meet the statistical criteria of independence, stationarity and homogeneity In reality, the probability distribution of extreme events can change with time Need to develop frequency analysis models which can handle various types of non-stationarity (trends, jumps, etc.) 3

4 Objectives of the study Develop tools for frequency analysis in a non-stationary framework Include the potential impacts of climate change Explore the case of trends or dependence on covariables Bayesian framework 4

5 GEV distribution Y is GEV (Generalised Extreme Value) distributed if : FGEV y y α κ ( ) ( ) 1/ = exp 1 µ κ κ 1 ( y µ ) > 0 α ( ), ( > 0 ) et ( ) µ α κ are respectively the location, scale and shape parameters. 5

6 Non-stationary GEV model Non-stationary framework: ( ) Y ~,, t t t t GEV µ α κ Parameters are function of time or other covariates. 6

7 Non-stationary GEV model Illustration of the two types of non-stationarity 2000 Tendance 2000 Changement d'échelle

8 Non-stationary GEV model I Xt GEV µ 0, ακ, ( ) Classic model : all parameters are constant. II ( µ = β + β α κ) X ~ GEV Y,, t 1 t 1 2 t The location parameter is a linear function of a covariable. The other two parameters are constant. III ( 2 µ = β + β + β α κ) Xt GEV2 t 1 2Yt 3 Yt,, The location parameter is a quadratic function of a temporal covariable. The other two parameters are constant. 8

9 Parameter Estimation 1. Maximum likelihood method (ML) 2. Bayesian model (Bayes) 3. Generalised maximum likelihood method (GML) 9

10 Maximum Likelihood Method Properties of ML estimators Under some regularity conditions, ML estimators have the desired optimality properties. These regularity conditions are not met when the shape parameter is different from 0, since the support of the distribution depends on parameters (Smith 1985). For small samples, the numerical resolution of the ML system can generate parameter estimators that are physically impossible and leads to very high quantile estimator variances. 10

11 Bayesian estimation : prior distribution Prior distribution of parameter vector θ = ( µακ,, ) Fisher information matrix Iij ( θ ) = E Jeffrey s information prior ( y θ ) 2 ln f θ θ i j J θ = I θ ( ) ( ) 1 2 For the GEV distribution, the Fisher information matrix is given by Jenkinson (1969) 11

12 Bayesian estimation GEV 0 model In the absence of any additional information about the parameters (regional information, historic, expert opinion, etc.), we consider the Jeffrey s noninformative prior. π 0 µακ,, = J µακ,, ( ) ( ) 12

13 Bayesian estimation GEV 1 model π β, β, α, κ = J β, α, κ p β ( ) ( ) ( ) With a vague prior for the parameter β 2 p ( β ) ( 2 ) 2 N 0, σ = and σ =

14 Bayesian estimation GEV 2 model π β, β, β, α, κ = J β, α, κ p β p β ( ) ( ) ( ) ( ) β 2 β with pβ β σ 2 ( ) ( 2 ) 2 = N 0, ( ) ( 2 ) 1 pβ β3 = N 0, σ 2 3 and σ1 = σ 2 =

15 Generalised Maximum Likelihood Method Stationary case The GML method is based on the same principle than the ML method with an additional constraint on the shape parameter to restrain its domain. Martins & Stedinger (2000) presented the GML approach for the case GEV0 with a Beta [-0.5;0.5] prior distribution for the shape parameter : π κ = Beta u = 6, v = 9 κ ( ) ( ) 15

16 Generalised Maximum Likelihood Method 4 Beta pdf on [ ] prior distribution function of the shape parameter E [ κ] = 0.1 and Var[ κ] = ( 0.122) 2 16

17 Generalised Maximum Likelihood Method Non-stationary case The GML method can be generalized to the non-stationary case: 1. Adopt the same prior distribution for the shape parameter, 2. Solve the equation system obtained by the ML method under this constraint. The GML parameter estimators are the solution of the following optimisation problem : ( x θ ) max Ln ; θ κ Beta u (,v) 17

18 Generalised Maximum Likelihood Method Non-stationary case The solution of the optimisation problem is equivalent to the maximisation of the posterior distribution of the parameters conditionally to the data : ( ) ( ) ( ) π θ θ π κ x Ln x κ The GML estimator of the parameter vector is the mode of the posterior distribution. 18

19 Parameter and quantile estimation ML : numerical solution : Newton-Raphson method. GML & Bayes : Monte-Carlo Markov-Chain methods (MCMC) The GML estimator corresponds to the mode of the posterior distribution, The Bayesian estimator corresponds to the posterior mean. 19

20 Parameter and quantile estimation MCMC method adopted For the GML and Bayesian method, the posterior distribution is simulated with the Metropolis-Hastings (M- H) algorithm (Gilks et al. 1996). Chain size and burn-in period Several techniques allow to check convergence of generated Markov Chain to the stationary distribution (El Adlouni et al. 2005). For all cases presented in this work, the convergence of the MCMC methods is obtained with a chain size of N=15000 and with a burn-in period of N0=

21 21

22 Parameter and quantile estimation Quantile estimation Aside from parameter estimators, the MCMC algorithm iterations allow to obtain the conditional distribution of quantiles given an observed value y 0 of the covariate Y t. For each iteration of the MCMC algorithm i=1,,n we ( i) compute the quantile x py, corresponding to a non-exceedance 0 probability p ( i) () i ( () i x ) ( () ( )) p, y 1 log 0 y p κ () i α = µ + 0 i κ Conditional on the value y 0 22

23 Parameter and quantile estimation Quantile estimation (cont.) ( i) µ Is the location parameter conditional to a particular y 0 value y 0 of the covariate Y t. () i ( i) µ = µ y 0 For the GEV0 model () i () i ( i) µ = β + β y For the GEV1 model y ( i) ( i) ( i) ( i) 2 µ y = β1 + β2 y0 + β3 y0 For the GEV2 model 0 23

24 Simulation based comparison The three estimation methods are compared, for all three models, using Monte Carlo simulations. The covariate Y t represents time Y t =t The following values of the shape parameter are considered: κ = 0.1, κ = 0.2 et κ =

25 Simulation based comparison Methodology Performance criteria are the bias and the RMSE of quantile estimates for different non-exceedance probabilities : p = 0.5, 0.8, 0.9, 0.99 and Obtained for R=1000 samples of size n=50. 25

26 Simulation based comparison Bias and RMSE of quantile estimates for the ML, GML and Bayesian approach and for model GEV0 GEV0 Bias RMSE κ = 0.1 κ = 0.2 κ = 0.3 p ML GML Bayes ML GML Bayes

27 Simulation based comparison Bias and RMSE of quantile estimates for the ML, GML and Bayesian approach and for model GEV1 GEV1 Bias RMSE κ = 0.1 κ = 0.2 κ = 0.3 p ML GML Bayes ML GML Bayes

28 Simulation based comparison Bias and RMSE of quantile estimates for the ML, GML and Bayesian approach and for model GEV2 GEV2 Bias RMSE κ = 0.1 κ = 0.2 κ = 0.3 p ML GML Bayes ML GML Bayes

29 Simulation based comparison Results For the ML method : 1. GEV0 & GEV1 models: large RMSE is caused by the high variance. 2. GEV2 model: reduction of the variance for extreme quantiles. Bayesian approach: positive Bias and low RMSE for all quantiles. 29

30 Simulation based comparison Results GML method 1. Best bias performance for all three models. 2. Generally low RMSE. 3. Negative Bias for high skewness since the prior is centered on -0.1 Conclusion The GML method leads to best results for all three models. 30

31 Case study Location of the Randsburg station in California Code : Station Name : Randsburg Latitude : Longitude : Period : Size of the series : n=51 31

32 Case study 90 Station Randsburg Prec. Max. Ann. (mm) Année Annual Max rainfall series at Randsburg 32

33 Case study: characteristics Basic statistics Max. Ann. Prec. at Randsburg Number of observations 51 Minimum 3.00 Maximum 87.0 Mean 27.2 St. dev Median 25.0 Coefficient of variation (Cv) Coefficient of skewness (Cs) 1.14 Coefficient of kurtosis (Ck) 4.52 Parameter estimators for the GEV (ML) µ = α = κ =

34 Case study: ML fitting Fitting of the GEV model / Classic ML 34

35 Case study : Correlation with SOI (mm) Pluie Max. Ann. SOI SOI Annual Max rainfall series and SOI index at Randsburg 35

36 Case study : Correlation with SOI Observed Annual Max rainfall and corresponding SOI value 36

37 Case study: linear dependence model GML method: Outputs of the MCMC algorithm MCMC algorithm iterations for the estimation of the GEV0 model parameters with the GML method. 37

38 Case study: linear dependence model GML method: Outputs of the MCMC algorithm Histogram of the GEV0 parameters posterior distribution, obtained with the last N-N0 iterations of the MCMC algorithm. The GML estimator corresponds to the mode of the posterior distribution for each parameter. 38

39 Case study : model comparison Maximized log-likelihood function and GML parameter estimators for each model l n * β 1 β 2 β 3 α κ GEV GEV GEV

40 2 χ ν Case study : model comparison A simple approach to compare the validity of two models M 1 and M 0 such that M is to use the deviance statistic 0 M1 defined as: * * D = l M l M ( ) ( ) { } n n l n* is the maximized log-likelihood function for each model The statistic is Chi-squared distributed. The parameter of the Chi-deux distribution is the difference of the number of parameters of the two models M 1 et M 0. 40

41 Case study : model comparison Comparison of models GEV0 and GEV1 indicates that the difference is significant, since the statistic value is D = 6.2 D is considered significant at the confidance level 95% ( 2 ) 1 Pr χ 6.2 = The GEV1 model provides a better representation of data variability than model GEV0. 41

42 Case study : model comparison Comparison of models GEV1 and GEV2 leads to a statistic value of D = 4.3 with ( 2 ) 1 Pr χ 4.3 = The difference between the two models is hence significant at the 95% level. The GEV2 model is the most adequate to represent the dependence between the location parameter of the GEV and the SOI index. 42

43 Case study : Correlation with SOI GEV0 and GEV1 Median estimators conditional to SOI values. 43

44 Case study : Correlation with SOI GEV0 AND GEV2 Median estimators conditional to SOI values. 44

45 Case study : Correlation with SOI GEV0, GEV1 and GEV2 Median estimators conditional to particular SOI values. SOI = SOI = 0.04 SOI = 2.04 GEV 0 24 (21-28) GEV 1 54 (51-58) GEV 2 77 (72-82) 24 (21-28) 23 (19-27) 21 (18-24) 24 (21-28) 4 (0.5-7) 17 (15-22) 45

46 Case study : Correlation with SOI Results The difference is more important for lower SOI values which correspond to higher extremes of annual max. precipitations. GEV2 median estimate can be 3 times higher than GEV0 estimate. The linear trend model (GEV1) under-estimates the median for the higher SOI values. 46

47 Conclusions Advantages of the non-stationary GEV model in describing the variance. No assumption of normality as in classical models. Generalisation of the GML model to the non-stationary case provides efficient estimators for real hydrometeorological data series. Use of MCMC methods allows a robust solution to the likelihood equation system and leads to estimates of credibility intervals for parameters and quantiles. 47

48 References Coles G. S. (2001). An Introduction to Statistical Modeling of Extreme Values, Springer, 208 p. El Adlouni, S., Favre, A-C. and Bobée, B. (2005). Comparison of methodologies to assess the convergence of Markov Chain Monte- Carlo Methods. Computational Statistics and Data Analysis (Under Press). Hastings, W. (1970). Monte Carlo sampling methods using Markov Chains and their applications. Biometrika, Martins, E. S., J. R. Stedinger (2000). Generalized Maximum Likelihood GEV Quantile Estimators for Hydrologic Data. Water Resources Research, vol.36., no.(3)., pp Scarf, P.A. (1992). Estimation for a four parameter generalized extreme value distribution, Comm. Stat. Theor. Meth., 21, Smith, R. L. (1985). Maximum likelihood estimation in a class of non-regular cases. Biometrika 72,

Parallelization Strategies for Multicore Data Analysis

Parallelization Strategies for Multicore Data Analysis Parallelization Strategies for Multicore Data Analysis Wei-Chen Chen 1 Russell Zaretzki 2 1 University of Tennessee, Dept of EEB 2 University of Tennessee, Dept. Statistics, Operations, and Management

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

More information

Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni

Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni 1 Web-based Supplementary Materials for Bayesian Effect Estimation Accounting for Adjustment Uncertainty by Chi Wang, Giovanni Parmigiani, and Francesca Dominici In Web Appendix A, we provide detailed

More information

Imputing Missing Data using SAS

Imputing Missing Data using SAS ABSTRACT Paper 3295-2015 Imputing Missing Data using SAS Christopher Yim, California Polytechnic State University, San Luis Obispo Missing data is an unfortunate reality of statistics. However, there are

More information

Java Modules for Time Series Analysis

Java Modules for Time Series Analysis Java Modules for Time Series Analysis Agenda Clustering Non-normal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series

More information

Chenfeng Xiong (corresponding), University of Maryland, College Park (cxiong@umd.edu)

Chenfeng Xiong (corresponding), University of Maryland, College Park (cxiong@umd.edu) Paper Author (s) Chenfeng Xiong (corresponding), University of Maryland, College Park (cxiong@umd.edu) Lei Zhang, University of Maryland, College Park (lei@umd.edu) Paper Title & Number Dynamic Travel

More information

Note on the EM Algorithm in Linear Regression Model

Note on the EM Algorithm in Linear Regression Model International Mathematical Forum 4 2009 no. 38 1883-1889 Note on the M Algorithm in Linear Regression Model Ji-Xia Wang and Yu Miao College of Mathematics and Information Science Henan Normal University

More information

BayesX - Software for Bayesian Inference in Structured Additive Regression

BayesX - Software for Bayesian Inference in Structured Additive Regression BayesX - Software for Bayesian Inference in Structured Additive Regression Thomas Kneib Faculty of Mathematics and Economics, University of Ulm Department of Statistics, Ludwig-Maximilians-University Munich

More information

CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS

CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Examples: Regression And Path Analysis CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships

More information

Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com

Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University caizhua@gmail.com 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian

More information

Tutorial on Markov Chain Monte Carlo

Tutorial on Markov Chain Monte Carlo Tutorial on Markov Chain Monte Carlo Kenneth M. Hanson Los Alamos National Laboratory Presented at the 29 th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Technology,

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

More information

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

More information

Least Squares Estimation

Least Squares Estimation Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

More information

Basics of Statistical Machine Learning

Basics of Statistical Machine Learning CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar

More information

Data Preparation and Statistical Displays

Data Preparation and Statistical Displays Reservoir Modeling with GSLIB Data Preparation and Statistical Displays Data Cleaning / Quality Control Statistics as Parameters for Random Function Models Univariate Statistics Histograms and Probability

More information

Reliability estimators for the components of series and parallel systems: The Weibull model

Reliability estimators for the components of series and parallel systems: The Weibull model Reliability estimators for the components of series and parallel systems: The Weibull model Felipe L. Bhering 1, Carlos Alberto de Bragança Pereira 1, Adriano Polpo 2 1 Department of Statistics, University

More information

Modeling and Analysis of Call Center Arrival Data: A Bayesian Approach

Modeling and Analysis of Call Center Arrival Data: A Bayesian Approach Modeling and Analysis of Call Center Arrival Data: A Bayesian Approach Refik Soyer * Department of Management Science The George Washington University M. Murat Tarimcilar Department of Management Science

More information

Handling attrition and non-response in longitudinal data

Handling attrition and non-response in longitudinal data Longitudinal and Life Course Studies 2009 Volume 1 Issue 1 Pp 63-72 Handling attrition and non-response in longitudinal data Harvey Goldstein University of Bristol Correspondence. Professor H. Goldstein

More information

PREDICTIVE DISTRIBUTIONS OF OUTSTANDING LIABILITIES IN GENERAL INSURANCE

PREDICTIVE DISTRIBUTIONS OF OUTSTANDING LIABILITIES IN GENERAL INSURANCE PREDICTIVE DISTRIBUTIONS OF OUTSTANDING LIABILITIES IN GENERAL INSURANCE BY P.D. ENGLAND AND R.J. VERRALL ABSTRACT This paper extends the methods introduced in England & Verrall (00), and shows how predictive

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

An Introduction to Extreme Value Theory

An Introduction to Extreme Value Theory An Introduction to Extreme Value Theory Petra Friederichs Meteorological Institute University of Bonn COPS Summer School, July/August, 2007 Applications of EVT Finance distribution of income has so called

More information

11. Time series and dynamic linear models

11. Time series and dynamic linear models 11. Time series and dynamic linear models Objective To introduce the Bayesian approach to the modeling and forecasting of time series. Recommended reading West, M. and Harrison, J. (1997). models, (2 nd

More information

BAYESIAN ANALYSIS OF AN AGGREGATE CLAIM MODEL USING VARIOUS LOSS DISTRIBUTIONS. by Claire Dudley

BAYESIAN ANALYSIS OF AN AGGREGATE CLAIM MODEL USING VARIOUS LOSS DISTRIBUTIONS. by Claire Dudley BAYESIAN ANALYSIS OF AN AGGREGATE CLAIM MODEL USING VARIOUS LOSS DISTRIBUTIONS by Claire Dudley A dissertation submitted for the award of the degree of Master of Science in Actuarial Management School

More information

THE USE OF STATISTICAL DISTRIBUTIONS TO MODEL CLAIMS IN MOTOR INSURANCE

THE USE OF STATISTICAL DISTRIBUTIONS TO MODEL CLAIMS IN MOTOR INSURANCE THE USE OF STATISTICAL DISTRIBUTIONS TO MODEL CLAIMS IN MOTOR INSURANCE Batsirai Winmore Mazviona 1 Tafadzwa Chiduza 2 ABSTRACT In general insurance, companies need to use data on claims gathered from

More information

Probability and Random Variables. Generation of random variables (r.v.)

Probability and Random Variables. Generation of random variables (r.v.) Probability and Random Variables Method for generating random variables with a specified probability distribution function. Gaussian And Markov Processes Characterization of Stationary Random Process Linearly

More information

Validation of Software for Bayesian Models using Posterior Quantiles. Samantha R. Cook Andrew Gelman Donald B. Rubin DRAFT

Validation of Software for Bayesian Models using Posterior Quantiles. Samantha R. Cook Andrew Gelman Donald B. Rubin DRAFT Validation of Software for Bayesian Models using Posterior Quantiles Samantha R. Cook Andrew Gelman Donald B. Rubin DRAFT Abstract We present a simulation-based method designed to establish that software

More information

Monte Carlo-based statistical methods (MASM11/FMS091)

Monte Carlo-based statistical methods (MASM11/FMS091) Monte Carlo-based statistical methods (MASM11/FMS091) Jimmy Olsson Centre for Mathematical Sciences Lund University, Sweden Lecture 5 Sequential Monte Carlo methods I February 5, 2013 J. Olsson Monte Carlo-based

More information

Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091)

Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091) Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091) Magnus Wiktorsson Centre for Mathematical Sciences Lund University, Sweden Lecture 5 Sequential Monte Carlo methods I February

More information

Probability and Statistics

Probability and Statistics Probability and Statistics Syllabus for the TEMPUS SEE PhD Course (Podgorica, April 4 29, 2011) Franz Kappel 1 Institute for Mathematics and Scientific Computing University of Graz Žaneta Popeska 2 Faculty

More information

Characteristics of Global Calling in VoIP services: A logistic regression analysis

Characteristics of Global Calling in VoIP services: A logistic regression analysis www.ijcsi.org 17 Characteristics of Global Calling in VoIP services: A logistic regression analysis Thanyalak Mueangkhot 1, Julian Ming-Sung Cheng 2 and Chaknarin Kongcharoen 3 1 Department of Business

More information

PEST - Beyond Basic Model Calibration. Presented by Jon Traum

PEST - Beyond Basic Model Calibration. Presented by Jon Traum PEST - Beyond Basic Model Calibration Presented by Jon Traum Purpose of Presentation Present advance techniques available in PEST for model calibration High level overview Inspire more people to use PEST!

More information

Linear Threshold Units

Linear Threshold Units Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear

More information

Centre for Central Banking Studies

Centre for Central Banking Studies Centre for Central Banking Studies Technical Handbook No. 4 Applied Bayesian econometrics for central bankers Andrew Blake and Haroon Mumtaz CCBS Technical Handbook No. 4 Applied Bayesian econometrics

More information

MAN-BITES-DOG BUSINESS CYCLES ONLINE APPENDIX

MAN-BITES-DOG BUSINESS CYCLES ONLINE APPENDIX MAN-BITES-DOG BUSINESS CYCLES ONLINE APPENDIX KRISTOFFER P. NIMARK The next section derives the equilibrium expressions for the beauty contest model from Section 3 of the main paper. This is followed by

More information

Model-based Synthesis. Tony O Hagan

Model-based Synthesis. Tony O Hagan Model-based Synthesis Tony O Hagan Stochastic models Synthesising evidence through a statistical model 2 Evidence Synthesis (Session 3), Helsinki, 28/10/11 Graphical modelling The kinds of models that

More information

Multivariate Normal Distribution

Multivariate Normal Distribution Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #4-7/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues

More information

DURATION ANALYSIS OF FLEET DYNAMICS

DURATION ANALYSIS OF FLEET DYNAMICS DURATION ANALYSIS OF FLEET DYNAMICS Garth Holloway, University of Reading, garth.holloway@reading.ac.uk David Tomberlin, NOAA Fisheries, david.tomberlin@noaa.gov ABSTRACT Though long a standard technique

More information

PROPERTIES OF THE SAMPLE CORRELATION OF THE BIVARIATE LOGNORMAL DISTRIBUTION

PROPERTIES OF THE SAMPLE CORRELATION OF THE BIVARIATE LOGNORMAL DISTRIBUTION PROPERTIES OF THE SAMPLE CORRELATION OF THE BIVARIATE LOGNORMAL DISTRIBUTION Chin-Diew Lai, Department of Statistics, Massey University, New Zealand John C W Rayner, School of Mathematics and Applied Statistics,

More information

Applications of R Software in Bayesian Data Analysis

Applications of R Software in Bayesian Data Analysis Article International Journal of Information Science and System, 2012, 1(1): 7-23 International Journal of Information Science and System Journal homepage: www.modernscientificpress.com/journals/ijinfosci.aspx

More information

Christfried Webers. Canberra February June 2015

Christfried Webers. Canberra February June 2015 c Statistical Group and College of Engineering and Computer Science Canberra February June (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 829 c Part VIII Linear Classification 2 Logistic

More information

More details on the inputs, functionality, and output can be found below.

More details on the inputs, functionality, and output can be found below. Overview: The SMEEACT (Software for More Efficient, Ethical, and Affordable Clinical Trials) web interface (http://research.mdacc.tmc.edu/smeeactweb) implements a single analysis of a two-armed trial comparing

More information

Dealing with large datasets

Dealing with large datasets Dealing with large datasets (by throwing away most of the data) Alan Heavens Institute for Astronomy, University of Edinburgh with Ben Panter, Rob Tweedie, Mark Bastin, Will Hossack, Keith McKellar, Trevor

More information

> plot(exp.btgpllm, main = "treed GP LLM,", proj = c(1)) > plot(exp.btgpllm, main = "treed GP LLM,", proj = c(2)) quantile diff (error)

> plot(exp.btgpllm, main = treed GP LLM,, proj = c(1)) > plot(exp.btgpllm, main = treed GP LLM,, proj = c(2)) quantile diff (error) > plot(exp.btgpllm, main = "treed GP LLM,", proj = c(1)) > plot(exp.btgpllm, main = "treed GP LLM,", proj = c(2)) 0.4 0.2 0.0 0.2 0.4 treed GP LLM, mean treed GP LLM, 0.00 0.05 0.10 0.15 0.20 x1 x1 0.4

More information

Markov Chain Monte Carlo Simulation Made Simple

Markov Chain Monte Carlo Simulation Made Simple Markov Chain Monte Carlo Simulation Made Simple Alastair Smith Department of Politics New York University April2,2003 1 Markov Chain Monte Carlo (MCMC) simualtion is a powerful technique to perform numerical

More information

Inference on Phase-type Models via MCMC

Inference on Phase-type Models via MCMC Inference on Phase-type Models via MCMC with application to networks of repairable redundant systems Louis JM Aslett and Simon P Wilson Trinity College Dublin 28 th June 202 Toy Example : Redundant Repairable

More information

Logistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld.

Logistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld. Logistic Regression Vibhav Gogate The University of Texas at Dallas Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld. Generative vs. Discriminative Classifiers Want to Learn: h:x Y X features

More information

MSwM examples. Jose A. Sanchez-Espigares, Alberto Lopez-Moreno Dept. of Statistics and Operations Research UPC-BarcelonaTech.

MSwM examples. Jose A. Sanchez-Espigares, Alberto Lopez-Moreno Dept. of Statistics and Operations Research UPC-BarcelonaTech. MSwM examples Jose A. Sanchez-Espigares, Alberto Lopez-Moreno Dept. of Statistics and Operations Research UPC-BarcelonaTech February 24, 2014 Abstract Two examples are described to illustrate the use of

More information

Statistics Graduate Courses

Statistics Graduate Courses Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.

More information

Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

More information

CCNY. BME I5100: Biomedical Signal Processing. Linear Discrimination. Lucas C. Parra Biomedical Engineering Department City College of New York

CCNY. BME I5100: Biomedical Signal Processing. Linear Discrimination. Lucas C. Parra Biomedical Engineering Department City College of New York BME I5100: Biomedical Signal Processing Linear Discrimination Lucas C. Parra Biomedical Engineering Department CCNY 1 Schedule Week 1: Introduction Linear, stationary, normal - the stuff biology is not

More information

These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher Bishop

These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher Bishop Music and Machine Learning (IFT6080 Winter 08) Prof. Douglas Eck, Université de Montréal These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher

More information

Bayesian Phylogeny and Measures of Branch Support

Bayesian Phylogeny and Measures of Branch Support Bayesian Phylogeny and Measures of Branch Support Bayesian Statistics Imagine we have a bag containing 100 dice of which we know that 90 are fair and 10 are biased. The

More information

Estimation and comparison of multiple change-point models

Estimation and comparison of multiple change-point models Journal of Econometrics 86 (1998) 221 241 Estimation and comparison of multiple change-point models Siddhartha Chib* John M. Olin School of Business, Washington University, 1 Brookings Drive, Campus Box

More information

CS 688 Pattern Recognition Lecture 4. Linear Models for Classification

CS 688 Pattern Recognition Lecture 4. Linear Models for Classification CS 688 Pattern Recognition Lecture 4 Linear Models for Classification Probabilistic generative models Probabilistic discriminative models 1 Generative Approach ( x ) p C k p( C k ) Ck p ( ) ( x Ck ) p(

More information

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

More information

Extreme Value Modeling for Detection and Attribution of Climate Extremes

Extreme Value Modeling for Detection and Attribution of Climate Extremes Extreme Value Modeling for Detection and Attribution of Climate Extremes Jun Yan, Yujing Jiang Joint work with Zhuo Wang, Xuebin Zhang Department of Statistics, University of Connecticut February 2, 2016

More information

VI. Introduction to Logistic Regression

VI. Introduction to Logistic Regression VI. Introduction to Logistic Regression We turn our attention now to the topic of modeling a categorical outcome as a function of (possibly) several factors. The framework of generalized linear models

More information

Statistics in Retail Finance. Chapter 6: Behavioural models

Statistics in Retail Finance. Chapter 6: Behavioural models Statistics in Retail Finance 1 Overview > So far we have focussed mainly on application scorecards. In this chapter we shall look at behavioural models. We shall cover the following topics:- Behavioural

More information

Marketing Mix Modelling and Big Data P. M Cain

Marketing Mix Modelling and Big Data P. M Cain 1) Introduction Marketing Mix Modelling and Big Data P. M Cain Big data is generally defined in terms of the volume and variety of structured and unstructured information. Whereas structured data is stored

More information

Description. Textbook. Grading. Objective

Description. Textbook. Grading. Objective EC151.02 Statistics for Business and Economics (MWF 8:00-8:50) Instructor: Chiu Yu Ko Office: 462D, 21 Campenalla Way Phone: 2-6093 Email: kocb@bc.edu Office Hours: by appointment Description This course

More information

Model Selection and Claim Frequency for Workers Compensation Insurance

Model Selection and Claim Frequency for Workers Compensation Insurance Model Selection and Claim Frequency for Workers Compensation Insurance Jisheng Cui, David Pitt and Guoqi Qian Abstract We consider a set of workers compensation insurance claim data where the aggregate

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the

More information

A Basic Introduction to Missing Data

A Basic Introduction to Missing Data John Fox Sociology 740 Winter 2014 Outline Why Missing Data Arise Why Missing Data Arise Global or unit non-response. In a survey, certain respondents may be unreachable or may refuse to participate. Item

More information

Introduction to Monte Carlo. Astro 542 Princeton University Shirley Ho

Introduction to Monte Carlo. Astro 542 Princeton University Shirley Ho Introduction to Monte Carlo Astro 542 Princeton University Shirley Ho Agenda Monte Carlo -- definition, examples Sampling Methods (Rejection, Metropolis, Metropolis-Hasting, Exact Sampling) Markov Chains

More information

Geostatistics Exploratory Analysis

Geostatistics Exploratory Analysis Instituto Superior de Estatística e Gestão de Informação Universidade Nova de Lisboa Master of Science in Geospatial Technologies Geostatistics Exploratory Analysis Carlos Alberto Felgueiras cfelgueiras@isegi.unl.pt

More information

EE 570: Location and Navigation

EE 570: Location and Navigation EE 570: Location and Navigation On-Line Bayesian Tracking Aly El-Osery 1 Stephen Bruder 2 1 Electrical Engineering Department, New Mexico Tech Socorro, New Mexico, USA 2 Electrical and Computer Engineering

More information

How To Understand The Theory Of Probability

How To Understand The Theory Of Probability Graduate Programs in Statistics Course Titles STAT 100 CALCULUS AND MATR IX ALGEBRA FOR STATISTICS. Differential and integral calculus; infinite series; matrix algebra STAT 195 INTRODUCTION TO MATHEMATICAL

More information

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

More information

Adaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement

Adaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement Adaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement Toshio Sugihara Abstract In this study, an adaptive

More information

Analysis and Computation for Finance Time Series - An Introduction

Analysis and Computation for Finance Time Series - An Introduction ECMM703 Analysis and Computation for Finance Time Series - An Introduction Alejandra González Harrison 161 Email: mag208@exeter.ac.uk Time Series - An Introduction A time series is a sequence of observations

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby

More information

A Coefficient of Variation for Skewed and Heavy-Tailed Insurance Losses. Michael R. Powers[ 1 ] Temple University and Tsinghua University

A Coefficient of Variation for Skewed and Heavy-Tailed Insurance Losses. Michael R. Powers[ 1 ] Temple University and Tsinghua University A Coefficient of Variation for Skewed and Heavy-Tailed Insurance Losses Michael R. Powers[ ] Temple University and Tsinghua University Thomas Y. Powers Yale University [June 2009] Abstract We propose a

More information

Incorporating cost in Bayesian Variable Selection, with application to cost-effective measurement of quality of health care.

Incorporating cost in Bayesian Variable Selection, with application to cost-effective measurement of quality of health care. Incorporating cost in Bayesian Variable Selection, with application to cost-effective measurement of quality of health care University of Florida 10th Annual Winter Workshop: Bayesian Model Selection and

More information

STATISTICA Formula Guide: Logistic Regression. Table of Contents

STATISTICA Formula Guide: Logistic Regression. Table of Contents : Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

More information

The HB. How Bayesian methods have changed the face of marketing research. Summer 2004

The HB. How Bayesian methods have changed the face of marketing research. Summer 2004 The HB How Bayesian methods have changed the face of marketing research. 20 Summer 2004 Reprinted with permission from Marketing Research, Summer 2004, published by the American Marketing Association.

More information

Some Essential Statistics The Lure of Statistics

Some Essential Statistics The Lure of Statistics Some Essential Statistics The Lure of Statistics Data Mining Techniques, by M.J.A. Berry and G.S Linoff, 2004 Statistics vs. Data Mining..lie, damn lie, and statistics mining data to support preconceived

More information

Operational Risk Management: Added Value of Advanced Methodologies

Operational Risk Management: Added Value of Advanced Methodologies Operational Risk Management: Added Value of Advanced Methodologies Paris, September 2013 Bertrand HASSANI Head of Major Risks Management & Scenario Analysis Disclaimer: The opinions, ideas and approaches

More information

Variations of Statistical Models

Variations of Statistical Models 38. Statistics 1 38. STATISTICS Revised September 2013 by G. Cowan (RHUL). This chapter gives an overview of statistical methods used in high-energy physics. In statistics, we are interested in using a

More information

Vasicek Single Factor Model

Vasicek Single Factor Model Alexandra Kochendörfer 7. Februar 2011 1 / 33 Problem Setting Consider portfolio with N different credits of equal size 1. Each obligor has an individual default probability. In case of default of the

More information

An introduction to the analysis of extreme values using R and extremes

An introduction to the analysis of extreme values using R and extremes An introduction to the analysis of extreme values using R and extremes Eric Gilleland, National Center for Atmospheric Research, Boulder, Colorado, U.S.A. http://www.ral.ucar.edu/staff/ericg Graybill VIII

More information

Using SAS PROC MCMC to Estimate and Evaluate Item Response Theory Models

Using SAS PROC MCMC to Estimate and Evaluate Item Response Theory Models Using SAS PROC MCMC to Estimate and Evaluate Item Response Theory Models Clement A Stone Abstract Interest in estimating item response theory (IRT) models using Bayesian methods has grown tremendously

More information

Tax Cognizant Portfolio Analysis: A Methodology for Maximizing After Tax Wealth

Tax Cognizant Portfolio Analysis: A Methodology for Maximizing After Tax Wealth Tax Cognizant Portfolio Analysis: A Methodology for Maximizing After Tax Wealth Kenneth A. Blay Director of Research 1st Global, Inc. Dr. Harry M. Markowitz Harry Markowitz Company CFA Society of Austin

More information

Factor analysis. Angela Montanari

Factor analysis. Angela Montanari Factor analysis Angela Montanari 1 Introduction Factor analysis is a statistical model that allows to explain the correlations between a large number of observed correlated variables through a small number

More information

Estimating Industry Multiples

Estimating Industry Multiples Estimating Industry Multiples Malcolm Baker * Harvard University Richard S. Ruback Harvard University First Draft: May 1999 Rev. June 11, 1999 Abstract We analyze industry multiples for the S&P 500 in

More information

Calculating VaR. Capital Market Risk Advisors CMRA

Calculating VaR. Capital Market Risk Advisors CMRA Calculating VaR Capital Market Risk Advisors How is VAR Calculated? Sensitivity Estimate Models - use sensitivity factors such as duration to estimate the change in value of the portfolio to changes in

More information

GLMs: Gompertz s Law. GLMs in R. Gompertz s famous graduation formula is. or log µ x is linear in age, x,

GLMs: Gompertz s Law. GLMs in R. Gompertz s famous graduation formula is. or log µ x is linear in age, x, Computing: an indispensable tool or an insurmountable hurdle? Iain Currie Heriot Watt University, Scotland ATRC, University College Dublin July 2006 Plan of talk General remarks The professional syllabus

More information

Optimising and Adapting the Metropolis Algorithm

Optimising and Adapting the Metropolis Algorithm Optimising and Adapting the Metropolis Algorithm by Jeffrey S. Rosenthal 1 (February 2013; revised March 2013 and May 2013 and June 2013) 1 Introduction Many modern scientific questions involve high dimensional

More information

A random point process model for the score in sport matches

A random point process model for the score in sport matches IMA Journal of Management Mathematics (2009) 20, 121 131 doi:10.1093/imaman/dpn027 Advance Access publication on October 30, 2008 A random point process model for the score in sport matches PETR VOLF Institute

More information

Introduction to Markov Chain Monte Carlo

Introduction to Markov Chain Monte Carlo Introduction to Markov Chain Monte Carlo Monte Carlo: sample from a distribution to estimate the distribution to compute max, mean Markov Chain Monte Carlo: sampling using local information Generic problem

More information

Monte Carlo Methods in Finance

Monte Carlo Methods in Finance Author: Yiyang Yang Advisor: Pr. Xiaolin Li, Pr. Zari Rachev Department of Applied Mathematics and Statistics State University of New York at Stony Brook October 2, 2012 Outline Introduction 1 Introduction

More information

Bayesian Statistics: Indian Buffet Process

Bayesian Statistics: Indian Buffet Process Bayesian Statistics: Indian Buffet Process Ilker Yildirim Department of Brain and Cognitive Sciences University of Rochester Rochester, NY 14627 August 2012 Reference: Most of the material in this note

More information

An Application of Inverse Reinforcement Learning to Medical Records of Diabetes Treatment

An Application of Inverse Reinforcement Learning to Medical Records of Diabetes Treatment An Application of Inverse Reinforcement Learning to Medical Records of Diabetes Treatment Hideki Asoh 1, Masanori Shiro 1 Shotaro Akaho 1, Toshihiro Kamishima 1, Koiti Hasida 1, Eiji Aramaki 2, and Takahide

More information

Treatment of Incomplete Data in the Field of Operational Risk: The Effects on Parameter Estimates, EL and UL Figures

Treatment of Incomplete Data in the Field of Operational Risk: The Effects on Parameter Estimates, EL and UL Figures Chernobai.qxd 2/1/ 1: PM Page 1 Treatment of Incomplete Data in the Field of Operational Risk: The Effects on Parameter Estimates, EL and UL Figures Anna Chernobai; Christian Menn*; Svetlozar T. Rachev;

More information

SAS Certificate Applied Statistics and SAS Programming

SAS Certificate Applied Statistics and SAS Programming SAS Certificate Applied Statistics and SAS Programming SAS Certificate Applied Statistics and Advanced SAS Programming Brigham Young University Department of Statistics offers an Applied Statistics and

More information

Logistic Regression (1/24/13)

Logistic Regression (1/24/13) STA63/CBB540: Statistical methods in computational biology Logistic Regression (/24/3) Lecturer: Barbara Engelhardt Scribe: Dinesh Manandhar Introduction Logistic regression is model for regression used

More information

Multivariate Logistic Regression

Multivariate Logistic Regression 1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation

More information

GLM, insurance pricing & big data: paying attention to convergence issues.

GLM, insurance pricing & big data: paying attention to convergence issues. GLM, insurance pricing & big data: paying attention to convergence issues. Michaël NOACK - michael.noack@addactis.com Senior consultant & Manager of ADDACTIS Pricing Copyright 2014 ADDACTIS Worldwide.

More information

Review of the Methods for Handling Missing Data in. Longitudinal Data Analysis

Review of the Methods for Handling Missing Data in. Longitudinal Data Analysis Int. Journal of Math. Analysis, Vol. 5, 2011, no. 1, 1-13 Review of the Methods for Handling Missing Data in Longitudinal Data Analysis Michikazu Nakai and Weiming Ke Department of Mathematics and Statistics

More information