Limitations of Indicator Kriging for Predicting Data with Trend

Andreas Papritz
ETH Zurich, Department of Environmental Sciences, Zurich, Switzerland

Abstract. Goovaerts and Journel [8] proposed simple indicator kriging with varying local means (siklm) as a way to extend the indicator kriging methodology to variates with an apparent spatial trend. However, contrary to what the authors imply, the detrended indicators, i.e., the indicator residuals, are not stationary, and their covariance structure cannot be estimated without bias from a single realization of a random process. Ignoring the non-stationary nature of the covariance of the indicators ruins the usual mean square optimality of kriging. Therefore, siklm is an ad-hoc procedure that lacks optimality, and its use should be discouraged.

1 INTRODUCTION

According to ISI Web of Science®, about 20 journal articles and 0 contributions to conference proceedings have been published about indicator kriging (IK for short) to date. Many of these studies deal with mapping the probability that a spatial variable exceeds a threshold [e.g., 3, 11]. This is an important problem in environmental surveillance and monitoring. Some studies apply IK to data with an apparent trend, following advice by Goovaerts and Journel [8] and Goovaerts [7]. Unfortunately, Goovaerts' simple IK with varying local means (siklm for short) is not feasible in practice, as it asks for the modelling of non-stationary covariances. By "in practice" I mean the case where we consider our measurements as a sample from a single realisation of a random process. The same problem arises if IK is used for data that show unbounded variograms.

To substantiate my contention, I highlight and discuss here the limitations of siklm that arise from basic probability theory. Notwithstanding their elementary nature, these limitations seem to have been frequently ignored.
I further demonstrate by simulation that siklm lacks the usual mean square optimality of kriging, which leads me to discourage its use.

2 COVARIANCES OF INDICATOR TRANSFORMS OF NON-STATIONARY VARIATES

Let $Z(s)$ denote a real-valued random variable used for modelling an attribute $z$ measured at location $s$, and let $I(s; z_c)$, for a specific cut-off $z_c$, be the indicator transform: $I(s; z_c) = 1$ if $Z(s) \le z_c$ and $I(s; z_c) = 0$ otherwise. Then

$$E[I(s; z_c)] = \mathrm{Prob}[Z(s) \le z_c] = F(s; z_c), \tag{1}$$

$$\mathrm{Var}[I(s; z_c)] = F(s; z_c)\,\bigl(1 - F(s; z_c)\bigr), \tag{2}$$

where $F(s; z)$ is the cumulative distribution function (cdf) of $Z(s)$, $E[\cdot]$ and $\mathrm{Var}[\cdot]$ are the expectation and variance operators, and $\mathrm{Prob}[A]$ denotes the probability of event $A$. Further, let $\mathrm{Cov}[\cdot]$ and $\mathrm{Cor}[\cdot]$ denote the covariance and correlation operators. The (cross-)covariance function of the indicators for two cut-offs $z_c$ and $z_{c'}$, $C_I(s, s+h; z_c, z_{c'}) = \mathrm{Cov}[I(s; z_c), I(s+h; z_{c'})]$, is related to the bivariate cumulative distribution function, $F(s, s+h; z_c, z_{c'}) = \mathrm{Prob}[Z(s) \le z_c, Z(s+h) \le z_{c'}]$, of $Z(s)$ and $Z(s+h)$ by [e.g., 10]

$$C_I(s, s+h; z_c, z_{c'}) = F(s, s+h; z_c, z_{c'}) - F(s; z_c)\,F(s+h; z_{c'}). \tag{3}$$

For a random process with stationary bivariate distributions, equations (1) to (3) simplify to

$$E[I(s; z_c)] = F(z_c), \tag{4}$$

$$\mathrm{Var}[I(s; z_c)] = F(z_c)\,\bigl(1 - F(z_c)\bigr), \tag{5}$$

$$C_I(h; z_c, z_{c'}) = F(h; z_c, z_{c'}) - F(z_c)\,F(z_{c'}). \tag{6}$$

Clearly, the right-hand sides of (4) to (6) do not depend on $s$, and $C_I(\cdot)$ is a function of the lag $h$ only. Notice that equation (4) means that the expectations of the random variables, say $E[Z(s)] = \mu(s)$, must not vary in space. Otherwise, the cdfs would not be constant. Furthermore, equations (4) to (6) show that we may (at least hope to) infer the first two moments of the indicators when we have data from only one realisation of $\{Z(s)\}$. To estimate the expectations and (cross-)covariances of the indicators, we replace the averaging over multiple realisations by averaging over space. Spatial averaging, however, is inappropriate in the general case of non-stationary distributions, i.e., for models with moments given by equations (1) to (3).

In spite of the above, Goovaerts and Journel [8] proposed to extend the IK methodology to random processes with spatially varying $\mu(s)$. They called their method simple IK with varying local means. The terms simple IK with local prior means [7], soft IK [9] and IK with external drift [2] have since also been used to denote the approach. Apparently, the authors realized that the indicators have non-stationary (co-)variances if $\mu(s)$ varies spatially.
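To make the dependence on $s$ in equations (1) and (2) concrete, the following minimal sketch evaluates the indicator mean and variance for a Gaussian variate whose mean changes between subregions. The unit standard deviation, the cut-off of 0, the three mean levels and the function names are illustrative assumptions, not values taken from the paper.

```python
import math


def normal_cdf(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def indicator_moments(mu_s, cutoff=0.0, sd=1.0):
    """Return (E[I], Var[I]) for I = 1{Z <= cutoff}, with Z ~ N(mu_s, sd^2)."""
    f = normal_cdf((cutoff - mu_s) / sd)  # equation (1)
    return f, f * (1.0 - f)               # equation (2)


# Hypothetical piecewise constant mean with three levels (assumption).
for mu in (-1.0, 0.0, 1.0):
    mean_i, var_i = indicator_moments(mu)
    print(f"mu(s) = {mu:+.1f}: E[I] = {mean_i:.4f}, Var[I] = {var_i:.4f}")
```

Because both moments change with $\mu(s)$, averaging indicators over locations with different means mixes quantities that are not comparable, which is the point made above about spatial averaging.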
Given an estimate, $\hat F(s; z_c)$, of the cdf, they proposed to estimate the variogram of $I(s; z_c)$ by fitting model functions to the sample variogram

$$\hat\gamma_R(s_i; h_k; z_c, z_c) = \frac{1}{2N(h_k)} \sum_{i=1}^{N(h_k)} \{r(s_i; z_c) - r(s_i + h_k; z_c)\}^2 \tag{7}$$

of the indicator residuals $r(s; z_c) = i(s; z_c) - \hat F(s; z_c)$, where $i(s; z_c)$ is the indicator transform of a measurement and $N(h_k)$ is the number of data pairs in lag class $h_k$. Unfortunately, they failed to recognize that half the expected squared difference of the indicator residuals, i.e., their semivariance, is not independent of $s$, even if (unrealistically) the true cdf is assumed to be known, i.e., if $\hat F(s; z_c) = F(s; z_c)$:

$$\tfrac{1}{2} E[\{R(s; z_c) - R(s+h; z_c)\}^2] = \tfrac{1}{2} \mathrm{Var}[R(s; z_c) - R(s+h; z_c)] = \tfrac{1}{2} \bigl\{F(s; z_c)\,(1 - F(s; z_c)) + F(s+h; z_c)\,(1 - F(s+h; z_c))\bigr\} - \bigl\{F(s, s+h; z_c, z_c) - F(s; z_c)\,F(s+h; z_c)\bigr\}. \tag{8}$$

As above, $F(s; z_c)$ and $F(s, s+h; z_c, z_c)$ are functions of $s$ in the non-stationary case. Hence, the right-hand side of equation (8) still depends on $s$. Grouping the observed indicator residuals into lag classes and computing a sample variogram by the customary method-of-moments estimator is therefore meaningless in this instance.

Figure 1: Two realisations, shown in red and blue, of a Gaussian random process with a piecewise constant mean function and a cubic variogram with nugget (left panel) and the corresponding indicator transforms of the simulated data for the cut-off $z_c = 0$ (right panel) (solid lines: expectations of the random variables; dotted lines: cut-off [left] and variances of indicator random variables [right]). Panel titles: piecewise constant trend, nugget 0.1. Axes: attribute $Z(s)$ (left) and indicator $I(s; 0)$ (right) against location $s$.

The indicator transforms of $\{Z(s)\}$ with constant $\mu(s)$ but unbounded variogram have non-stationary covariances, too. To see this, we consider a Gaussian, zero-order intrinsic process $\{Z(s)\}$, $s \in \mathbb{R}$, with a linear variogram, $\gamma(h) = |h|$. Two increments, say $\Delta Z(s) = Z(s) - Z(0)$ and $\Delta Z(t) = Z(t) - Z(0)$ with $s, t > 0$, are then normally distributed with variances $\mathrm{Var}[\Delta Z(s)] = 2s$, $\mathrm{Var}[\Delta Z(t)] = 2t$ and correlation

$$\rho = \mathrm{Cor}[\Delta Z(s), \Delta Z(t)] = \frac{\min(s, t)}{\sqrt{s\,t}}. \tag{9}$$

Thus, their bivariate density function is equal to [1, p. 936]

$$g(z_s, z_t; s, t, \rho) = \frac{1}{4\pi \sqrt{s\,t\,(1 - \rho^2)}} \exp\left( -\frac{z_s^2/s - 2\rho\, z_s z_t/\sqrt{s\,t} + z_t^2/t}{4\,(1 - \rho^2)} \right). \tag{10}$$

The covariance of the indicator transforms of the increments is related to $g(z_s, z_t; s, t, \rho)$ by [4, p. 400]

$$C_I(s, t; z_c, z_{c'}) = \int_0^{\min(s,t)/\sqrt{s\,t}} g(z_c, z_{c'}; s, t, \rho)\, d\rho. \tag{11}$$

Clearly, $C_I(s, t; z_c, z_{c'})$ depends on $s$ and $t$ not only through the lag $h = s - t$, and the covariance is non-stationary.

3 SIMULATION STUDY

I used simulation to illustrate how large the bias between the non-stationary variograms of the indicators and an estimate based on equation (7) can be, and to demonstrate that the bias leads to a loss of efficiency in simple IK. To this end, I simulated $10^5$ realisations of a Gaussian random process at the locations $s_0 = 0, s_1 = 1, \ldots, s_{300} = 300$ on a line. The process had a piecewise constant mean function and a cubic variogram with range 66, unit total sill and nugget 0.1. Piecewise constant mean functions were used by Goovaerts and Journel [8], van Meirvenne and Goovaerts [11] and Brus et al. [3]. Figure 1 shows two realisations and the corresponding indicators for the cut-off $z_c = 0$. The right panel also shows the estimated expectations of the indicators,

$$\hat F(s_i; 0) = 10^{-5} \sum_{j=1}^{10^5} I(s_i; 0)_j,$$

and their variances. The subscript $j$ denotes here the $j$th realisation.

Figure 2: Non-stationary indicator semivariances, $\hat\gamma_I(s_i, s_i + h_k; 0, 0)$, for the simulations shown in Fig. 1. The left panel shows $\hat\gamma_I(\cdot)$ as a function of $s$ and $h$, and the right panel shows the variograms $\hat\gamma_I(s_i, s_i + h_k; 0, 0)$ for six locations $s_i$: 0, 40, ..., 200 as a function of $h$, together with the expectation, $E[\hat\gamma_R(s_0, s_1, \ldots; h_k)]$, of the estimator given in equation (7). Panel titles: piecewise constant trend, nugget 0.1. Axes: location $s$ against lag distance $h$ (left); semivariance $\gamma(s, h)$ against lag distance $h$ (right).

For each $s_i$: 0, 1, ..., 200 I estimated the non-stationary covariances of the indicators for the lag distances $h_k$: 0, 1, 2, ..., 100 by

$$\hat C_I(s_i, s_i + h_k; 0, 0) = 10^{-5} \sum_{j=1}^{10^5} R(s_i; 0)_j\, R(s_i + h_k; 0)_j,$$

where $R(s; 0)_j = I(s; 0)_j - \hat F(s; 0)$, and from those estimates I computed the non-stationary semivariances of the indicators by

$$\hat\gamma_I(s_i, s_i + h_k; 0, 0) = \tfrac{1}{2} \bigl\{\hat C_I(s_i, s_i; 0, 0) + \hat C_I(s_i + h_k, s_i + h_k; 0, 0)\bigr\} - \hat C_I(s_i, s_i + h_k; 0, 0).$$

These estimates were then compared with the estimated expectation of the sample variograms of the indicator residuals, computed for each realisation by equation (7):

$$E[\hat\gamma_R(s_0, s_1, \ldots; h_k)] = 10^{-5} \sum_{j=1}^{10^5} \frac{1}{2N(h_k)} \sum_{i=0}^{N(h_k)-1} \{R(s_i; 0)_j - R(s_i + h_k; 0)_j\}^2.$$
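The contrast between location-specific semivariances and the pooled estimator can also be sketched without Monte Carlo simulation. The example below evaluates equation (8) analytically for a unit-variance Gaussian process and compares it with the expectation of equation (7) when the true cdf is known, which is simply the average of the location-specific semivariances over all pairs at a given lag. The exponential correlation function, the absence of a nugget, the single jump in the mean, the 101 locations and all function names are assumptions made for brevity; they are not the cubic variogram or the mean function used in the paper. The indicator covariance is obtained by integrating the standard bivariate normal density over the correlation, the standardized counterpart of equation (11).

```python
import math


def normal_cdf(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def phi2(a, b, r):
    """Standard bivariate normal density at (a, b) with correlation r."""
    one_m = 1.0 - r * r
    q = (a * a - 2.0 * r * a * b + b * b) / one_m
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(one_m))


def indicator_cov(a, b, rho, n=400):
    """Cov[1{U <= a}, 1{V <= b}] for standard bivariate normal (U, V).

    Uses F(a, b; rho) - F(a) F(b) = integral of phi2 over r from 0 to rho,
    approximated by the midpoint rule (valid for |rho| < 1).
    """
    step = rho / n
    return step * sum(phi2(a, b, (k + 0.5) * step) for k in range(n))


def mean_fn(s):
    """Hypothetical piecewise constant mean with one jump at s = 50."""
    return 0.0 if s < 50 else 1.5


def semivariance(s, h, cutoff=0.0, corr_range=20.0):
    """Non-stationary indicator semivariance, equation (8), for h > 0."""
    a, b = cutoff - mean_fn(s), cutoff - mean_fn(s + h)
    fa, fb = normal_cdf(a), normal_cdf(b)
    rho = math.exp(-h / corr_range)  # assumed exponential correlation
    return 0.5 * (fa * (1 - fa) + fb * (1 - fb)) - indicator_cov(a, b, rho)


def pooled_semivariance(h, n_loc=101):
    """Expectation of estimator (7) with known cdf: average over all pairs."""
    pairs = range(n_loc - h)
    return sum(semivariance(s, h) for s in pairs) / len(pairs)


h = 10
for s in (0, 45, 80):
    print(f"s = {s:3d}: gamma_I(s, s + {h}) = {semivariance(s, h):.4f}")
print(f"pooled over s: {pooled_semivariance(h):.4f}")
```

In this sketch the pooled value is a single number per lag, whereas the location-specific semivariances differ between the two subregions and across the jump, so the pooled estimator cannot represent any of them exactly.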
The left panel of Figure 2 shows $\hat\gamma_I(s_i, s_i + h_k; 0, 0)$ as a function of $s_i$ and $h_k$. We see abrupt changes of the semivariance for a given $h_k$ along the ordinate from $s_0 = 0$ to $s_{200} = 200$. The right panel of the figure shows the change of the semivariance with $h_k$ for 6 selected locations. The semivariance does not increase monotonically with $h_k$: there are abrupt changes because of the non-constant variances of the indicators. If we ignore the non-stationary nature of the problem and use equation (7), then these jumps are lost.

Figure 3: Simple IK weights of 6 measurements at locations 50 (black), 70 (red), 90 (green), 110 (blue), 130 (cyan) and 150 (magenta) as a function of the position of the prediction point $s_0$. The solid dots are the optimal weights computed from the non-stationary semivariances ($\hat\gamma_I(s_i, s_i + h_k; 0, 0)$), the open squares are the weights computed from the expectation ($E[\hat\gamma_R(s_0, s_1, \ldots; h_k)]$) of the estimator given in equation (7). The solid line is the relative efficiency of siklm. Tick marks without labels show the boundaries of the subregions with constant means (cf. Fig. 1). Axes: simple kriging weights and relative efficiency against location of prediction point $s_0$.

The discrepancies between $E[\hat\gamma_R(s_0, s_1, \ldots; h_k)]$ and $\hat\gamma_I(s_i, s_i + h_k; 0, 0)$ may not seem very large. However, Figure 3 shows that they matter if we predict the indicators by simple kriging at the locations $s_0$: 50, 51, 52, ..., 150 from 6 measurements at $s_i$: 50, 70, 90, 110, 130, 150. Close to the boundaries of the subregions with constant mean we see abrupt changes in the optimal simple IK weights, which are lost when we compute them from $E[\hat\gamma_R(s_0, s_1, \ldots; h_k)]$. A loss of efficiency of up to 20% results when we use Goovaerts and Journel's suggestion to estimate the variogram. Thus, the example shows that kriging loses its mean square optimality if we ignore the non-stationary nature of the problem.
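The effect on the kriging weights can be illustrated with a small, self-contained sketch. It computes simple kriging weights for the indicator residuals once from the location-dependent covariances and once from a lag-only surrogate, and then evaluates both weight vectors under the location-dependent covariances. Everything specific here is an assumption for illustration: the exponential correlation, the two mean levels with a single jump at $s = 100$, the surrogate (an equal-weight average of the covariances over the four combinations of mean levels, used as a crude stand-in for the expectation of estimator (7)), and relative efficiency taken as the ratio of the optimal to the suboptimal mean square error. The numbers it prints are therefore not those of Figure 3.

```python
import math

LEVELS = (0.0, 1.5)  # hypothetical mean levels left and right of s = 100


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def indicator_cov(a, b, rho, n=200):
    """Cov[1{U <= a}, 1{V <= b}] for standard bivariate normal (U, V)."""
    if rho >= 1.0:  # same location: Prob[U <= min(a, b)] - F(a) F(b)
        return normal_cdf(min(a, b)) - normal_cdf(a) * normal_cdf(b)
    step, total = rho / n, 0.0
    for k in range(n):  # midpoint rule over the correlation r
        r = (k + 0.5) * step
        q = (a * a - 2.0 * r * a * b + b * b) / (1.0 - r * r)
        total += math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(1.0 - r * r))
    return step * total


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            for j in range(c, n + 1):
                M[i][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def mean_fn(s):
    return LEVELS[0] if s < 100 else LEVELS[1]


def rho(h, corr_range=30.0):
    return math.exp(-abs(h) / corr_range)  # assumed exponential correlation


def cov_true(s, t, cutoff=0.0):
    """Location-dependent indicator covariance, equation (3)."""
    return indicator_cov(cutoff - mean_fn(s), cutoff - mean_fn(t), rho(s - t))


def cov_pooled(s, t, cutoff=0.0):
    """Lag-only surrogate: average over the four combinations of mean levels."""
    return sum(indicator_cov(cutoff - m1, cutoff - m2, rho(s - t))
               for m1 in LEVELS for m2 in LEVELS) / 4.0


def sk_weights(cov, sites, s0):
    """Simple kriging weights: solve C w = c0 for the given covariance."""
    C = [[cov(s, t) for t in sites] for s in sites]
    return solve(C, [cov(s, s0) for s in sites])


def mse(w, sites, s0):
    """Mean square error of the weights under the location-dependent model."""
    lin = sum(wi * cov_true(s, s0) for wi, s in zip(w, sites))
    quad = sum(wi * wj * cov_true(s, t)
               for wi, s in zip(w, sites) for wj, t in zip(w, sites))
    return cov_true(s0, s0) - 2.0 * lin + quad


sites = [50, 70, 90, 110, 130, 150]
for s0 in (95, 105):
    w_opt = sk_weights(cov_true, sites, s0)
    w_sub = sk_weights(cov_pooled, sites, s0)
    eff = mse(w_opt, sites, s0) / mse(w_sub, sites, s0)
    print(f"s0 = {s0}: relative efficiency = {eff:.3f}")
```

Since the optimal weights minimise the mean square error under the location-dependent covariances, the ratio cannot exceed 1; how far it falls below 1 depends on the assumed mean levels and correlation range.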
We can then merely hope that kriging provides better predictions than other ad-hoc procedures such as inverse distance weighting of the indicators.

4 CONCLUSIONS

I conclude by stating that any attempt to use IK for data with an apparent trend, either explicitly (siklm) or implicitly by using ordinary IK within a local neighbourhood of support points, requires the modelling of non-stationary indicator variograms to preserve the mean square optimality of kriging. The same problem arises for random processes with constant means but unbounded variograms, although the loss of efficiency was smaller in further simulations that I ran but do not report here. As we cannot estimate non-stationary variograms from only one realization of $\{Z(s)\}$, IK is in practice limited to geostatistical analyses of data without an apparent trend and with a bounded variogram, i.e., to models with stationary bivariate distributions. This is a serious limitation because in many instances we have full-coverage ancillary information that could (and should!) be exploited when predicting $Z(s)$ or any non-linear transform thereof. Fortunately, there is life beyond IK: Diggle et al. [6] showed how to extend geostatistical methodology to non-normal response variates, and related approaches also exist for lattice models [5], so there is no harm in giving up the IK methodology altogether.

REFERENCES

[1] Abramowitz, M. and Stegun, I. A. (1965). Handbook of Mathematical Functions. Dover, New York.
[2] Bárdossy, A. and Lehmann, W. (1998). Spatial distribution of soil moisture in a small catchment. Part 1: Geostatistical analysis. Journal of Hydrology, 206, 1-15.
[3] Brus, D. J., de Gruijter, J. J., Walvoort, D. J. J., de Vries, F., Bronswijk, J. J. B., Römkens, P. F. A. M., and de Vries, W. (2002). Mapping the probability of exceeding critical thresholds for cadmium concentrations in soils in the Netherlands. Journal of Environmental Quality, 31.
[4] Chilès, J.-P. and Delfiner, P. (1999). Geostatistics: Modeling Spatial Uncertainty. John Wiley & Sons, New York.
[5] Cressie, N. A. C. (1993). Statistics for Spatial Data. John Wiley & Sons, New York, revised edition.
[6] Diggle, P. J., Tawn, J. A., and Moyeed, R. A. (1998). Model-based geostatistics (with discussions). Applied Statistics, 47(3).
[7] Goovaerts, P. (1997). Geostatistics for Natural Resources Evaluation. Oxford University Press, New York.
[8] Goovaerts, P. and Journel, A. G. (1995). Integrating soil map information in modelling the spatial variation of continuous soil properties. European Journal of Soil Science, 46.
[9] Grunwald, S., Goovaerts, P., Bliss, C. M., Comerford, N. B., and Lamsal, S. (2006). Incorporation of auxiliary information in the geostatistical simulation of soil nitrate nitrogen. Vadose Zone Journal, 5.
[10] Journel, A. G. and Posa, D. (1990). Characteristic behavior and order relations for indicator variograms. Mathematical Geology, 22(8).
[11] van Meirvenne, M. and Goovaerts, P. (2001). Evaluating the probability of exceeding a site-specific soil cadmium contamination threshold. Geoderma, 102.


More information

The VAR models discussed so fare are appropriate for modeling I(0) data, like asset returns or growth rates of macroeconomic time series.

The VAR models discussed so fare are appropriate for modeling I(0) data, like asset returns or growth rates of macroeconomic time series. Cointegration The VAR models discussed so fare are appropriate for modeling I(0) data, like asset returns or growth rates of macroeconomic time series. Economic theory, however, often implies equilibrium

More information

Measurement with Ratios

Measurement with Ratios Grade 6 Mathematics, Quarter 2, Unit 2.1 Measurement with Ratios Overview Number of instructional days: 15 (1 day = 45 minutes) Content to be learned Use ratio reasoning to solve real-world and mathematical

More information

11. Time series and dynamic linear models

11. Time series and dynamic linear models 11. Time series and dynamic linear models Objective To introduce the Bayesian approach to the modeling and forecasting of time series. Recommended reading West, M. and Harrison, J. (1997). models, (2 nd

More information

Solving Linear Programs

Solving Linear Programs Solving Linear Programs 2 In this chapter, we present a systematic procedure for solving linear programs. This procedure, called the simplex method, proceeds by moving from one feasible solution to another,

More information

Using rainfall radar data to improve interpolated maps of dose rate in the Netherlands

Using rainfall radar data to improve interpolated maps of dose rate in the Netherlands Using rainfall radar data to improve interpolated maps of dose rate in the Netherlands Paul H. Hiemstra a,, Edzer J. Pebesma b, Gerard B.M. Heuvelink c, Chris J.W. Twenhöfel d a University of Utrecht,

More information

Handling attrition and non-response in longitudinal data

Handling attrition and non-response in longitudinal data Longitudinal and Life Course Studies 2009 Volume 1 Issue 1 Pp 63-72 Handling attrition and non-response in longitudinal data Harvey Goldstein University of Bristol Correspondence. Professor H. Goldstein

More information

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in

More information

SYSTEMS OF REGRESSION EQUATIONS

SYSTEMS OF REGRESSION EQUATIONS SYSTEMS OF REGRESSION EQUATIONS 1. MULTIPLE EQUATIONS y nt = x nt n + u nt, n = 1,...,N, t = 1,...,T, x nt is 1 k, and n is k 1. This is a version of the standard regression model where the observations

More information

MULTIVARIATE PROBABILITY DISTRIBUTIONS

MULTIVARIATE PROBABILITY DISTRIBUTIONS MULTIVARIATE PROBABILITY DISTRIBUTIONS. PRELIMINARIES.. Example. Consider an experiment that consists of tossing a die and a coin at the same time. We can consider a number of random variables defined

More information

Algebra 1 2008. Academic Content Standards Grade Eight and Grade Nine Ohio. Grade Eight. Number, Number Sense and Operations Standard

Algebra 1 2008. Academic Content Standards Grade Eight and Grade Nine Ohio. Grade Eight. Number, Number Sense and Operations Standard Academic Content Standards Grade Eight and Grade Nine Ohio Algebra 1 2008 Grade Eight STANDARDS Number, Number Sense and Operations Standard Number and Number Systems 1. Use scientific notation to express

More information

Green = 0,255,0 (Target Color for E.L. Gray Construction) CIELAB RGB Simulation Result for E.L. Gray Match (43,215,35) Equal Luminance Gray for Green

Green = 0,255,0 (Target Color for E.L. Gray Construction) CIELAB RGB Simulation Result for E.L. Gray Match (43,215,35) Equal Luminance Gray for Green Red = 255,0,0 (Target Color for E.L. Gray Construction) CIELAB RGB Simulation Result for E.L. Gray Match (184,27,26) Equal Luminance Gray for Red = 255,0,0 (147,147,147) Mean of Observer Matches to Red=255

More information

University of Ljubljana Doctoral Programme in Statistics Methodology of Statistical Research Written examination February 14 th, 2014.

University of Ljubljana Doctoral Programme in Statistics Methodology of Statistical Research Written examination February 14 th, 2014. University of Ljubljana Doctoral Programme in Statistics ethodology of Statistical Research Written examination February 14 th, 2014 Name and surname: ID number: Instructions Read carefully the wording

More information

Exact Nonparametric Tests for Comparing Means - A Personal Summary

Exact Nonparametric Tests for Comparing Means - A Personal Summary Exact Nonparametric Tests for Comparing Means - A Personal Summary Karl H. Schlag European University Institute 1 December 14, 2006 1 Economics Department, European University Institute. Via della Piazzuola

More information

Machine Learning for Medical Image Analysis. A. Criminisi & the InnerEye team @ MSRC

Machine Learning for Medical Image Analysis. A. Criminisi & the InnerEye team @ MSRC Machine Learning for Medical Image Analysis A. Criminisi & the InnerEye team @ MSRC Medical image analysis the goal Automatic, semantic analysis and quantification of what observed in medical scans Brain

More information

Integration of Geological, Geophysical, and Historical Production Data in Geostatistical Reservoir Modelling

Integration of Geological, Geophysical, and Historical Production Data in Geostatistical Reservoir Modelling Integration of Geological, Geophysical, and Historical Production Data in Geostatistical Reservoir Modelling Clayton V. Deutsch (The University of Alberta) Department of Civil & Environmental Engineering

More information

INTEREST RATE DERIVATIVES IN THE SOUTH AFRICAN MARKET BASED ON THE PRIME RATE

INTEREST RATE DERIVATIVES IN THE SOUTH AFRICAN MARKET BASED ON THE PRIME RATE INTEREST RATE DERIVATIVES IN THE SOUTH AFRICAN MARKET BASED ON THE PRIME RATE G West * D Abstract erivatives linked to the prime rate of interest have become quite relevant with the introduction to the

More information

Module 5: Multiple Regression Analysis

Module 5: Multiple Regression Analysis Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

More information

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

More information

UNIVERSITY OF WAIKATO. Hamilton New Zealand

UNIVERSITY OF WAIKATO. Hamilton New Zealand UNIVERSITY OF WAIKATO Hamilton New Zealand Can We Trust Cluster-Corrected Standard Errors? An Application of Spatial Autocorrelation with Exact Locations Known John Gibson University of Waikato Bonggeun

More information

Basics of Floating-Point Quantization

Basics of Floating-Point Quantization Chapter 2 Basics of Floating-Point Quantization Representation of physical quantities in terms of floating-point numbers allows one to cover a very wide dynamic range with a relatively small number of

More information

Figure 1. Diode circuit model

Figure 1. Diode circuit model Semiconductor Devices Non-linear Devices Diodes Introduction. The diode is two terminal non linear device whose I-V characteristic besides exhibiting non-linear behavior is also polarity dependent. The

More information

Randomization Based Confidence Intervals For Cross Over and Replicate Designs and for the Analysis of Covariance

Randomization Based Confidence Intervals For Cross Over and Replicate Designs and for the Analysis of Covariance Randomization Based Confidence Intervals For Cross Over and Replicate Designs and for the Analysis of Covariance Winston Richards Schering-Plough Research Institute JSM, Aug, 2002 Abstract Randomization

More information

Physics Lab Report Guidelines

Physics Lab Report Guidelines Physics Lab Report Guidelines Summary The following is an outline of the requirements for a physics lab report. A. Experimental Description 1. Provide a statement of the physical theory or principle observed

More information

MATH2740: Environmental Statistics

MATH2740: Environmental Statistics MATH2740: Environmental Statistics Lecture 6: Distance Methods I February 10, 2016 Table of contents 1 Introduction Problem with quadrat data Distance methods 2 Point-object distances Poisson process case

More information

Spring Force Constant Determination as a Learning Tool for Graphing and Modeling

Spring Force Constant Determination as a Learning Tool for Graphing and Modeling NCSU PHYSICS 205 SECTION 11 LAB II 9 FEBRUARY 2002 Spring Force Constant Determination as a Learning Tool for Graphing and Modeling Newton, I. 1*, Galilei, G. 1, & Einstein, A. 1 (1. PY205_011 Group 4C;

More information

Autocovariance and Autocorrelation

Autocovariance and Autocorrelation Chapter 3 Autocovariance and Autocorrelation If the {X n } process is weakly stationary, the covariance of X n and X n+k depends only on the lag k. This leads to the following definition of the autocovariance

More information

Validating Market Risk Models: A Practical Approach

Validating Market Risk Models: A Practical Approach Validating Market Risk Models: A Practical Approach Doug Gardner Wells Fargo November 2010 The views expressed in this presentation are those of the author and do not necessarily reflect the position of

More information

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

More information

Trend and Seasonal Components

Trend and Seasonal Components Chapter 2 Trend and Seasonal Components If the plot of a TS reveals an increase of the seasonal and noise fluctuations with the level of the process then some transformation may be necessary before doing

More information

Credit Risk Models: An Overview

Credit Risk Models: An Overview Credit Risk Models: An Overview Paul Embrechts, Rüdiger Frey, Alexander McNeil ETH Zürich c 2003 (Embrechts, Frey, McNeil) A. Multivariate Models for Portfolio Credit Risk 1. Modelling Dependent Defaults:

More information

SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12

SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12 This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Decomposition of Event Sequences into Independent Components

Decomposition of Event Sequences into Independent Components Decomposition of Event Sequences into Independent Components Heikki Mannila and Dmitry Rusakov 1 Introduction Many real-world processes result in an extensive logs of sequences of events, i.e., events

More information

Confidence Intervals for Exponential Reliability

Confidence Intervals for Exponential Reliability Chapter 408 Confidence Intervals for Exponential Reliability Introduction This routine calculates the number of events needed to obtain a specified width of a confidence interval for the reliability (proportion

More information

3.1. Solving linear equations. Introduction. Prerequisites. Learning Outcomes. Learning Style

3.1. Solving linear equations. Introduction. Prerequisites. Learning Outcomes. Learning Style Solving linear equations 3.1 Introduction Many problems in engineering reduce to the solution of an equation or a set of equations. An equation is a type of mathematical expression which contains one or

More information

Discussion. Seppo Laaksonen 1. 1. Introduction

Discussion. Seppo Laaksonen 1. 1. Introduction Journal of Official Statistics, Vol. 23, No. 4, 2007, pp. 467 475 Discussion Seppo Laaksonen 1 1. Introduction Bjørnstad s article is a welcome contribution to the discussion on multiple imputation (MI)

More information

CAPM, Arbitrage, and Linear Factor Models

CAPM, Arbitrage, and Linear Factor Models CAPM, Arbitrage, and Linear Factor Models CAPM, Arbitrage, Linear Factor Models 1/ 41 Introduction We now assume all investors actually choose mean-variance e cient portfolios. By equating these investors

More information

INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition)

INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition) INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition) Abstract Indirect inference is a simulation-based method for estimating the parameters of economic models. Its

More information

Standard errors of marginal effects in the heteroskedastic probit model

Standard errors of marginal effects in the heteroskedastic probit model Standard errors of marginal effects in the heteroskedastic probit model Thomas Cornelißen Discussion Paper No. 320 August 2005 ISSN: 0949 9962 Abstract In non-linear regression models, such as the heteroskedastic

More information

Econometrics Simple Linear Regression

Econometrics Simple Linear Regression Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight

More information

( ) = 1 x. ! 2x = 2. The region where that joint density is positive is indicated with dotted lines in the graph below. y = x

( ) = 1 x. ! 2x = 2. The region where that joint density is positive is indicated with dotted lines in the graph below. y = x Errata for the ASM Study Manual for Exam P, Eleventh Edition By Dr. Krzysztof M. Ostaszewski, FSA, CERA, FSAS, CFA, MAAA Web site: http://www.krzysio.net E-mail: krzysio@krzysio.net Posted September 21,

More information

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives

More information

Hedge Effectiveness Testing

Hedge Effectiveness Testing Hedge Effectiveness Testing Using Regression Analysis Ira G. Kawaller, Ph.D. Kawaller & Company, LLC Reva B. Steinberg BDO Seidman LLP When companies use derivative instruments to hedge economic exposures,

More information

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( ) Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

More information

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1) CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

More information