Limitations of Indicator Kriging for Predicting Data with Trend
Andreas Papritz
ETH Zurich, Department of Environmental Sciences, Zurich, Switzerland

Abstract. Goovaerts and Journel [8] proposed simple indicator kriging with varying local means (siklm) as a way to extend the indicator kriging methodology to variates with an apparent spatial trend. However, contrary to what the authors imply, the detrended indicators, i.e., the indicator residuals, are not stationary, and their covariance structure cannot be estimated without bias from a single realization of a random process. Ignoring the non-stationary nature of the covariance of the indicators ruins the usual mean square optimality of kriging. Therefore, siklm is an ad-hoc procedure that lacks optimality, and its use should be discouraged.

1 INTRODUCTION

According to ISI Web of Science®, about 20 journal articles and 0 contributions to conference proceedings have been published about indicator kriging (IK for short) to date. Many of these studies deal with mapping the probability that a spatial variable exceeds a threshold [e.g., 3, 11]. This is an important problem in environmental surveillance and monitoring. Some studies apply IK to data with an apparent trend, following advice by Goovaerts and Journel [8] and Goovaerts [7], sec.

Unfortunately, Goovaerts' simple IK with varying local means (siklm for short) is not feasible in practice, as it requires the modelling of non-stationary covariances. By "in practice" I mean the case where we consider our measurements as a sample from a single realisation of a random process. The same problem arises if IK is used for data that show unbounded variograms. To substantiate my contention, I highlight and discuss here the limitations of siklm that arise from basic probability theory. Notwithstanding their elementary nature, these limitations seem to have been frequently ignored.
I further demonstrate by simulation that siklm lacks the usual mean square optimality of kriging, which leads me to discourage its use.

2 COVARIANCES OF INDICATOR TRANSFORMS OF NON-STATIONARY VARIATES

Let $Z(s)$ denote a real-valued random variable used for modelling an attribute $z$ measured at location $s$, and let $I(s; z_c)$ for a specific cut-off $z_c$ be the indicator transform, $I(s; z_c) = 1$ if $Z(s) \le z_c$ and $I(s; z_c) = 0$ otherwise. Then

$$E[I(s; z_c)] = \mathrm{Prob}[Z(s) \le z_c] = F(s; z_c), \tag{1}$$

$$\mathrm{Var}[I(s; z_c)] = F(s; z_c)\,(1 - F(s; z_c)), \tag{2}$$

where $F(s; z)$ is the cumulative distribution function (cdf) of $Z(s)$, $E[.]$ and $\mathrm{Var}[.]$ are the expectation and variance operators, and $\mathrm{Prob}[A]$ denotes the probability of event $A$. Further, let $\mathrm{Cov}[.]$ and $\mathrm{Cor}[.]$ denote the covariance and correlation operators. The (cross-)covariance function of the indicators for two cut-offs $z_c$ and $z_{c'}$, $C_I(s, s+h; z_c, z_{c'}) = \mathrm{Cov}[I(s; z_c), I(s+h; z_{c'})]$, is related to the bivariate cumulative distribution function, $F(s, s+h; z_c, z_{c'}) = \mathrm{Prob}[Z(s) \le z_c, Z(s+h) \le z_{c'}]$, of $Z(s)$ and $Z(s+h)$ by [e.g., 10]

$$C_I(s, s+h; z_c, z_{c'}) = F(s, s+h; z_c, z_{c'}) - F(s; z_c)\,F(s+h; z_{c'}). \tag{3}$$

For a random process with stationary bivariate distributions, equations (1)-(3) simplify to

$$E[I(s; z_c)] = F(z_c), \tag{4}$$

$$\mathrm{Var}[I(s; z_c)] = F(z_c)\,(1 - F(z_c)), \tag{5}$$

$$C_I(h; z_c, z_{c'}) = F(h; z_c, z_{c'}) - F(z_c)\,F(z_{c'}). \tag{6}$$

Clearly, the right-hand sides of (4)-(6) do not depend on $s$, and $C_I(.)$ is a function of the lag $h$ only. Notice that equation (4) means that the expectations of the random variables, say $E[Z(s)] = \mu(s)$, must not vary in space. Otherwise, the cdfs would not be constant. Furthermore, equations (4)-(6) show that we may (at least hope to) infer the first two moments of the indicators when we have data from only one realisation of $\{Z(s)\}$. To estimate the expectations and (cross-)covariances of the indicators, we replace the averaging over multiple realisations by averaging over space. Spatial averaging, however, is inappropriate in the general case of non-stationary distributions, i.e., for models with moments given by equations (1)-(3).

In spite of the above, Goovaerts and Journel [8] proposed to extend the IK methodology to random processes with spatially varying $\mu(s)$. They called their method simple IK with varying local means. The terms simple IK with local prior means [7], soft IK [9] and IK with external drift [2] have since also been used to denote the approach. Apparently, the authors realized that the indicators have non-stationary (co-)variances if $\mu(s)$ varies spatially.
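The location dependence of the indicator moments in equations (1)-(3) can be made concrete with a small numerical sketch. It is not part of the original study. It assumes a Gaussian $Z(s)$ with unit variance, a hypothetical correlation `rho` between $Z(s)$ and $Z(s+h)$, and hypothetical local means 0 and 1.5. The indicator covariance of equation (3) is evaluated through the standard identity that the derivative of the bivariate normal cdf with respect to the correlation equals the bivariate normal density. All function names are illustrative.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def indicator_mean_var(mu, cutoff, sd=1.0):
    """Equations (1) and (2) for Gaussian Z(s) with mean mu and std. dev. sd."""
    F = STD_NORMAL.cdf((cutoff - mu) / sd)
    return F, F * (1.0 - F)

def indicator_cov(mu_s, mu_sh, rho, cutoff, sd=1.0, n=2000):
    """Equation (3) for a Gaussian pair with correlation rho.

    Uses F(a, b; rho) - F(a) F(b) = integral from 0 to rho of the standard
    bivariate normal density, evaluated with the midpoint rule.
    """
    a = (cutoff - mu_s) / sd
    b = (cutoff - mu_sh) / sd
    step = rho / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * step
        q = 1.0 - r * r
        dens = math.exp(-(a * a - 2.0 * r * a * b + b * b) / (2.0 * q))
        total += dens / (2.0 * math.pi * math.sqrt(q))
    return total * step

def indicator_semivariance(mu_s, mu_sh, rho, cutoff):
    """Half the variance of I(s) - I(s+h) for the same cut-off."""
    _, var_s = indicator_mean_var(mu_s, cutoff)
    _, var_sh = indicator_mean_var(mu_sh, cutoff)
    return 0.5 * (var_s + var_sh) - indicator_cov(mu_s, mu_sh, rho, cutoff)

# Same lag (same rho), different local means (hypothetical values).
rho = 0.5
g_same_mean0 = indicator_semivariance(0.0, 0.0, rho, cutoff=0.0)
g_same_mean15 = indicator_semivariance(1.5, 1.5, rho, cutoff=0.0)
g_across = indicator_semivariance(0.0, 1.5, rho, cutoff=0.0)
print(g_same_mean0, g_same_mean15, g_across)
```

For one and the same lag, the three semivariances differ because the local means differ, which is exactly the dependence on $s$ that equations (1)-(3) express.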
Given an estimate, $\hat F(s; z_c)$, of the cdf, they proposed to estimate the variogram of $I(s; z_c)$ by fitting model functions to the sample variogram,

$$\hat\gamma_R(s_i; h_k; z_c, z_c) = \frac{1}{2N(h_k)} \sum_{i=1}^{N(h_k)} \{r(s_i; z_c) - r(s_i + h_k; z_c)\}^2, \tag{7}$$

of the indicator residuals $r(s; z_c) = i(s; z_c) - \hat F(s; z_c)$, where $i(s; z_c)$ is the indicator transform of a measurement and $N(h_k)$ is the number of data pairs in lag class $h_k$. Unfortunately, they failed to recognize that half the expected squared difference of the indicator residuals, i.e., their semivariance, is not independent of $s$, even if (unrealistically) the true cdf is assumed to be known, i.e., if $\hat F(s; z_c) = F(s; z_c)$:

$$\tfrac{1}{2} E[\{R(s; z_c) - R(s+h; z_c)\}^2] = \tfrac{1}{2} \mathrm{Var}[R(s; z_c) - R(s+h; z_c)] = \tfrac{1}{2}\{F(s; z_c)(1 - F(s; z_c)) + F(s+h; z_c)(1 - F(s+h; z_c))\} - \{F(s, s+h; z_c, z_c) - F(s; z_c)\,F(s+h; z_c)\}. \tag{8}$$

As above, $F(s; z_c)$ and $F(s, s+h; z_c, z_c)$ are functions of $s$ in the non-stationary case. Hence, the right-hand side of equation (8) still depends on $s$. Grouping the observed indicator residuals into lag classes and computing a sample variogram by the customary method-of-moments estimator is therefore meaningless in this instance.

[Figure 1: Two realisations, shown in red and blue, of a Gaussian random process with a piecewise constant mean function and a cubic variogram with nugget (left panel) and the corresponding indicator transforms of the simulated data for the cut-off $z_c = 0$ (right panel). Solid lines: expectations of the random variables; dotted lines: cut-off (left) and variances of the indicator random variables (right). Panel titles: piecewise constant trend, nugget 0.1. Axes: location $s$ against attribute $Z(s)$ (left) and indicator $I(s; 0)$ (right).]

The indicator transforms of $\{Z(s)\}$ with constant $\mu(s)$ but unbounded variogram have non-stationary covariances, too. To see this, consider a Gaussian, zero-order intrinsic $\{Z(s)\}$, $s \in \mathbb{R}$, with a linear variogram, $\gamma(h) = h$. Two increments, say $\Delta Z(s) = Z(s) - Z(0)$ and $\Delta Z(t) = Z(t) - Z(0)$, are then normally distributed with variances $\mathrm{Var}[\Delta Z(s)] = 2s$, $\mathrm{Var}[\Delta Z(t)] = 2t$ and correlation

$$\rho = \mathrm{Cor}[\Delta Z(s), \Delta Z(t)] = \frac{\min(s, t)}{\sqrt{s\,t}}. \tag{9}$$

Thus, their bivariate density function is equal to [1, p. 936]

$$g(z_s, z_t; s, t, \rho) = \frac{1}{4\pi\sqrt{s\,t\,(1 - \rho^2)}} \exp\left(-\frac{z_s^2/s - 2\rho\, z_s z_t/\sqrt{s\,t} + z_t^2/t}{4(1 - \rho^2)}\right). \tag{10}$$

The covariance of the indicator transforms of the increments is related to $g(z_s, z_t; s, t, \rho)$ by [4, p. 400]

$$C_I(s, t; z_c, z_{c'}) = \int_0^{\min(s,t)/\sqrt{s\,t}} g(z_c, z_{c'}; s, t, \rho)\, d\rho. \tag{11}$$

Clearly, $C_I(s, t; z_c, z_{c'})$ depends on $s$ and $t$ not only through the lag $h = s - t$, and the covariance is non-stationary.

3 SIMULATION STUDY

I used simulation to illustrate how large the bias between the non-stationary variograms of the indicators and an estimate based on equation (7) can be, and to demonstrate that the bias leads to a loss of efficiency in simple IK. To this end, I simulated $10^5$ realisations of a Gaussian random process at the locations $s_0 = 0, s_1 = 1, \ldots, s_{300} = 300$ on a line. The process had a piecewise constant mean function and a cubic variogram with range 66, unit total sill and nugget 0.1. Piecewise constant mean functions were used by Goovaerts and Journel [8], van Meirvenne and Goovaerts [11] and Brus et al. [3].

[Figure 2: Non-stationary indicator semivariances, $\hat\gamma_I(s_i, s_i + h_k; 0, 0)$, for the simulations shown in Fig. 1. The left panel shows $\hat\gamma_I(.)$ as a function of $s$ and $h$, and the right panel shows the variograms $\hat\gamma_I(s_i, s_i + h_k; 0, 0)$ for six locations $s_i$: 0, 40, ..., 200 as a function of $h$, together with the expectation, $\widehat{E}[\hat\gamma_R(s_0, s_1, \ldots; h_k)]$, of the estimator given in equation (7). Axes: location $s$, lag distance $h$, semivariance $\gamma(s, h)$.]

Figure 1 shows two realisations and the corresponding indicators for the cut-off $z_c = 0$. The right panel also shows the estimated expectations of the indicators,

$$\hat F(s_i; 0) = 10^{-5} \sum_{j=1}^{10^5} I(s_i; 0)_j,$$

and their variances. The subscript $j$ denotes here the $j$th realisation. For each $s_i$: 0, 1, ..., 200, I estimated the non-stationary covariances of the indicators for the lag distances $h_k$: 0, 1, 2, ..., 100 by

$$\hat C_I(s_i, s_i + h_k; 0, 0) = 10^{-5} \sum_{j=1}^{10^5} R(s_i; 0)_j\, R(s_i + h_k; 0)_j,$$

where $R(s; 0)_j = I(s; 0)_j - \hat F(s; 0)$, and from those estimates I computed the non-stationary semivariances of the indicators by

$$\hat\gamma_I(s_i, s_i + h_k; 0, 0) = \tfrac{1}{2}\{\hat C_I(s_i, s_i; 0, 0) + \hat C_I(s_i + h_k, s_i + h_k; 0, 0)\} - \hat C_I(s_i, s_i + h_k; 0, 0).$$

These estimates were then compared with the estimated expectation of the sample variograms of the indicator residuals, computed for each realisation by equation (7),

$$\widehat{E}[\hat\gamma_R(s_0, s_1, \ldots; h_k)] = 10^{-5} \sum_{j=1}^{10^5} \frac{1}{2N(h_k)} \sum_{i=0}^{N(h_k)-1} \{R(s_i; 0)_j - R(s_i + h_k; 0)_j\}^2.$$
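The estimators above can be sketched in a few lines of dependency-free Python. This is a scaled-down illustration under stated assumptions, not a reproduction of the study: it uses 3000 realisations on 81 locations instead of $10^5$ on 301, a hypothetical mean function (1.5 on one segment, 0 elsewhere), and an exponential correlation generated by an AR(1) recursion in place of the cubic variogram, with a nugget of 0.1 and unit total sill. All names are illustrative.

```python
import math
import random

random.seed(1)

N_LOC, N_REAL, CUTOFF = 81, 3000, 0.0
AR_COEF = math.exp(-1.0 / 20.0)  # assumed lag-one correlation of the signal

def mu(s):
    """Hypothetical piecewise constant mean function."""
    return 1.5 if 30 <= s < 50 else 0.0

def simulate():
    """One realisation: AR(1) signal (variance 0.9) plus nugget (variance 0.1)."""
    x = random.gauss(0.0, 1.0)
    z = []
    for s in range(N_LOC):
        if s > 0:
            x = AR_COEF * x + math.sqrt(1.0 - AR_COEF ** 2) * random.gauss(0.0, 1.0)
        z.append(mu(s) + math.sqrt(0.9) * x + math.sqrt(0.1) * random.gauss(0.0, 1.0))
    return z

# Indicators, their estimated expectations F-hat, and the residuals R(s; 0)_j.
ind = [[1.0 if v <= CUTOFF else 0.0 for v in simulate()] for _ in range(N_REAL)]
f_hat = [sum(row[i] for row in ind) / N_REAL for i in range(N_LOC)]
res = [[row[i] - f_hat[i] for i in range(N_LOC)] for row in ind]

def c_hat(i, k):
    """Non-stationary covariance estimate, averaged over realisations."""
    return sum(r[i] * r[k] for r in res) / N_REAL

def gamma_ns(i, h):
    """Non-stationary semivariance between locations i and i + h."""
    return 0.5 * (c_hat(i, i) + c_hat(i + h, i + h)) - c_hat(i, i + h)

def gamma_pooled(h, n_s=61):
    """Estimated expectation of the pooled sample variogram, equation (7)."""
    total = sum((r[i] - r[i + h]) ** 2 for r in res for i in range(n_s))
    return total / (2.0 * n_s * N_REAL)

h = 10
g_region0 = gamma_ns(5, h)    # both locations where the mean is 0
g_region1 = gamma_ns(35, h)   # both locations where the mean is 1.5
g_pool = gamma_pooled(h)      # spatial average over all pairs at lag h
print(g_region0, g_region1, g_pool)
```

Because the residuals are centred with the same $\hat F$, the pooled value is the spatial average of the non-stationary semivariances at that lag, so it cannot reproduce their variation with location.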
The left panel of Figure 2 shows $\hat\gamma_I(s_i, s_i + h_k; 0, 0)$ as a function of $s_i$ and $h_k$. We see abrupt changes of the semivariance for a given $h_k$ along the ordinate from $s_0 = 0$ to $s_{200} = 200$. The right panel of the figure shows the change of the semivariance with $h_k$ for 6 selected locations. The semivariance does not increase monotonically with $h_k$: there are abrupt changes because of the non-constant variances of the indicators. If we ignore the non-stationary nature of the problem and use equation (7), then these jumps are lost.

[Figure 3: Simple IK weights of 6 measurements at locations 50 (black), 70 (red), 90 (green), 110 (blue), 130 (cyan) and 150 (magenta) as a function of the position of the prediction point $s_0$. The solid dots are the optimal weights computed from the non-stationary semivariances ($\hat\gamma_I(s_i, s_i + h_k; 0, 0)$), the open squares are the weights computed from the expectation ($\widehat{E}[\hat\gamma_R(s_0, s_1, \ldots; h_k)]$) of the estimator given in equation (7). The solid line is the relative efficiency of siklm. Tickmarks without labels show the boundaries of the subregions with constant means (cf. Fig. 1). Axes: location of prediction point $s_0$ against simple kriging weights and relative efficiency.]

The discrepancies between $\widehat{E}[\hat\gamma_R(s_0, s_1, \ldots; h_k)]$ and $\hat\gamma_I(s_i, s_i + h_k; 0, 0)$ may not seem very significant. However, Figure 3 shows that they matter if we predict the indicators by simple kriging at the locations $s_0$: 50, 51, 52, ..., 150 from 6 measurements at $s_i$: 50, 70, 90, 110, 130, 150. Close to the boundaries of the subregions with constant mean we see abrupt changes in the optimal simple IK weights, which are lost when we compute them from $\widehat{E}[\hat\gamma_R(s_0, s_1, \ldots; h_k)]$. A loss of efficiency of up to 20% results when we use Goovaerts and Journel's suggestion to estimate the variogram. Thus, the example shows that kriging loses its mean square optimality if we ignore the non-stationary nature of the problem.
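The loss of optimality can be reproduced in miniature. The sketch below is an illustration under assumptions that are not taken from the study: a unit-variance Gaussian $Z(s)$ with a hypothetical two-level mean, an exponential correlation with nugget 0.1 in place of the cubic model, five hypothetical data locations, and a stationary covariance obtained by averaging the true indicator covariances over space as a stand-in for a model fitted via equation (7). Relative efficiency is taken here to be the ratio of the optimal mean square error to the mean square error of the weights from the stationary model.

```python
import math
from functools import lru_cache
from statistics import NormalDist

CUTOFF = 0.0

def mu(s):
    """Hypothetical piecewise constant mean of Z(s)."""
    return 0.0 if s < 50 else 1.5

def rho(h):
    """Assumed correlation of Z: nugget 0.1 plus exponential structure."""
    return 1.0 if h == 0 else 0.9 * math.exp(-h / 20.0)

@lru_cache(maxsize=None)
def cov_pair(mu_s, mu_t, h, n=400):
    """Indicator covariance for Gaussian Z (midpoint rule over the correlation)."""
    a, b = CUTOFF - mu_s, CUTOFF - mu_t
    if h == 0:
        F = NormalDist().cdf(a)
        return F * (1.0 - F)
    step, total = rho(h) / n, 0.0
    for i in range(n):
        r = (i + 0.5) * step
        q = 1.0 - r * r
        total += math.exp(-(a * a - 2 * r * a * b + b * b) / (2 * q)) / (2 * math.pi * math.sqrt(q))
    return total * step

def cov_true(s, t):
    """Non-stationary indicator covariance."""
    return cov_pair(mu(s), mu(t), abs(s - t))

def cov_pooled(s, t, grid=range(0, 101)):
    """Stationary stand-in: spatial average of the true covariances at lag |s - t|."""
    h = abs(s - t)
    vals = [cov_true(u, u + h) for u in grid if u + h <= grid[-1]]
    return sum(vals) / len(vals)

def solve(a, b):
    """Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def sk_weights(cov, data, s0):
    """Simple kriging weights for the residual at s0 under covariance cov."""
    return solve([[cov(s, t) for t in data] for s in data], [cov(s, s0) for s in data])

def mse(w, data, s0):
    """Mean square error of the weights w, evaluated under the true covariance."""
    cross = sum(wi * cov_true(s, s0) for wi, s in zip(w, data))
    quad = sum(wi * wj * cov_true(s, t) for wi, s in zip(w, data) for wj, t in zip(w, data))
    return cov_true(s0, s0) - 2.0 * cross + quad

data, s0 = [10, 30, 50, 70, 90], 48  # hypothetical layout near the mean boundary
w_opt = sk_weights(cov_true, data, s0)
w_pool = sk_weights(cov_pooled, data, s0)
rel_eff = mse(w_opt, data, s0) / mse(w_pool, data, s0)
print(w_opt, w_pool, rel_eff)
```

Since the weights from the stationary model are evaluated under the true covariance, the relative efficiency cannot exceed one, and any shortfall measures what is lost by ignoring the non-stationarity.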
When the non-stationarity is ignored, we can merely hope that kriging provides better predictions than other ad-hoc procedures such as inverse distance weighting of the indicators.

4 CONCLUSIONS

I conclude that any attempt to use IK for data with an apparent trend, either explicitly (siklm) or implicitly by using ordinary IK within a local neighbourhood of support points, requires the modelling of non-stationary indicator variograms to preserve the mean square optimality of kriging. The same problem arises for random processes with constant means but unbounded variograms, although the loss of efficiency of siklm was smaller in further simulations that I ran but do not report here. As we cannot estimate non-stationary variograms from only one realization of $\{Z(s)\}$, IK is in practice limited to geostatistical analyses of data without an apparent trend and with a bounded variogram, i.e., to models with stationary bivariate distributions. This is a serious limitation because in many instances we have full-coverage ancillary information that could (and should!) be exploited when predicting $Z(s)$ or any non-linear transform thereof. Fortunately, there is life beyond IK: Diggle et al. [6] showed how to extend geostatistical methodology to non-normal response variates, and related approaches also exist for lattice models [5, chap ], so there is no harm in giving up the IK methodology altogether.

REFERENCES

[1] Abramowitz, M. and Stegun, I. A. (1965). Handbook of Mathematical Functions. Dover, New York.
[2] Bárdossy, A. and Lehmann, W. (1998). Spatial distribution of soil moisture in a small catchment. Part 1: Geostatistical analysis. Journal of Hydrology, 206, 1-15.
[3] Brus, D. J., de Gruijter, J. J., Walvoort, D. J. J., de Vries, F., Bronswijk, J. J. B., Römkens, P. F. A. M., and de Vries, W. (2002). Mapping the probability of exceeding critical thresholds for cadmium concentrations in soils in the Netherlands. Journal of Environmental Quality, 31.
[4] Chilès, J.-P. and Delfiner, P. (1999). Geostatistics: Modeling Spatial Uncertainty. John Wiley & Sons, New York.
[5] Cressie, N. A. C. (1993). Statistics for Spatial Data. John Wiley & Sons, New York, revised edition.
[6] Diggle, P. J., Tawn, J. A., and Moyeed, R. A. (1998). Model-based geostatistics (with discussions). Applied Statistics, 47(3).
[7] Goovaerts, P. (1997). Geostatistics for Natural Resources Evaluation. Oxford University Press, New York.
[8] Goovaerts, P. and Journel, A. G. (1995). Integrating soil map information in modelling the spatial variation of continuous soil properties. European Journal of Soil Science, 46.
[9] Grunwald, S., Goovaerts, P., Bliss, C. M., Comerford, N. B., and Lamsal, S. (2006). Incorporation of auxiliary information in the geostatistical simulation of soil nitrate nitrogen. Vadose Zone Journal, 5.
[10] Journel, A. G. and Posa, D. (1990). Characteristic behavior and order relations for indicator variograms. Mathematical Geology, 22(8).
[11] van Meirvenne, M. and Goovaerts, P. (2001). Evaluating the probability of exceeding a site-specific soil cadmium contamination threshold. Geoderma, 102.
More informationMachine Learning for Medical Image Analysis. A. Criminisi & the InnerEye team @ MSRC
Machine Learning for Medical Image Analysis A. Criminisi & the InnerEye team @ MSRC Medical image analysis the goal Automatic, semantic analysis and quantification of what observed in medical scans Brain
More informationIntegration of Geological, Geophysical, and Historical Production Data in Geostatistical Reservoir Modelling
Integration of Geological, Geophysical, and Historical Production Data in Geostatistical Reservoir Modelling Clayton V. Deutsch (The University of Alberta) Department of Civil & Environmental Engineering
More informationINTEREST RATE DERIVATIVES IN THE SOUTH AFRICAN MARKET BASED ON THE PRIME RATE
INTEREST RATE DERIVATIVES IN THE SOUTH AFRICAN MARKET BASED ON THE PRIME RATE G West * D Abstract erivatives linked to the prime rate of interest have become quite relevant with the introduction to the
More informationModule 5: Multiple Regression Analysis
Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College
More informationBasic Statistics and Data Analysis for Health Researchers from Foreign Countries
Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association
More informationUNIVERSITY OF WAIKATO. Hamilton New Zealand
UNIVERSITY OF WAIKATO Hamilton New Zealand Can We Trust Cluster-Corrected Standard Errors? An Application of Spatial Autocorrelation with Exact Locations Known John Gibson University of Waikato Bonggeun
More informationBasics of Floating-Point Quantization
Chapter 2 Basics of Floating-Point Quantization Representation of physical quantities in terms of floating-point numbers allows one to cover a very wide dynamic range with a relatively small number of
More informationFigure 1. Diode circuit model
Semiconductor Devices Non-linear Devices Diodes Introduction. The diode is two terminal non linear device whose I-V characteristic besides exhibiting non-linear behavior is also polarity dependent. The
More informationRandomization Based Confidence Intervals For Cross Over and Replicate Designs and for the Analysis of Covariance
Randomization Based Confidence Intervals For Cross Over and Replicate Designs and for the Analysis of Covariance Winston Richards Schering-Plough Research Institute JSM, Aug, 2002 Abstract Randomization
More informationPhysics Lab Report Guidelines
Physics Lab Report Guidelines Summary The following is an outline of the requirements for a physics lab report. A. Experimental Description 1. Provide a statement of the physical theory or principle observed
More informationMATH2740: Environmental Statistics
MATH2740: Environmental Statistics Lecture 6: Distance Methods I February 10, 2016 Table of contents 1 Introduction Problem with quadrat data Distance methods 2 Point-object distances Poisson process case
More informationSpring Force Constant Determination as a Learning Tool for Graphing and Modeling
NCSU PHYSICS 205 SECTION 11 LAB II 9 FEBRUARY 2002 Spring Force Constant Determination as a Learning Tool for Graphing and Modeling Newton, I. 1*, Galilei, G. 1, & Einstein, A. 1 (1. PY205_011 Group 4C;
More informationAutocovariance and Autocorrelation
Chapter 3 Autocovariance and Autocorrelation If the {X n } process is weakly stationary, the covariance of X n and X n+k depends only on the lag k. This leads to the following definition of the autocovariance
More informationValidating Market Risk Models: A Practical Approach
Validating Market Risk Models: A Practical Approach Doug Gardner Wells Fargo November 2010 The views expressed in this presentation are those of the author and do not necessarily reflect the position of
More informationOutline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares
Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation
More informationTrend and Seasonal Components
Chapter 2 Trend and Seasonal Components If the plot of a TS reveals an increase of the seasonal and noise fluctuations with the level of the process then some transformation may be necessary before doing
More informationCredit Risk Models: An Overview
Credit Risk Models: An Overview Paul Embrechts, Rüdiger Frey, Alexander McNeil ETH Zürich c 2003 (Embrechts, Frey, McNeil) A. Multivariate Models for Portfolio Credit Risk 1. Modelling Dependent Defaults:
More informationSENSITIVITY ANALYSIS AND INFERENCE. Lecture 12
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationDecomposition of Event Sequences into Independent Components
Decomposition of Event Sequences into Independent Components Heikki Mannila and Dmitry Rusakov 1 Introduction Many real-world processes result in an extensive logs of sequences of events, i.e., events
More informationConfidence Intervals for Exponential Reliability
Chapter 408 Confidence Intervals for Exponential Reliability Introduction This routine calculates the number of events needed to obtain a specified width of a confidence interval for the reliability (proportion
More information3.1. Solving linear equations. Introduction. Prerequisites. Learning Outcomes. Learning Style
Solving linear equations 3.1 Introduction Many problems in engineering reduce to the solution of an equation or a set of equations. An equation is a type of mathematical expression which contains one or
More informationDiscussion. Seppo Laaksonen 1. 1. Introduction
Journal of Official Statistics, Vol. 23, No. 4, 2007, pp. 467 475 Discussion Seppo Laaksonen 1 1. Introduction Bjørnstad s article is a welcome contribution to the discussion on multiple imputation (MI)
More informationCAPM, Arbitrage, and Linear Factor Models
CAPM, Arbitrage, and Linear Factor Models CAPM, Arbitrage, Linear Factor Models 1/ 41 Introduction We now assume all investors actually choose mean-variance e cient portfolios. By equating these investors
More informationINDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition)
INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition) Abstract Indirect inference is a simulation-based method for estimating the parameters of economic models. Its
More informationStandard errors of marginal effects in the heteroskedastic probit model
Standard errors of marginal effects in the heteroskedastic probit model Thomas Cornelißen Discussion Paper No. 320 August 2005 ISSN: 0949 9962 Abstract In non-linear regression models, such as the heteroskedastic
More informationEconometrics Simple Linear Regression
Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight
More information( ) = 1 x. ! 2x = 2. The region where that joint density is positive is indicated with dotted lines in the graph below. y = x
Errata for the ASM Study Manual for Exam P, Eleventh Edition By Dr. Krzysztof M. Ostaszewski, FSA, CERA, FSAS, CFA, MAAA Web site: http://www.krzysio.net E-mail: krzysio@krzysio.net Posted September 21,
More informationAuxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives
More informationHedge Effectiveness Testing
Hedge Effectiveness Testing Using Regression Analysis Ira G. Kawaller, Ph.D. Kawaller & Company, LLC Reva B. Steinberg BDO Seidman LLP When companies use derivative instruments to hedge economic exposures,
More informationNCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
More informationX X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)
CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.
More information