# Estimating The Theoretical Semivariogram From Finite Numbers of Measurements

## Transcription

Li Zheng, Kansas Geological Survey, University of Kansas, Lawrence, KS

Stephen E. Silliman, Department of Civil Engineering and Geological Sciences, University of Notre Dame, Notre Dame, IN

### Abstract

We investigate from a theoretical basis the impacts of the number, location, and correlation among measurement points on the quality of an estimate of the semivariogram. The unbiased nature of the semivariogram estimator, $\hat{\gamma}_Z(\mathbf{r})$, is first established for a general random process, $Z(\mathbf{x})$. The variance of $\hat{\gamma}_Z(\mathbf{r})$ is then derived as a function of the sampling parameters (the number of measurements and their locations). In applying this function to the case of estimating the semivariograms of the transmissivity and the hydraulic head field, it is shown that the estimation error depends on the number of data pairs and on the correlation among the data pairs (which in turn is determined by the form of the underlying semivariogram, $\gamma_Z(\mathbf{r})$, the relative locations of the data pairs, and the separation distance at which the semivariogram is to be estimated). Thus, design of an optimal sampling program for semivariogram estimation should include consideration of each of these factors. Further, the function derived for the variance of $\hat{\gamma}_Z(\mathbf{r})$ is useful in determining the reliability of a semivariogram developed from a previously established sampling design.

### Introduction

Given a trend-removed random field, $Z(\mathbf{x})$, the theoretical semivariogram, $\gamma_Z(\mathbf{r})$, is defined as

$$\gamma_Z(\mathbf{r}) = \frac{1}{2}\operatorname{var}\left[Z(\mathbf{x}) - Z(\mathbf{x}+\mathbf{r})\right] = \frac{1}{2}\left\langle \left[Z(\mathbf{x}) - Z(\mathbf{x}+\mathbf{r})\right]^2 \right\rangle \qquad (1)$$

where $\mathbf{x}$ is location, $\mathbf{r}$ is the vector lag distance, var stands for the variance, and the angle brackets designate ensemble averaging. Although the determination of the theoretical semivariogram requires ensemble averaging, we are often limited in practice to one realization and a finite number of measurements of $Z(\mathbf{x})$. Thus, the ensemble average is commonly estimated from the spatial average by assuming ergodicity. For example, given $n$ measurements $\{Z(\mathbf{x}_i),\ i = 1, \ldots, n\}$ from a single realization, the most commonly used semivariogram estimator, $\hat{\gamma}_Z(\mathbf{r})$, for a specified vectorial lag distance, $\mathbf{r}$, is (Matheron, 1965)

$$\hat{\gamma}_Z(\mathbf{r}) = \frac{1}{2N(\mathbf{r})} \sum_{i=1}^{N(\mathbf{r})} \left[Z(\mathbf{x}_i) - Z(\mathbf{x}_i+\mathbf{r})\right]^2 \qquad (2)$$

where $N(\mathbf{r})$ is the number of data pairs, $Z(\mathbf{x}_i)$ and $Z(\mathbf{x}_i+\mathbf{r})$, available from the $n$ measurements. The quality of the estimate of the semivariogram for a given $\mathbf{r}$ thus depends on the bias and precision of $\hat{\gamma}_Z(\mathbf{r})$ as an estimator for $\gamma_Z(\mathbf{r})$. Many have used $N(\mathbf{r})$, the number of data pairs, $Z(\mathbf{x}_i)$ and $Z(\mathbf{x}_i+\mathbf{r})$, as an index for measuring the reliability of $\hat{\gamma}_Z(\mathbf{r})$. For example, a common rule applied to estimation of the semivariogram is that at least 30 pairs of

measurements, $Z(\mathbf{x}_i)$ and $Z(\mathbf{x}_i+\mathbf{r})$, are required for each lag distance $\mathbf{r}$ in order to ensure a reliable semivariogram estimate (e.g., Journel and Huijbregts, 1978). Review of the literature indicates that this rule is derived from a simplification of earlier work by Matheron (1965). Webster and Oliver (1992) have studied this rule through numerical sampling of a random field generated with a known semivariogram. Using a uniform, square sampling grid and comparing sample semivariograms against the known underlying semivariogram, these authors argued that at least measurements are needed to estimate a semivariogram reliably. This emphasis on $N(\mathbf{r})$ has also resulted in efforts to devise algorithms to maximize $N(\mathbf{r})$ by adjusting the placement of a fixed number of measurements. Russo (1984) and Warrick and Myers (1987), for example, present algorithms which optimize the location of sampling points based on a series of constraints, including constraints on the number of sample points in each lag distance. A recent paper by Conwell et al. (1997) extended this earlier work by taking into account the role of measurement instruments in the sampling design. Morris (1991) argued that accurate estimation of the semivariogram depends not only on $N(\mathbf{r})$ but also on the correlation among the measurements. He proposed an alternative index, the maximum equivalent uncorrelated pairs, as a measure of the estimation accuracy. However, this index applies only to the special case of a concave semivariogram. Other methods for determining the accuracy of semivariogram estimates involve Monte Carlo simulation (Russo and Jury, 1987; Corsten and Stein, 1994; etc.) and the subsampling method (Chung, 1984; Shafer and Varljen, 1990). The Monte Carlo approach obtains the confidence interval via repeated sampling from multiple realizations of a random field.
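The Monte Carlo approach can be illustrated with a minimal sketch. It is not taken from the cited studies: it assumes a one-dimensional transect of equally spaced points and a zero-mean Gaussian field with exponential covariance (simulated exactly as a first-order autoregressive sequence), and all function and parameter names are hypothetical. It applies the estimator in (2) to many independent realizations and summarizes the spread of the resulting estimates.

```python
import math
import random


def simulate_transect(n_points, spacing, integral_scale, sill, rng):
    """One realization of a zero-mean Gaussian field on a 1-D transect.

    Assumes the exponential covariance sill * exp(-h / integral_scale); on a
    regular grid this is exactly a first-order autoregressive sequence.
    """
    phi = math.exp(-spacing / integral_scale)
    innovation_sd = math.sqrt(sill * (1.0 - phi * phi))
    z = [rng.gauss(0.0, math.sqrt(sill))]
    for _ in range(n_points - 1):
        z.append(phi * z[-1] + rng.gauss(0.0, innovation_sd))
    return z


def matheron_estimate(z, lag_steps):
    """Estimator (2): half the mean squared increment over the N(r) pairs."""
    n_pairs = len(z) - lag_steps
    total = sum((z[i] - z[i + lag_steps]) ** 2 for i in range(n_pairs))
    return total / (2.0 * n_pairs)


def monte_carlo_summary(n_points, spacing, integral_scale, sill,
                        lag_steps, n_realizations, seed=0):
    """Mean and coefficient of variation of the estimate over realizations."""
    rng = random.Random(seed)
    estimates = [
        matheron_estimate(
            simulate_transect(n_points, spacing, integral_scale, sill, rng),
            lag_steps)
        for _ in range(n_realizations)
    ]
    mean = sum(estimates) / n_realizations
    variance = sum((e - mean) ** 2 for e in estimates) / (n_realizations - 1)
    return mean, math.sqrt(variance) / mean
```

Under these assumptions, the mean returned by `monte_carlo_summary` can be compared with the model value `sill * (1 - exp(-lag / integral_scale))`, and the second returned value is an empirical counterpart of the coefficient of variation that the paper derives analytically.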
The subsampling method involves subdividing a measurement set into sub-samples, and then estimating the semivariogram from each subgroup of samples, thus allowing characterization of

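the variation of parameter estimates between sub-samples. In the present paper, the variance of the semivariogram estimate, $\sigma^2_{\hat{\gamma}}(\mathbf{r})$, is determined theoretically. This theoretical result is used to obtain a functional relationship between $\sigma^2_{\hat{\gamma}}(\mathbf{r})$ and the sampling parameters (e.g., the number of measurements and their locations). This general relationship then enables us to examine the interactions between the magnitude of $\sigma^2_{\hat{\gamma}}(\mathbf{r})$ and its various controlling mechanisms. The interactions among the number of measurements, sampling locations, the underlying semivariogram, and measurement error are illustrated by applying this relationship to the estimation of the semivariograms of the transmissivity and the hydraulic head field. Finally, the implications of these results are discussed and extended to possible applications for sampling design and interpretation of the reliability of semivariogram estimates.

### Theoretical Background

The bias of the semivariogram estimator, $\hat{\gamma}_Z(\mathbf{r})$, can be examined for a random field by determining the ensemble mean value of both sides of (2). This leads to

$$\left\langle \hat{\gamma}_Z(\mathbf{r}) \right\rangle = \left\langle \frac{1}{2N(\mathbf{r})} \sum_{i=1}^{N(\mathbf{r})} \left[Z(\mathbf{x}_i) - Z(\mathbf{x}_i+\mathbf{r})\right]^2 \right\rangle = \frac{1}{N(\mathbf{r})} \sum_{i=1}^{N(\mathbf{r})} \frac{1}{2} \left\langle \left[Z(\mathbf{x}_i) - Z(\mathbf{x}_i+\mathbf{r})\right]^2 \right\rangle \qquad (3)$$

Applying (1) to (3),

$$\left\langle \hat{\gamma}_Z(\mathbf{r}) \right\rangle = \frac{1}{N(\mathbf{r})} \sum_{i=1}^{N(\mathbf{r})} \gamma_Z(\mathbf{r}) = \gamma_Z(\mathbf{r}) \qquad (4)$$

Thus $\hat{\gamma}_Z(\mathbf{r})$, as given in (2), is an unbiased estimator for $\gamma_Z(\mathbf{r})$. Since $\hat{\gamma}_Z(\mathbf{r})$ is an unbiased estimator, $\sigma^2_{\hat{\gamma}}(\mathbf{r})$ can then be written as,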

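$$\sigma^2_{\hat{\gamma}}(\mathbf{r}) = \left\langle \left[\hat{\gamma}_Z(\mathbf{r}) - \gamma_Z(\mathbf{r})\right]^2 \right\rangle \qquad (5)$$

Substituting (2) into (5) provides

$$\sigma^2_{\hat{\gamma}}(\mathbf{r}) = \left\langle \left[ \frac{1}{2N(\mathbf{r})} \sum_{i=1}^{N(\mathbf{r})} \left[Z(\mathbf{x}_i) - Z(\mathbf{x}_i+\mathbf{r})\right]^2 - \gamma_Z(\mathbf{r}) \right]^2 \right\rangle = \frac{1}{4N^2(\mathbf{r})} \left\langle \left[ \sum_{i=1}^{N(\mathbf{r})} \left[Z(\mathbf{x}_i) - Z(\mathbf{x}_i+\mathbf{r})\right]^2 - 2N(\mathbf{r})\,\gamma_Z(\mathbf{r}) \right]^2 \right\rangle \qquad (6)$$

To simplify the notation, we define a new random variable, $S(\mathbf{x}, \mathbf{r})$, as

$$S(\mathbf{x}, \mathbf{r}) = \frac{1}{2}\left[Z(\mathbf{x}) - Z(\mathbf{x}+\mathbf{r})\right]^2 \qquad (7)$$

and obtain

$$\left\langle S(\mathbf{x}, \mathbf{r}) \right\rangle = \gamma_Z(\mathbf{r}) \qquad (8)$$

Substituting (7) and (8) into (6) results in

$$\sigma^2_{\hat{\gamma}}(\mathbf{r}) = \frac{1}{N^2(\mathbf{r})} \left\langle \left[ \sum_{i=1}^{N(\mathbf{r})} \left( S(\mathbf{x}_i, \mathbf{r}) - \left\langle S(\mathbf{x}_i, \mathbf{r}) \right\rangle \right) \right]^2 \right\rangle = \frac{1}{N^2(\mathbf{r})} \sum_{i=1}^{N(\mathbf{r})} \sigma^2_{S(\mathbf{x}_i, \mathbf{r})} + \frac{2}{N^2(\mathbf{r})} \sum_{i=1}^{N(\mathbf{r})-1} \sum_{j>i}^{N(\mathbf{r})} \operatorname{cov}\left[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})\right] \qquad (9)$$

where $\operatorname{cov}[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})]$ is the covariance between $S(\mathbf{x}_i, \mathbf{r})$ and $S(\mathbf{x}_j, \mathbf{r})$, and $\sigma^2_{S(\mathbf{x}_i, \mathbf{r})}$ is the variance of $S(\mathbf{x}_i, \mathbf{r})$. Equation (9) is essentially identical to equations provided in classical textbooks on statistics for the variance of the sample mean (e.g., equation (4..), Page 378 in Benjamin and Cornell, 1970) with the exception that here the variable is $S(\mathbf{x}_i, \mathbf{r})$ rather than $Z(\mathbf{x}_i)$. The key to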

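understanding the importance of (9) is to determine the dependence of $\operatorname{cov}[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})]$ and $\sigma^2_{S(\mathbf{x}_i, \mathbf{r})}$ on sampling design. As given in the appendix, for a Gaussian random variable $Z(\mathbf{x})$, we can derive

$$\operatorname{cov}\left[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})\right] = \frac{1}{2}\left[\gamma_Z(\mathbf{r} + \mathbf{R}_{ij}) + \gamma_Z(\mathbf{r} - \mathbf{R}_{ij}) - 2\gamma_Z(\mathbf{R}_{ij})\right]^2 \qquad (10)$$

where $\mathbf{R}_{ij} = \mathbf{x}_j - \mathbf{x}_i$ is the separation distance between $S(\mathbf{x}_i, \mathbf{r})$ and $S(\mathbf{x}_j, \mathbf{r})$. Since $\gamma_Z$ is not a function of $\mathbf{x}$, $\operatorname{cov}[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})]$ as given in (10) is also independent of $\mathbf{x}$, but depends on the relative separation between $\mathbf{x}_i$ and $\mathbf{x}_j$. It is noted further that $\operatorname{cov}[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})]$ is anisotropic even if $\gamma_Z$ is isotropic. When $\mathbf{R}_{ij} = 0$ in (10), $S$ has variance

$$\sigma^2_{S(\mathbf{x}, \mathbf{r})} = \sigma^2_{S}(\mathbf{r}) = 2\gamma_Z^2(\mathbf{r}) \qquad (11)$$

Substituting (11) into (9), the expression for $\sigma^2_{\hat{\gamma}}(\mathbf{r})$ may now be written as

$$\sigma^2_{\hat{\gamma}}(\mathbf{r}) = \frac{2\gamma_Z^2(\mathbf{r})}{N(\mathbf{r})} + \frac{2}{N^2(\mathbf{r})} \sum_{i=1}^{N(\mathbf{r})-1} \sum_{j>i}^{N(\mathbf{r})} \operatorname{cov}\left[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})\right] \qquad (12)$$

where the expression for $\operatorname{cov}[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})]$ is given in (10). A similar expression for $\sigma^2_{\hat{\gamma}}(\mathbf{r})$ is used by Cressie (1984) in the context of performing weighted least-squares fitting. To facilitate the comparison of various $\sigma_{\hat{\gamma}}(\mathbf{r})$ at different lag distances, we divide $\sigma_{\hat{\gamma}}(\mathbf{r})$ by $\gamma_Z(\mathbf{r})$ to obtain the coefficient of variation, $\rho_{\hat{\gamma}}(\mathbf{r})$,

$$\rho_{\hat{\gamma}}(\mathbf{r}) = \left\{ \frac{2}{N(\mathbf{r})} + \frac{4}{N^2(\mathbf{r})} \sum_{i=1}^{N(\mathbf{r})-1} \sum_{j>i}^{N(\mathbf{r})} \operatorname{coe}\left[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})\right] \right\}^{1/2} \qquad (13)$$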

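where

$$\operatorname{coe}\left[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})\right] = \frac{\operatorname{cov}\left[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})\right]}{2\gamma_Z^2(\mathbf{r})} = \frac{\operatorname{cov}\left[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})\right]}{\sigma^2_{S(\mathbf{x}_i, \mathbf{r})}} \qquad (14)$$

Hence, $\rho_{\hat{\gamma}}(\mathbf{r})$ consists of two parts. The first part is completely determined by $N(\mathbf{r})$, the number of data pairs. The second part is a function of both the number of pairs and the correlation among the $S(\mathbf{x}_i, \mathbf{r})$s. In two special cases, the expression for $\rho_{\hat{\gamma}}(\mathbf{r})$ can be simplified. First, when the $S(\mathbf{x}_i, \mathbf{r})$s are independent, all $\operatorname{coe}[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})]$s are equal to zero, and

$$\rho_{\hat{\gamma}}(\mathbf{r}) = \sqrt{\frac{2}{N(\mathbf{r})}} \qquad (15)$$

In this case, the estimation accuracy depends only on $N(\mathbf{r})$. This relationship is identical to the result from classical statistics, where all the samples are taken independently. The second special case is when the $S(\mathbf{x}_i, \mathbf{r})$s are completely correlated with each other; all $\operatorname{coe}[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})]$s are then equal to one. Thus $\rho_{\hat{\gamma}}(\mathbf{r})$ reduces to a constant as follows,

$$\rho_{\hat{\gamma}}(\mathbf{r}) = \left[ \frac{2}{N(\mathbf{r})} + \frac{2\left(N(\mathbf{r}) - 1\right)}{N(\mathbf{r})} \right]^{1/2} = \sqrt{2} \qquad (16)$$

That is, the estimation accuracy becomes independent of sampling. An example of such an extreme case can be found in a random field whose semivariogram grows quadratically with increasing separation distance. Other than in these special cases, the estimation precision of $\hat{\gamma}_Z(\mathbf{r})$ (defined here as the magnitude of $\rho_{\hat{\gamma}}(\mathbf{r})$) is determined not only by $N(\mathbf{r})$ but also by the summation of the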

$\operatorname{coe}[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})]$, the correlation coefficients among the squared increments, $[Z(\mathbf{x}_i) - Z(\mathbf{x}_i+\mathbf{r})]^2$. This correlation coefficient, according to (10), is determined by the separation distance between $\mathbf{x}_i$ and $\mathbf{x}_j$ ($i, j = 1, 2, \ldots, N(\mathbf{r})$), the lag distance, $\mathbf{r}$, at which the semivariogram is to be estimated, and the underlying semivariogram, $\gamma_Z$. Thus, (13) gives an explicit expression describing the interactions among the variance of the estimate of the semivariogram, the number of data, and the sampling design. It is a general relationship that holds in any random field, $Z(\mathbf{x})$, provided that the increments of $Z(\mathbf{x})$ are normally distributed.

### Illustration Using Hydraulic Head and Transmissivity Fields

In order to illustrate the impact of (13) on problems of interest to hydrogeologists, a comparison is made between estimating the semivariogram of the transmissivity ($T$) and estimating the semivariogram of the head residual ($h$, the hydraulic head minus the mean trend in the head). These particular parameters were chosen as they are two of the most frequently studied spatial processes in groundwater hydrology. For the following illustration, it is assumed that $T$ is log-normally distributed. A new, normally distributed random variable, $Y$, is therefore defined as $Y = \ln(T)$. It is further assumed that $Y$ is second-order stationary, exists within an infinite spatial flow domain, and is characterized by an isotropic, exponential covariance function,

$$\gamma_Y(\mathbf{r}) = \sigma_Y^2\left[1 - \exp(-r')\right] \qquad (17)$$

where $\sigma_Y^2$ is the variance of $Y$, and $r' = |\mathbf{r}|/\lambda$ is the magnitude of the separation vector $\mathbf{r}$

normalized by the integral scale, $\lambda$, of $Y$. Dagan (1985) has presented the first and second moments of the distribution of the head residual under mean uniform flow within the type of transmissivity field described above. Of particular interest here is Dagan's expression for the semivariogram of the head residual γ h J λ σ y ( cosψ) exp( r') ( r' + 3r' + 3) 3 ( ) r' + (8) ( cosψ) -- Ei( r' ) + ln( r' ) + exp( r' ) + ( e ) where $J$ is the magnitude of the mean regional head gradient $\mathbf{J}$, $\psi$ is the direction of $\mathbf{r}$ relative to the orientation of $\mathbf{J}$, Ei is the exponential integral function, and $e$ is Euler's constant (0.57).

Figure 1 shows plots of $\gamma_Y$ (normalized by $\sigma_Y^2$) and $\gamma_h$ (normalized by $J^2 \lambda^2 \sigma_Y^2$) versus the normalized separation distance $r'$. The anisotropic $\gamma_h$ is plotted for two different directions, one being parallel with $\mathbf{J}$, i.e., $\psi = 0$, and the other perpendicular to $\mathbf{J}$, i.e., $\psi = \pi/2$. As shown in the figure, $\gamma_Y$ asymptotically approaches a sill as the separation distance increases, whereas $\gamma_h$ grows logarithmically with increasing separation distance. $\gamma_Y$ also has a finite integral scale, $\lambda$, while the hydraulic head field is correlated over a much longer distance and no finite integral scale can be defined. Hence, these two variables have fundamentally different structure in their semivariograms.

This difference in structure has dramatic impact on the potential to estimate these semivariograms using data collected from a single realization. In order to simplify a comparison

of the estimation of the semivariogram for the $Y$ and $h$ fields, the semivariograms are evaluated only for the cases in which the orientation of the lag $\mathbf{r}$ is parallel to the direction of the mean gradient, $\mathbf{J}$. (It is straightforward to extend the analysis and its results to other directions if necessary.) Under this condition, and for fixed $\mathbf{r}$, the expressions for the theoretical semivariograms, (17) and (18), respectively, can be substituted into (14) and (10) to obtain $\operatorname{coe}[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})]$ as a function of $\mathbf{R}_{ij}$. Figures 2a and 2b illustrate the correlation functions for $S_Y(\mathbf{x}, \mathbf{r})$ and $S_h(\mathbf{x}, \mathbf{r})$. As $\mathbf{R}_{ij}$ is a two-dimensional vector, these are three-dimensional plots with the horizontal axes defining the directional components of $\mathbf{R}_{ij}$, and the vertical axis providing the magnitude of the correlation coefficient. In both figures, r has a magnitude of 0 units and is oriented parallel to $\mathbf{J}$. Both $\mathbf{R}_{ij}$ and $\mathbf{r}$ are normalized by $\lambda$. Figures 2a and 2b show that the correlation structures for both $S_Y(\mathbf{x}, \mathbf{r})$ and $S_h(\mathbf{x}, \mathbf{r})$ are anisotropic. Further, a local maximum in correlation exists at $\mathbf{R}_{ij} = \mathbf{r}$. Finally, the figures show that $S_h(\mathbf{x}, \mathbf{r})$ is correlated over longer distances than is $S_Y(\mathbf{x}, \mathbf{r})$.

The contribution of the correlation between sample pairs can be illustrated only for a given sampling pattern. Hence, a uniform square grid sampling network is here utilized to illustrate the cumulative contribution of $\operatorname{coe}[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})]$ to the magnitude of $\rho_{\hat{\gamma}}(\mathbf{r})$. This sampling network is set to be aligned with the direction of $\mathbf{J}$, contains $n$ sample points in a square array (where $n$ is the number of points at which $T$ and $h$ are measured) and has minimum spacing, $d$, between sample points (see Figure 3 for an example in which $n = 36$ and d.0). For the discussion below, the minimum separation distance between the $S(\mathbf{x}_i, \mathbf{r})$s, $m$, was set to be equal to $d$. Thus, specific to this sampling scheme, it is straightforward to show that a general

relationship between the number of $S(\mathbf{x}_i, \mathbf{r})$s (i.e., $N(\mathbf{r})$, the number of measurement pairs for a particular $\mathbf{r}$) and the number of sample points, $n$, exists as

$$N(\mathbf{r}) = n - \frac{r}{d}\sqrt{n} \qquad (19)$$

Using the sampling scheme as defined above, it is possible to calculate $\rho_{\hat{\gamma}}(\mathbf{r})$ according to (13) for a specific $\mathbf{r}$. By varying $n$, it is possible to adjust $N(\mathbf{r})$ and evaluate $\rho_{\hat{\gamma}}(\mathbf{r})$ as a function of $N(\mathbf{r})$. Figure 4 shows $\rho_{\hat{\gamma}_Y}(\mathbf{r})$ and $\rho_{\hat{\gamma}_h}(\mathbf{r})$ versus $N(\mathbf{r})$ when r0 along the direction of $\mathbf{J}$. This plot also includes the curve for $\rho_{\hat{\gamma}}(\mathbf{r}) = \sqrt{2/N(\mathbf{r})}$, the result obtained by assuming zero correlation among the $S(\mathbf{x}_i, \mathbf{r})$s (see (15)). From this plot, it is apparent that the correlation among the $S(\mathbf{x}_i, \mathbf{r})$s strongly influences the rate of reduction of $\rho_{\hat{\gamma}}(\mathbf{r})$ with increasing $N(\mathbf{r})$. In order to achieve, for example, a value for $\rho_{\hat{\gamma}}(\mathbf{r})$ of 0.28, $N(\mathbf{r})$ need only be around 25 for both uncorrelated data and the transmissivity field. For the head residual, this number increases to approximately 200. Hence, nearly 10 times the data pairs are required to achieve the same coefficient of variation for the semivariogram of the head residuals as would be required for an uncorrelated variable or a random field with a short correlation range.

One interesting dependence which was further investigated was the relationship between the coefficients of variation, $\rho_{\hat{\gamma}_Y}(\mathbf{r})$ and $\rho_{\hat{\gamma}_h}(\mathbf{r})$, and $\mathbf{r}$. Based on the same sampling scheme with $N(\mathbf{r}) = 64$, $\rho_{\hat{\gamma}_Y}(\mathbf{r})$ and $\rho_{\hat{\gamma}_h}(\mathbf{r})$ vary with $\mathbf{r}$ as shown in Figure 5. Once again, $\rho_{\hat{\gamma}_h}(\mathbf{r})$ is consistently larger than $\rho_{\hat{\gamma}_Y}(\mathbf{r})$ for all $\mathbf{r}$, a result of the head pairs being correlated over longer distances than transmissivity pairs. Further, increasing $\mathbf{r}$ appears to have a greater adverse impact on the head residuals than on the transmissivity (i.e., $\rho_{\hat{\gamma}_h}(\mathbf{r})$ appears to grow with $\mathbf{r}$ whereas $\rho_{\hat{\gamma}_Y}(\mathbf{r})$ appears to

be relatively insensitive to $\mathbf{r}$). These results imply that not only will the separation distance among sample points need to be modified for estimating the semivariogram of $Y$ versus the semivariogram of $h$, the basic design of the measurement locations must be modified as well (with approximately uniform distribution of data pairs among lag classes for $Y$ and increasing number of data pairs with increasing lag distance required for $h$).

### Discussion and Conclusions

The theoretical analysis of the sample semivariogram shows that the semivariogram estimator as given in (2) is unbiased, but that the coefficient of variation of the estimator, $\rho_{\hat{\gamma}}(\mathbf{r})$, depends not only on the number of data pairs, $N(\mathbf{r})$, but also on the correlation among the data pairs. This correlation is, in turn, related to the form of the underlying semivariogram, $\gamma_Z$, the relative locations of the data pairs, and the lag distance, $\mathbf{r}$, at which the semivariogram is to be estimated. When the increment, $Z(\mathbf{x}) - Z(\mathbf{x}+\mathbf{r})$, is Gaussian, knowledge of $\gamma_Z$ is sufficient to define the correlation structure among the squared increments according to equation (10).

Equation (10) leads to at least three significant observations. First, the reliability of a semivariogram estimate derived from measured data is dependent not only on the number of data points collected, but also on the parameter being measured (through the semivariogram of that parameter). Second, random variables exhibiting correlation over large distances are very likely to have squared increments which are highly correlated. Thus the sample semivariogram estimate for a random variable which is correlated at large distances will tend to be unreliable, and caution should be used in interpreting sample semivariograms exhibiting a long range correlation structure (e.g., a power law semivariogram). Should a field data set imply such a long range

correlation structure, equation (10), or a modification thereof for non-Gaussian random variables, should be used to determine an estimate of the coefficient of variation of the sample semivariogram at each lag distance, thus providing a measure of confidence on the structure observed in the sample data. Third, the optimal distribution of data pairs over the different lag distances at which the sample semivariogram is to be estimated is also a function of the underlying semivariogram. As was shown, a reliable estimate of the semivariogram at various lags for the transmissivity (subject to the constraints outlined above) could be accomplished with relatively uniform numbers of data pairs in each lag class. In contrast, estimating the semivariogram of the head residual, with equal coefficient of variation in each lag class, would require increasing numbers of data pairs as the magnitude of the lag distance is increased.

### Appendix: Detailed Derivation for Equation (10)

Given the definition of $S(\mathbf{x}_i, \mathbf{r})$ as in (7) and its ensemble mean (8), an expression for $\operatorname{cov}[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})]$ can be written as

$$\operatorname{cov}\left[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})\right] = \frac{1}{4}\left\langle \left[Z(\mathbf{x}_i) - Z(\mathbf{x}_i+\mathbf{r})\right]^2 \left[Z(\mathbf{x}_j) - Z(\mathbf{x}_j+\mathbf{r})\right]^2 \right\rangle - \gamma_Z^2(\mathbf{r}) \qquad (20)$$

Using the joint moment generating function, Papoulis (example 7-6 in page 58, 1984) shows that, if two random variables, $X_1$ and $X_2$, are jointly normal with zero mean, the following relationship holds:

$$\left\langle X_1^2 X_2^2 \right\rangle = \left\langle X_1^2 \right\rangle \left\langle X_2^2 \right\rangle + 2\left\langle X_1 X_2 \right\rangle^2 \qquad (21)$$
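The fourth-moment identity for zero-mean jointly normal variables, $\langle X_1^2 X_2^2 \rangle = \langle X_1^2 \rangle \langle X_2^2 \rangle + 2\langle X_1 X_2 \rangle^2$, can be checked numerically. The sketch below is an illustration only, not part of the original derivation: it assumes unit-variance variables with a chosen correlation, replaces ensemble averages with sample averages, and uses a hypothetical function name.

```python
import math
import random


def check_fourth_moment_identity(correlation, n_samples=200_000, seed=0):
    """Compare both sides of the identity using sample averages.

    Returns (lhs, rhs) with lhs = <X1^2 X2^2> and
    rhs = <X1^2><X2^2> + 2 <X1 X2>^2.
    """
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - correlation * correlation)
    sum_11 = sum_22 = sum_12 = sum_1122 = 0.0
    for _ in range(n_samples):
        a = rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, 1.0)
        x1 = a
        x2 = correlation * a + scale * b  # unit variance, correlated with x1
        sum_11 += x1 * x1
        sum_22 += x2 * x2
        sum_12 += x1 * x2
        sum_1122 += x1 * x1 * x2 * x2
    lhs = sum_1122 / n_samples
    rhs = (sum_11 / n_samples) * (sum_22 / n_samples) \
        + 2.0 * (sum_12 / n_samples) ** 2
    return lhs, rhs
```

For unit variances and correlation $c$, both sides should approach $1 + 2c^2$ as the sample size grows; the two returned values agree only up to sampling error.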

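Assuming that the increment, $Z(\mathbf{x}) - Z(\mathbf{x}+\mathbf{r})$, is jointly normally distributed (which holds, at least, for the case in which $Z(\mathbf{x})$ is jointly normal), the application of (21) to (20) leads to

$$\begin{aligned}
\operatorname{cov}\left[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})\right]
&= \frac{1}{4}\left\langle \left[Z(\mathbf{x}_i) - Z(\mathbf{x}_i+\mathbf{r})\right]^2 \left[Z(\mathbf{x}_j) - Z(\mathbf{x}_j+\mathbf{r})\right]^2 \right\rangle - \gamma_Z^2(\mathbf{r}) \\
&= \frac{1}{4}\left\langle \left[Z(\mathbf{x}_i) - Z(\mathbf{x}_i+\mathbf{r})\right]^2 \right\rangle \left\langle \left[Z(\mathbf{x}_j) - Z(\mathbf{x}_j+\mathbf{r})\right]^2 \right\rangle + \frac{1}{2}\left\langle \left[Z(\mathbf{x}_i) - Z(\mathbf{x}_i+\mathbf{r})\right] \left[Z(\mathbf{x}_j) - Z(\mathbf{x}_j+\mathbf{r})\right] \right\rangle^2 - \gamma_Z^2(\mathbf{r}) \\
&= \frac{1}{2}\left\langle \left[Z(\mathbf{x}_i) - Z(\mathbf{x}_i+\mathbf{r})\right] \left[Z(\mathbf{x}_j) - Z(\mathbf{x}_j+\mathbf{r})\right] \right\rangle^2
\end{aligned} \qquad (22)$$

Further expanding $\langle [Z(\mathbf{x}_i) - Z(\mathbf{x}_i+\mathbf{r})][Z(\mathbf{x}_j) - Z(\mathbf{x}_j+\mathbf{r})] \rangle$ into four terms results in

$$\begin{aligned}
\operatorname{cov}\left[S(\mathbf{x}_i, \mathbf{r}), S(\mathbf{x}_j, \mathbf{r})\right]
&= \frac{1}{2}\left\{ \frac{1}{2}\left\langle \left[Z(\mathbf{x}_i) - Z(\mathbf{x}_j+\mathbf{r})\right]^2 \right\rangle + \frac{1}{2}\left\langle \left[Z(\mathbf{x}_j) - Z(\mathbf{x}_i+\mathbf{r})\right]^2 \right\rangle - \frac{1}{2}\left\langle \left[Z(\mathbf{x}_i) - Z(\mathbf{x}_j)\right]^2 \right\rangle - \frac{1}{2}\left\langle \left[Z(\mathbf{x}_i+\mathbf{r}) - Z(\mathbf{x}_j+\mathbf{r})\right]^2 \right\rangle \right\}^2 \\
&= \frac{1}{2}\left[\gamma_Z(\mathbf{r} + \mathbf{x}_j - \mathbf{x}_i) + \gamma_Z(\mathbf{r} - \mathbf{x}_j + \mathbf{x}_i) - 2\gamma_Z(\mathbf{x}_j - \mathbf{x}_i)\right]^2 \\
&= \frac{1}{2}\left[\gamma_Z(\mathbf{r} + \mathbf{R}_{ij}) + \gamma_Z(\mathbf{r} - \mathbf{R}_{ij}) - 2\gamma_Z(\mathbf{R}_{ij})\right]^2
\end{aligned} \qquad (23)$$

where $\mathbf{R}_{ij} = \mathbf{x}_j - \mathbf{x}_i$ is the separation distance between $S(\mathbf{x}_i, \mathbf{r})$ and $S(\mathbf{x}_j, \mathbf{r})$.

### Acknowledgments

This research is partially supported by the U.S. Department of Energy's Subsurface Science Program. The authors would also like to thank T. Yegulalp and an anonymous reviewer for their thorough and thoughtful review.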

### References

Benjamin, J. R., and C. A. Cornell, Probability, Statistics, and Decision for Civil Engineers, McGraw-Hill Book Company, New York, 1970.

Chung, C. F., Use of the Jackknife method to estimate autocorrelation functions (or variograms), in Geostatistics for Natural Resources Characterization, vol. I, edited by G. Verly, M. David, A. G. Journel, and A. Marechal, D. Reidel Publishing Company, Dordrecht, 1984.

Conwell, P. M., S. E. Silliman, and L. Zheng, Design of a piezometer network for estimation of the variogram of the hydraulic gradient: The role of the instrument, Water Resour. Res., 33(11), 1997.

Corsten, L. C. A., and A. Stein, Nested sampling for estimating spatial semivariograms compared to other designs, Applied Stochastic Models and Data Analysis, 10, 103-122, 1994.

Cressie, N., Statistics for Spatial Data, Rev. ed., Wiley, New York, 1993.

Dagan, G., Stochastic modeling of groundwater flow by conditional and unconditional probabilities: the inverse problem, Water Resour. Res., 21(1), 65-72, 1985.

Journel, A. G., and C. J. Huijbregts, Mining Geostatistics, Academic Press, London, England, 1978.

Matheron, G., Les Variables Regionalisees et leur Estimation, Masson, Paris, 1965.

Morris, M. D., On counting the number of data pairs for semivariogram estimation, Math. Geol., 23(7), 1991.

Papoulis, A., Probability, Random Variables, and Stochastic Processes, 2nd edition, McGraw Hill, New York, 1984.

Russo, D., Design of an optimal sampling network for estimating the variogram, Soil Sci. Soc. Am. J., 48, 1984.

Russo, D., and W. A. Jury, A theoretical study of the estimation of the correlation scale in spatially variable fields, 1, Stationary fields, Water Resour. Res., 23(7), 1257-1268, 1987.

Shafer, J. M., and M. D. Varljen, Approximation of confidence limits on sample semivariograms from single realizations of spatially correlated random fields, Water Resour. Res., 26(8), 1990.

Warrick, A. W., and D. E. Myers, Optimization of sampling locations for variogram calculations, Water Resour. Res., 23(3), 1987.

Webster, R., and M. A. Oliver, Sample adequately to estimate variograms for soil properties, J. Soil Sci., 43, 177-192, 1992.

### List of Figures

Figure 1. Comparison of the head residual semivariogram with the log-transmissivity semivariogram.

Figure 2. Distribution of the correlation coefficient as a function of the separation vector, $\mathbf{R}_{ij}$, with r 0 along the direction of $\mathbf{J}$: (a) for the transmissivity, (b) for the head residual.

Figure 3. An example of a uniform grid sampling network in which $n = 36$ and d m.0.

Figure 4. Plots showing $\rho_{\hat{\gamma}_Y}$ and $\rho_{\hat{\gamma}_h}$ versus $N(\mathbf{r})$ with r 0 along the direction of $\mathbf{J}$. This figure also shows $\rho_{\hat{\gamma}}$ versus $N(\mathbf{r})$ for uncorrelated samples.

Figure 5. $\rho_{\hat{\gamma}}$ versus $\mathbf{r}$ for $\hat{\gamma}_Y$ and $\hat{\gamma}_h$, respectively.
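Curves of the kind described for Figures 4 and 5 can be approximated for the log-transmissivity case with a short calculation. The sketch below is illustrative and does not reproduce the authors' computations: it assumes the isotropic exponential model (17) with unit sill and unit integral scale, a square grid with the lag aligned along one grid axis, the covariance (10) normalized as in (14), and the coefficient of variation written as $\rho = [2/N + (4/N^2)\sum_{i<j} \operatorname{coe}_{ij}]^{1/2}$. Function and parameter names are hypothetical, and the head-residual model is not included.

```python
import math


def exponential_semivariogram(h, sill=1.0, integral_scale=1.0):
    """Equation (17): gamma(h) = sill * (1 - exp(-h / integral_scale))."""
    return sill * (1.0 - math.exp(-h / integral_scale))


def pair_origins(points_per_side, spacing, lag_steps):
    """Grid nodes x_i for which x_i + r is also a node (r along the x axis)."""
    return [(ix * spacing, iy * spacing)
            for ix in range(points_per_side - lag_steps)
            for iy in range(points_per_side)]


def coe(gamma, r, R):
    """Correlation of squared increments from (10) and (14), isotropic gamma.

    r and R are 2-D vectors given as (x, y) tuples.
    """
    plus = math.hypot(r[0] + R[0], r[1] + R[1])
    minus = math.hypot(r[0] - R[0], r[1] - R[1])
    separation = math.hypot(R[0], R[1])
    cov = 0.5 * (gamma(plus) + gamma(minus) - 2.0 * gamma(separation)) ** 2
    return cov / (2.0 * gamma(math.hypot(r[0], r[1])) ** 2)


def coefficient_of_variation(gamma, points_per_side, spacing, lag_steps):
    """Return (N, rho) with rho = sqrt(2/N + (4/N^2) * sum_{i<j} coe_ij)."""
    r = (lag_steps * spacing, 0.0)
    origins = pair_origins(points_per_side, spacing, lag_steps)
    n_pairs = len(origins)
    total = 0.0
    for i in range(n_pairs):
        for j in range(i + 1, n_pairs):
            R = (origins[j][0] - origins[i][0], origins[j][1] - origins[i][1])
            total += coe(gamma, r, R)
    return n_pairs, math.sqrt(2.0 / n_pairs + 4.0 * total / n_pairs ** 2)
```

Calling `coefficient_of_variation(exponential_semivariogram, points_per_side, spacing, lag_steps)` for increasing `points_per_side` traces the coefficient of variation against the number of pairs, and varying `lag_steps` traces it against the lag. Passing a quadratic semivariogram such as `lambda h: h * h` gives a correlation of one for every pair, which is the fully correlated special case.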

### INTRODUCTION TO GEOSTATISTICS And VARIOGRAM ANALYSIS

INTRODUCTION TO GEOSTATISTICS And VARIOGRAM ANALYSIS C&PE 940, 17 October 2005 Geoff Bohling Assistant Scientist Kansas Geological Survey geoff@kgs.ku.edu 864-2093 Overheads and other resources available

### Basics of Point-Referenced Data Models

Basics of Point-Referenced Data Models Basic tool is that of a spatial process, {Y (s),s D}, where D R r Note that time series follows this approach with r = 1; we will usually have r = 2 or 3 We begin

### Limitations of Indicator Kriging for Predicting Data with Trend

Limitations of Indicator Kriging for Predicting Data with Trend Andreas Papritz ETH Zurich, Department of Environmental Sciences, Zurich, Switzerland papritz@env.ethz.ch Abstract. Goovaerts and Journel

### ELEC-E8104 Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems

Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems Minimum Mean Square Error (MMSE) MMSE estimation of Gaussian random vectors Linear MMSE estimator for arbitrarily distributed

### Geography 4203 / 5203. GIS Modeling. Class (Block) 9: Variogram & Kriging

Geography 4203 / 5203 GIS Modeling Class (Block) 9: Variogram & Kriging Some Updates Today class + one proposal presentation Feb 22 Proposal Presentations Feb 25 Readings discussion (Interpolation) Last

### Annealing Techniques for Data Integration

Reservoir Modeling with GSLIB Annealing Techniques for Data Integration Discuss the Problem of Permeability Prediction Present Annealing Cosimulation More Details on Simulated Annealing Examples SASIM

### Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

### AMARILLO BY MORNING: DATA VISUALIZATION IN GEOSTATISTICS

AMARILLO BY MORNING: DATA VISUALIZATION IN GEOSTATISTICS William V. Harper 1 and Isobel Clark 2 1 Otterbein College, United States of America 2 Alloa Business Centre, United Kingdom wharper@otterbein.edu

### Kriging Interpolation

Kriging Interpolation Kriging is a geostatistical interpolation technique that considers both the distance and the degree of variation between known data points when estimating values in unknown areas.

### Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

### An introduction to Value-at-Risk Learning Curve September 2003

An introduction to Value-at-Risk Learning Curve September 2003 Value-at-Risk The introduction of Value-at-Risk (VaR) as an accepted methodology for quantifying market risk is part of the evolution of risk

### MATH BOOK OF PROBLEMS SERIES. New from Pearson Custom Publishing!

MATH BOOK OF PROBLEMS SERIES New from Pearson Custom Publishing! The Math Book of Problems Series is a database of math problems for the following courses: Pre-algebra Algebra Pre-calculus Calculus Statistics

### Measuring Line Edge Roughness: Fluctuations in Uncertainty

Tutor6.doc: Version 5/6/08 T h e L i t h o g r a p h y E x p e r t (August 008) Measuring Line Edge Roughness: Fluctuations in Uncertainty Line edge roughness () is the deviation of a feature edge (as

### Higher Education Math Placement

Higher Education Math Placement Placement Assessment Problem Types 1. Whole Numbers, Fractions, and Decimals 1.1 Operations with Whole Numbers Addition with carry Subtraction with borrowing Multiplication

### Statistical and Geostatistical Analysis of Rain Fall in Central Japan

amedas-iamg21-shoji-proc.doc 6/29/1 (2:39 PM) Statistical and Geostatistical Analysis of Rain Fall in Central Japan Tetsuya Shoji and Hisashi Kitaura Department of Environment Systems School of Frontier

### Autocovariance and Autocorrelation

Chapter 3 Autocovariance and Autocorrelation If the {X n } process is weakly stationary, the covariance of X n and X n+k depends only on the lag k. This leads to the following definition of the autocovariance

### L10: Probability, statistics, and estimation theory

L10: Probability, statistics, and estimation theory Review of probability theory Bayes theorem Statistics and the Normal distribution Least Squares Error estimation Maximum Likelihood estimation Bayesian

### Using Hierarchical Linear Models to Measure Growth. Measurement Incorporated Hierarchical Linear Models Workshop

Using Hierarchical Linear Models to Measure Growth Measurement Incorporated Hierarchical Linear Models Workshop Chapter 6 Motivation Linear Growth Curves Quadratic Growth Curves Some other Growth Curves

### Time Series Analysis

Time Series Analysis Time series and stochastic processes Andrés M. Alonso Carolina García-Martos Universidad Carlos III de Madrid Universidad Politécnica de Madrid June July, 2012 Alonso and García-Martos

Ronald Christensen Advanced Linear Modeling Multivariate, Time Series, and Spatial Data; Nonparametric Regression and Response Surface Maximization Second Edition Springer Preface to the Second Edition

### An Interactive Tool for Residual Diagnostics for Fitting Spatial Dependencies (with Implementation in R)

DSC 2003 Working Papers (Draft Versions) http://www.ci.tuwien.ac.at/conferences/dsc-2003/ An Interactive Tool for Residual Diagnostics for Fitting Spatial Dependencies (with Implementation in R) Ernst

### Robustness of the Ljung-Box Test and its Rank Equivalent

Robustness of the Test and its Rank Equivalent Patrick Burns patrick@burns-stat.com http://www.burns-stat.com This Draft: 6 October 2002 Abstract: The test is known to be robust. This paper reports on

### Current Standard: Mathematical Concepts and Applications Shape, Space, and Measurement- Primary

Shape, Space, and Measurement- Primary A student shall apply concepts of shape, space, and measurement to solve problems involving two- and three-dimensional shapes by demonstrating an understanding of:

### Data Preparation and Statistical Displays

Reservoir Modeling with GSLIB Data Preparation and Statistical Displays Data Cleaning / Quality Control Statistics as Parameters for Random Function Models Univariate Statistics Histograms and Probability

### Chapter 8 Graphs and Functions:

Chapter 8 Graphs and Functions: Cartesian axes, coordinates and points 8.1 Pictorially we plot points and graphs in a plane (flat space) using a set of Cartesian axes traditionally called the x and y axes

### FE670 Algorithmic Trading Strategies. Stevens Institute of Technology

FE670 Algorithmic Trading Strategies Lecture 6. Portfolio Optimization: Basic Theory and Practice Steve Yang Stevens Institute of Technology 10/03/2013 Outline 1 Mean-Variance Analysis: Overview 2 Classical

### Least Squares Estimation

Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

### Probability and Random Variables. Generation of random variables (r.v.)

Probability and Random Variables Method for generating random variables with a specified probability distribution function. Gaussian And Markov Processes Characterization of Stationary Random Process Linearly

### SYSM 6304: Risk and Decision Analysis Lecture 3 Monte Carlo Simulation

SYSM 6304: Risk and Decision Analysis Lecture 3 Monte Carlo Simulation M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu September 19, 2015 Outline

### Variogram Fitting by Generalized Least Squares Using an Explicit Formula for the Covariance Structure 1. Marc G. Genton 2

Mathematical Geology, Vol. 30, No. 4, 1998 Variogram Fitting by Generalized Least Squares Using an Explicit Formula for the Covariance Structure 1 Marc G. Genton 2 In the context of spatial statistics,

### Review of Fundamental Mathematics

Review of Fundamental Mathematics As explained in the Preface and in Chapter 1 of your textbook, managerial economics applies microeconomic theory to business decision making. The decision-making tools

Lecture 5: Linear least-squares Regression III: Advanced Methods William G. Jacoby Department of Political Science Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Simple Linear Regression

### Least-Squares Intersection of Lines

Least-Squares Intersection of Lines Johannes Traa - UIUC 2013 This write-up derives the least-squares solution for the intersection of lines. In the general case, a set of lines will not intersect at a

Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.

### RANDOM VIBRATION AN OVERVIEW by Barry Controls, Hopkinton, MA

RANDOM VIBRATION AN OVERVIEW by Barry Controls, Hopkinton, MA ABSTRACT Random vibration is becoming increasingly recognized as the most realistic method of simulating the dynamic environment of military

### Problem Set 4: Covariance functions and Gaussian random fields

Problem Set 4: Covariance functions and Gaussian random fields GEOS 7: Inverse Problems and Parameter Estimation, Carl Tape Assigned: February 9, 15 Due: February 1, 15 Last compiled: February 5, 15 Overview

### The Delta Method and Applications

Chapter 5 The Delta Method and Applications 5.1 Linear approximations of functions In the simplest form of the central limit theorem, Theorem 4.18, we consider a sequence X 1, X,... of independent and

### Instrumental Variables & 2SLS

Instrumental Variables & 2SLS y 1 = β 0 + β 1 y 2 + β 2 z 1 +... β k z k + u y 2 = π 0 + π 1 z k+1 + π 2 z 1 +... π k z k + v Economics 20 - Prof. Schuetze 1 Why Use Instrumental Variables? Instrumental

### Appendix E: Graphing Data

You will often make scatter diagrams and line graphs to illustrate the data that you collect. Scatter diagrams are often used to show the relationship between two variables. For example, in an absorbance

### UNCERT: GEOSTATISTICAL, GROUND WATER MODELING, AND VISUALIZATION SOFTWARE

CHAPTER 6 UNCERT: GEOSTATISTICAL, GROUND WATER MODELING, AND VISUALIZATION SOFTWARE UNCERT is an uncertainty analysis, geostatistical, ground water modeling, and visualization software package. It was

### Applied Spatial Statistics in R, Section 5

Applied Spatial Statistics in R, Section 5 Geostatistics Yuri M. Zhukov IQSS, Harvard University January 16, 2010 Yuri M. Zhukov (IQSS, Harvard University) Applied Spatial Statistics in R, Section 5 January

### QUADRATIC, EXPONENTIAL AND LOGARITHMIC FUNCTIONS

QUADRATIC, EXPONENTIAL AND LOGARITHMIC FUNCTIONS Content 1. Parabolas... 1 1.1. Top of a parabola... 2 1.2. Orientation of a parabola... 2 1.3. Intercept of a parabola... 3 1.4. Roots (or zeros) of a parabola...

### Supplement to Call Centers with Delay Information: Models and Insights

Supplement to Call Centers with Delay Information: Models and Insights Oualid Jouini 1 Zeynep Akşin 2 Yves Dallery 1 1 Laboratoire Genie Industriel, Ecole Centrale Paris, Grande Voie des Vignes, 92290

### Overview of Math Standards

Algebra 2 Welcome to math curriculum design maps for Manhattan-Ogden USD 383, striving to produce learners who are: Effective Communicators who clearly express ideas and effectively communicate with diverse

### Lecture 1: Introduction to Random Walks and Diffusion

Lecture 1: Introduction to Random Walks and Diffusion Scribe: Chris H. Rycroft (and Martin Z. Bazant) Department of Mathematics, MIT February, 5 History The term random walk was originally proposed by Karl

### European Options Pricing Using Monte Carlo Simulation

European Options Pricing Using Monte Carlo Simulation Alexandros Kyrtsos Division of Materials Science and Engineering, Boston University akyrtsos@bu.edu European options can be priced using the analytical

### Lecture 1: Univariate Time Series B41910: Autumn Quarter, 2008, by Mr. Ruey S. Tsay

Lecture 1: Univariate Time Series B41910: Autumn Quarter, 2008, by Mr. Ruey S. Tsay 1 Some Basic Concepts 1. Time Series: A sequence of random variables measuring certain quantity of interest over time.

### College Algebra. Barnett, Raymond A., Michael R. Ziegler, and Karl E. Byleen. College Algebra, 8th edition, McGraw-Hill, 2008, ISBN: 978-0-07-286738-1

College Algebra Course Text Barnett, Raymond A., Michael R. Ziegler, and Karl E. Byleen. College Algebra, 8th edition, McGraw-Hill, 2008, ISBN: 978-0-07-286738-1 Course Description This course provides

### Principal Components Analysis (PCA)

Principal Components Analysis (PCA) Janette Walde janette.walde@uibk.ac.at Department of Statistics University of Innsbruck Outline I Introduction Idea of PCA Principle of the Method Decomposing an Association

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### Jitter Measurements in Serial Data Signals

Jitter Measurements in Serial Data Signals Michael Schnecker, Product Manager LeCroy Corporation Introduction The increasing speed of serial data transmission systems places greater importance on measuring

### HLM: A Gentle Introduction

HLM: A Gentle Introduction D. Betsy McCoach, Ph.D. Associate Professor, Educational Psychology Neag School of Education University of Connecticut Other names for HLMs Multilevel Linear Models incorporate

Academic Content Standards Grade Eight and Grade Nine Ohio Algebra 1 2008 Grade Eight STANDARDS Number, Number Sense and Operations Standard Number and Number Systems 1. Use scientific notation to express

### Deriving character tables: Where do all the numbers come from?

3.4. Characters and Character Tables 3.4.1. Deriving character tables: Where do all the numbers come from? A general and rigorous method for deriving character tables is based on five theorems which in

### INTRODUCTORY STATISTICS

INTRODUCTORY STATISTICS FIFTH EDITION Thomas H. Wonnacott University of Western Ontario Ronald J. Wonnacott University of Western Ontario WILEY JOHN WILEY & SONS New York Chichester Brisbane Toronto Singapore

### Spatial sampling effect of laboratory practices in a porphyry copper deposit

Spatial sampling effect of laboratory practices in a porphyry copper deposit Serge Antoine Séguret Centre of Geosciences and Geoengineering/ Geostatistics, MINES ParisTech, Fontainebleau, France ABSTRACT

### Imputing Missing Data using SAS

ABSTRACT Paper 3295-2015 Imputing Missing Data using SAS Christopher Yim, California Polytechnic State University, San Luis Obispo Missing data is an unfortunate reality of statistics. However, there are

### Introduction to Monte-Carlo Methods

Introduction to Monte-Carlo Methods Bernard Lapeyre Halmstad January 2007 Monte-Carlo methods are extensively used in financial institutions to compute European options prices to evaluate sensitivities

### Integrated Resource Plan

Integrated Resource Plan March 19, 2004 PREPARED FOR KAUA I ISLAND UTILITY COOPERATIVE LCG Consulting 4962 El Camino Real, Suite 112 Los Altos, CA 94022 650-962-9670 1 IRP 1 ELECTRIC LOAD FORECASTING 1.1

### Testing for serial correlation in linear panel-data models

The Stata Journal (2003) 3, Number 2, pp. 168–177 Testing for serial correlation in linear panel-data models David M. Drukker Stata Corporation Abstract. Because serial correlation in linear panel-data

### Probability and Statistics

CHAPTER 2: RANDOM VARIABLES AND ASSOCIATED FUNCTIONS 2b - 0 Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be

### Increasing for all. Convex for all. ( ) Increasing for all (remember that the log function is only defined for ). ( ) Concave for all.

1. Differentiation The first derivative of a function measures by how much changes in reaction to an infinitesimal shift in its argument. The larger the derivative (in absolute value), the faster is evolving.

### ATTACHMENT 8: Quality Assurance Hydrogeologic Characterization of the Eastern Turlock Subbasin

ATTACHMENT 8: Quality Assurance Hydrogeologic Characterization of the Eastern Turlock Subbasin Quality assurance and quality control (QA/QC) policies and procedures will ensure that the technical services

### Java Modules for Time Series Analysis

Java Modules for Time Series Analysis Agenda Clustering Non-normal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series

### Understanding Poles and Zeros

MASSACHUSETTS INSTITUTE OF TECHNOLOGY DEPARTMENT OF MECHANICAL ENGINEERING 2.14 Analysis and Design of Feedback Control Systems Understanding Poles and Zeros 1 System Poles and Zeros The transfer function

### 4.3 Moving Average Process MA(q)

66 CHAPTER 4. STATIONARY TS MODELS 4.3 Moving Average Process MA(q) Definition 4.5. {X t } is a moving-average process of order q if X t = Z t + θ 1 Z t 1 +... + θ q Z t q, (4.9) where and θ 1,...,θ q

### Tutorial on Markov Chain Monte Carlo

Tutorial on Markov Chain Monte Carlo Kenneth M. Hanson Los Alamos National Laboratory Presented at the 29 th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Technology,

### Joint Probability Distributions and Random Samples. Week 5, 2011 Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage

5 Joint Probability Distributions and Random Samples Week 5, 2011 Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage Two Discrete Random Variables The probability mass function (pmf) of a single

### NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS

NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS TEST DESIGN AND FRAMEWORK September 2014 Authorized for Distribution by the New York State Education Department This test design and framework document

### Algebra and Geometry Review (61 topics, no due date)

Course Name: Math 112 Credit Exam LA Tech University Course Code: ALEKS Course: Trigonometry Instructor: Course Dates: Course Content: 159 topics Algebra and Geometry Review (61 topics, no due date) Properties

### Standard Deviation Calculator

CSS.com Chapter 35 Standard Deviation Calculator Introduction The is a tool to calculate the standard deviation from the data, the standard error, the range, percentiles, the COV, confidence limits, or

### Predictability of Average Inflation over Long Time Horizons

Predictability of Average Inflation over Long Time Horizons Allan Crawford, Research Department Uncertainty about the level of future inflation adversely affects the economy because it distorts savings

### 1 Teaching notes on GMM 1.

Bent E. Sørensen January 23, 2007 1 Teaching notes on GMM 1. Generalized Method of Moment (GMM) estimation is one of two developments in econometrics in the 80ies that revolutionized empirical work in

### Geostatistics Exploratory Analysis

Instituto Superior de Estatística e Gestão de Informação Universidade Nova de Lisboa Master of Science in Geospatial Technologies Geostatistics Exploratory Analysis Carlos Alberto Felgueiras cfelgueiras@isegi.unl.pt

### Estimation and Inference in Cointegration Models Economics 582

Estimation and Inference in Cointegration Models Economics 582 Eric Zivot May 17, 2012 Tests for Cointegration Let the ( 1) vector Y be (1). Recall, Y is cointegrated with 0 cointegrating vectors if there

### What are the place values to the left of the decimal point and their associated powers of ten?

The verbal answers to all of the following questions should be memorized before completion of algebra. Answers that are not memorized will hinder your ability to succeed in geometry and algebra. (Everything

### 2.2 Creaseness operator

2.2. Creaseness operator 31 2.2 Creaseness operator Antonio López, a member of our group, has studied for his PhD dissertation the differential operators described in this section [72]. He has compared

### Master s Theory Exam Spring 2006

Spring 2006 This exam contains 7 questions. You should attempt them all. Each question is divided into parts to help lead you through the material. You should attempt to complete as much of each problem

### Monte Carlo Methods in Finance

Author: Yiyang Yang Advisor: Pr. Xiaolin Li, Pr. Zari Rachev Department of Applied Mathematics and Statistics State University of New York at Stony Brook October 2, 2012 Outline Introduction 1 Introduction

### Multi scale random field simulation program

Multi scale random field simulation program 1.15. 2010 (Updated 12.22.2010) Andrew Seifried, Stanford University Introduction This is a supporting document for the series of Matlab scripts used to perform

### Tutorial - PEST. Visual MODFLOW Flex. Integrated Conceptual & Numerical Groundwater Modeling

Tutorial - PEST Visual MODFLOW Flex Integrated Conceptual & Numerical Groundwater Modeling PEST with Pilot Points Tutorial This exercise demonstrates some of the advanced and exciting opportunities for

### Chapter 4: Vector Autoregressive Models

Chapter 4: Vector Autoregressive Models 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie IV.1 Vector Autoregressive Models (VAR)...

### Prentice Hall Mathematics: Algebra 1 2007 Correlated to: Michigan Merit Curriculum for Algebra 1

STRAND 1: QUANTITATIVE LITERACY AND LOGIC STANDARD L1: REASONING ABOUT NUMBERS, SYSTEMS, AND QUANTITATIVE SITUATIONS Based on their knowledge of the properties of arithmetic, students understand and reason

### Accuracy of the coherent potential approximation for a one-dimensional array with a Gaussian distribution of fluctuations in the on-site potential

Accuracy of the coherent potential approximation for a one-dimensional array with a Gaussian distribution of fluctuations in the on-site potential I. Avgin Department of Electrical and Electronics Engineering,

### Econometric Analysis of Cross Section and Panel Data Second Edition. Jeffrey M. Wooldridge. The MIT Press Cambridge, Massachusetts London, England

Econometric Analysis of Cross Section and Panel Data Second Edition Jeffrey M. Wooldridge The MIT Press Cambridge, Massachusetts London, England Preface Acknowledgments xxi xxix I INTRODUCTION AND BACKGROUND

### Time Series Analysis

Time Series Analysis hm@imm.dtu.dk Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby 1 Outline of the lecture Identification of univariate time series models, cont.:

### **BEGINNING OF EXAMINATION** The annual number of claims for an insured has probability function: , 0 < q < 1.

**BEGINNING OF EXAMINATION** 1. You are given: (i) The annual number of claims for an insured has probability function: 3 p x q q x x ( ) = ( 1 ) 3 x, x = 0,1,, 3 (ii) The prior density is π ( q) = q,

### Treatment and analysis of data Applied statistics Lecture 3: Sampling and descriptive statistics

Treatment and analysis of data Applied statistics Lecture 3: Sampling and descriptive statistics Topics covered: Parameters and statistics Sample mean and sample standard deviation Order statistics and

### CHAPTER 2 Estimating Probabilities

CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a

### San Jose State University Engineering 10 1

KY San Jose State University Engineering 10 1 Select Insert from the main menu Plotting in Excel Select All Chart Types San Jose State University Engineering 10 2 Definition: A chart that consists of multiple

### 15.062 Data Mining: Algorithms and Applications Matrix Math Review

15.062 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models

### Instrumental Variables Regression. Instrumental Variables (IV) estimation is used when the model has endogenous s.

Instrumental Variables Regression Instrumental Variables (IV) estimation is used when the model has endogenous s. IV can thus be used to address the following important threats to internal validity: Omitted

### 18.6.1 Terms concerned with internal quality control procedures

18.6.1 Terms concerned with internal quality control procedures Quality assurance in analytical laboratories Quality assurance is the essential organisational infrastructure that underlies all reliable

### The Bivariate Normal Distribution

The Bivariate Normal Distribution This is Section 4.7 of the 1st edition (2002) of the book Introduction to Probability, by D. P. Bertsekas and J. N. Tsitsiklis. The material in this section was not included

### Florida Math for College Readiness

Core Florida Math for College Readiness Florida Math for College Readiness provides a fourth-year math curriculum focused on developing the mastery of skills identified as critical to postsecondary readiness

### STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics rsalakhu@utstat.toronto.edu http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

### 1. χ 2 minimization 2. Fits in case of systematic errors

Data fitting Volker Blobel University of Hamburg March 2005 1. χ 2 minimization 2. Fits in case of systematic errors Keys during display: enter = next page; = next page; = previous page; home = first

### Distributed computing of failure probabilities for structures in civil engineering

Distributed computing of failure probabilities for structures in civil engineering Andrés Wellmann Jelic, University of Bochum (andres.wellmann@rub.de) Matthias Baitsch, University of Bochum (matthias.baitsch@rub.de)