Spatial Interpolation

1 Spatial Interpolation: Inverse Distance Weighting, The Variogram, Kriging. Many thanks to Bill Harper for his insights in Practical Geostatistics 2000 and in personal conversation.

2 Geostatistics Includes a wide variety of techniques, including IDW, nearest neighbor analysis and linear or nonlinear kriging, using one or more variables. Commonly used to identify and map spatial patterns across a landscape. Can be used to determine if spatial autocorrelation exists between data points. For this, the most common function used is the (semi)variogram. The variogram is a mathematical description of the relationship between the variance of pairs of observations (data points) and the distance separating these observations (h). Spatial autocorrelation can then be used to make better estimates for unsampled data points (inference = kriging).

3 Objectives In this session we will evaluate a dataset and attempt to: explore the theory and implementation of inverse distance weighting; evaluate issues with IDW interpolation; explore the theory and implementation of the semi-variogram and its applicability to interpolation; explore the theory and implementation of kriging and its applicability to interpolation.

4 Data Set Simulated borehole data (PG 2000). Iron concentration. Need to interpolate iron content for unsampled areas. General statistics: 47 samples; mean value: 36.3; S.D.: 3.73

5 General Statistics Histogram shows the relative distribution of the data Generally follows a normal distribution Other observations Minor skew, no big deal

6 Data Set The best unbiased estimate for the standard deviation is 3.76 (see right). Therefore, we are 90% confident that a point drawn at random would be: 30 < T < 42.6. This is based on consulting a Student's t distribution with 47 samples.
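The interval on this slide can be reproduced with a few lines of arithmetic. The sketch below is a minimal illustration, not the authors' worksheet: it assumes the interval is simply mean ± t × S.D., and the critical value 1.679 (5% in each tail, 46 degrees of freedom) is a tabulated value supplied here as an assumption. The function name is hypothetical.

```python
def random_draw_interval(mean, sd, t_crit):
    """Return (low, high) = mean -/+ t_crit * sd, the interval used on the slide."""
    half_width = t_crit * sd
    return mean - half_width, mean + half_width


# Mean and unbiased S.D. are the slide's values; t_crit is an assumed table lookup
# for a two-sided 90% interval with 47 - 1 = 46 degrees of freedom.
low, high = random_draw_interval(mean=36.3, sd=3.76, t_crit=1.679)
print(f"{low:.1f} < T < {high:.1f}")
```

With these inputs the limits round to 30.0 and 42.6, matching the slide.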

7 Subset of Area (northwest area) Subset of borehole data, upper left side. General statistics: 7 samples; mean value: 40; S.D.: 2.8. Getting somewhat better.

8 The best unbiased estimate for the standard deviation is 3.05 (see right). Therefore, we are 90% confident that a point drawn at random would be: 34. < T < 45.7. This is based on consulting a Student's t distribution with 7 samples. Now, the question is, do some of the points exhibit more influence than others? Probably, so let's evaluate the point taking nearness into account.

9 Inverse Distance Weighting IDW estimates an unknown value as a weighted average of known values, using an unbiased weight matrix (the weights sum to one) based on the distances from the unknown value to the known values. Weights may be defined in a number of different ways.
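As a rough illustration of the idea, the sketch below computes IDW weights as inverse distances raised to a power and normalizes them to sum to one. This is one common weighting choice, not necessarily the one used in the slides; the function names, the default power of 2, and the sample coordinates and grades are all hypothetical.

```python
import math


def idw_weights(target, points, power=2.0):
    """Normalized inverse-distance weights for each known point."""
    dists = [math.hypot(px - target[0], py - target[1]) for px, py in points]
    for i, d in enumerate(dists):
        if d == 0.0:
            # Target coincides with a sample: honor the sample value exactly.
            return [1.0 if j == i else 0.0 for j in range(len(dists))]
    raw = [d ** -power for d in dists]
    total = sum(raw)
    return [r / total for r in raw]


def idw_estimate(target, points, values, power=2.0):
    """Weighted average of the known values using IDW weights."""
    weights = idw_weights(target, points, power)
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical sample locations and iron grades (% Fe), for illustration only.
points = [(0.0, 0.0), (10.0, 0.0), (0.0, 20.0)]
values = [38.0, 42.0, 35.0]
estimate = idw_estimate((5.0, 5.0), points, values)
print(round(estimate, 2))
```

Raising the power concentrates weight on the nearest samples; a search radius, as used in the software, could be added by filtering `points` before the weights are computed.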

10 IDW ArcGIS provides a nice interface to view points. This example looks at 7 neighbors. Now, let's look at it the old-fashioned way.

11 IDW Using 7 neighboring points allows us to interpolate a value based on distances. Interpolated value is 39.9. So, our calculation is the same as that in ArcGIS; it's just math.

12 IDW Standard Error We will compute it, without considering the autocorrelation in the data: Standard error.75 Therefore, we are 90% confident that a point drawn at random would be: 34.7 < T < 45.1. This is based on consulting a Student's t distribution with 7 samples. Caveat: we are treating IDW like a weighted mean, and the standard deviation like a weighted standard deviation. In reality, you shouldn't develop confidence intervals for data that are autocorrelated.
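The caveat above treats the IDW estimate as a weighted mean and the spread as a weighted standard deviation. The sketch below shows one common definition (weights normalized to sum to one, no small-sample correction); the slides do not say which definition was used, so this is an assumption, and the grades and weights are hypothetical.

```python
import math


def weighted_mean_and_sd(values, weights):
    """Weighted mean and weighted standard deviation with normalized weights."""
    total = sum(weights)
    w = [wi / total for wi in weights]
    mean = sum(wi * vi for wi, vi in zip(w, values))
    variance = sum(wi * (vi - mean) ** 2 for wi, vi in zip(w, values))
    return mean, math.sqrt(variance)


# Hypothetical grades (% Fe) and unnormalized IDW-style weights.
grades = [38.0, 42.0, 35.0, 41.0]
raw_weights = [0.5, 0.2, 0.1, 0.2]
mean, sd = weighted_mean_and_sd(grades, raw_weights)
print(round(mean, 2), round(sd, 2))
```

As the slide warns, an interval built from this spread ignores autocorrelation and should be read with caution.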

13 IDW Methods Power =, search = 600 Power =, search = 30 Power = 4, search = 600 Power =, search = 150

14 10 Questions to Evaluate [1]
What function of distance should we use?
How do we handle different continuity in different directions?
How many samples should we include in the estimation?
How do we compensate for irregularly spaced or highly clustered sampling?
How far should we go to include samples in our estimation process?
Should we honor the sample values?
How reliable is the estimate when we have it?
Why is our map too smooth?
What happens if our sample data is not Normal?
What happens if there is a strong trend in the values?
[1] Clark and Harper, Practical Geostatistics 2000. Ecosse North America, LLC.

15 Answering the 10 Questions The Variogram

16 What is a Semi-Variogram? The semi-variogram is a function that relates semi-variance (or dissimilarity) of data points to the distance that separates them. It is the mathematical description of the relationship between the variance of pairs of observations (data points) and the distance separating these observations (h). If we can understand the difference between an unknown quantity and a known quantity, we can estimate the unknown point.

17 Estimating via semivariogram Let's assume the relationship between the unknown and known point depends on distance: 11 feet NE/SW. If these two points have the same relationship as the other points, we can look at the other points that are 11 feet apart NE/SW.

18 Computing the standard differences For all 31 pairs we can compute the standard deviation of the differences. We are assuming a mean of 0 and a normal distribution:

$$ s_{11,NE}^2 = \frac{1}{N_{11,NE}} \sum (g_i - g_j)^2 = \frac{1}{31}\left[(35-37)^2 + (37-36)^2 + \dots + (35-34)^2\right], \qquad s_{11,NE} = 2.74 $$

19 Computing the standard differences The single point we are looking at is 37% Fe. If our original samples come from a normal distribution, the differences will be normal, so we can be 90% confident that a point drawn at random would be:

$$ g_1 - t_{.05,31}\, s_{11,NE} < T < g_1 + t_{.05,31}\, s_{11,NE} $$

32.4% Fe < T < 41.6% Fe
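The calculation on these two slides can be sketched as follows. The three demo pairs are only a small illustrative subset, not the full set of 31. The values s = 2.74 and t = 1.6955 (5% in each tail, 31 degrees of freedom) are assumptions chosen because they reproduce the 41.6% Fe upper limit; the function name is hypothetical.

```python
import math


def rms_difference(pairs):
    """Square root of the mean squared difference, assuming a mean difference of 0."""
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / len(pairs))


# A small illustrative subset of sample pairs separated by the same lag.
demo_pairs = [(35, 37), (37, 36), (38, 39)]
demo_s = rms_difference(demo_pairs)

# Interval around the known point (37% Fe); s and t_crit are assumed values.
g1, s, t_crit = 37.0, 2.74, 1.6955
low, high = g1 - t_crit * s, g1 + t_crit * s
print(f"{low:.1f}% Fe < T < {high:.1f}% Fe")
```

The interval is symmetric about the known grade, so the lower limit follows directly from the upper one.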

20 Taking the semi-variogram further Chances are, we won't get to sample our data on a regular grid. We have to algebraically define some function of distance with the differences in value. Therefore, we will assign h to the distance:

$$ s_h^2 = \frac{1}{N_h} \sum (g_i - g_j)^2 $$

21 Variograms Variogram:

$$ \gamma(h) = \tfrac{1}{2}\,\mathrm{var}\left[ Z(x) - Z(x+h) \right] = \tfrac{1}{2}\,E\left[ \{Z(x) - Z(x+h)\}^2 \right] $$

In practice:

$$ \gamma(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ Z(x_i) - Z(x_i + h) \right]^2 $$

Where: N(h) is the total number of pairs of observations separated by a distance h. The fitted curve minimizes the variance of the errors.
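The "in practice" estimator can be sketched as a loop over all pairs of points, grouping pairs into lag classes by separation distance. This is a minimal omnidirectional version under stated assumptions: each pair is assigned to the nearest multiple of the lag size, direction is ignored, and the function name and test data are hypothetical.

```python
import math


def empirical_semivariogram(points, values, lag_size, n_lags):
    """Return [(lag distance, gamma, pair count)] for lag classes 1..n_lags."""
    sums = [0.0] * (n_lags + 1)
    counts = [0] * (n_lags + 1)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            h = math.dist(points[i], points[j])
            k = int(h / lag_size + 0.5)  # nearest lag class
            if 1 <= k <= n_lags:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    # gamma(h) = sum of squared differences / (2 * number of pairs)
    return [(k * lag_size, sums[k] / (2 * counts[k]), counts[k])
            for k in range(1, n_lags + 1) if counts[k] > 0]


# Hypothetical transect of four samples spaced one unit apart.
transect = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
grades = [1.0, 2.0, 4.0, 7.0]
for lag, gamma, n_pairs in empirical_semivariogram(transect, grades, 1.0, 3):
    print(lag, round(gamma, 3), n_pairs)
```

The pair count returned for each lag makes it easy to check the rule of thumb about needing enough pairs per point before trusting the sample variogram.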

22 Variogram components Nugget variance: a non-zero value for γ as h approaches 0, produced by various sources of unexplained error (e.g. measurement error). Sill: for large values of h the variogram levels out, indicating that there is no longer any correlation between data points. The sill should be equal to the variance of the data set. Range: the value of h where the sill occurs (or 95% of the value of the sill). In general, 30 or more pairs per point are needed to generate a reasonable sample variogram. The most important part of a variogram is its shape near the origin, as the closest points are given more weight in the interpolation process.

23 Variogram models Variogram models must yield a positive definite covariance matrix so that the matrix can be inverted (which occurs in the kriging process). Because of this, only certain models can be used.
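One of the permissible models is the spherical model, which is used for the borehole data later in these slides. The sketch below uses the standard textbook form; the parameter names (`nugget`, `partial_sill`, `rng`) and the example numbers are assumptions for illustration, not values from the slides.

```python
def spherical_variogram(h, nugget, partial_sill, rng):
    """Spherical model: rises from the nugget and levels off at the sill at h = rng."""
    if h <= 0.0:
        return 0.0  # gamma(0) is zero by definition
    if h >= rng:
        return nugget + partial_sill  # total sill
    r = h / rng
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)


# Hypothetical parameters: nugget 1, partial sill 8, range 100 distance units.
for h in (0.0, 50.0, 100.0, 150.0):
    print(h, spherical_variogram(h, nugget=1.0, partial_sill=8.0, rng=100.0))
```

Here the total sill is the nugget plus the partial sill, and the curve is flat beyond the range.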

24 Semi-variogram models We can enter some numbers in Mathcad and see how the variogram changes.

25 Effect of lag size on variograms Variogram with a lag size of 0m. Variogram with a lag size of 00m.

26 Anisotropy There may be higher spatial autocorrelation in one direction than in others, which is called anisotropy: The figure shows a case of geometric anisotropy, which is incorporated in the variogram model by means of a linear transformation.

27 Semi-variogram tips We are assuming a normal distribution. The semi-variogram gives us a picture of the relationship of data values with distance. If you don't have a good spatial structure in the semi-variogram, don't revert to IDW; this is stupid!!!

28 Comparing Software for Computing the Semi-Variogram Practical Geostatistics 2000; ArcGIS Geostatistical Analyst

29 Assessing Fit of the Variogram Cressie Goodness of Fit For each point used to create the variogram, match how well the model actually fits it

30 Kriging Kriging is based on the idea that you can make inferences regarding a random function Z(x), given data points Z(x_1), Z(x_2), ..., Z(x_n). Z(x) = m(x) + γ(h) + ε. Three components: structural (constant mean), random spatially correlated component, and residual error.

31 Kriging This is our variogram from the borehole data To discuss the mathematics of kriging, we will look at a simple example of 3 points, and get back to our data in a moment

32 Kriging Numerical Example of Iron Ore Data From Practical Geostatistics 2000

33 Data Set Iron Ore Data, based on sample set from PG 2000. Three-point example for simplicity.

34 Calculating Distances The first thing we do is determine the distances between each point Also calculate difference in Z values between all points

35 Semi Variogram We apply the GLM, based on other tests performed on the data The values chosen give the best Cressie statistics for fit on all data points Note: Mathcad is not great at creating semivariograms!!!

36 Computing Weights Using basic matrix algebra, we can solve for the weights. The weights will add to one, due to our eventual sleight of hand with the last row.
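The matrix step can be sketched as an ordinary kriging system in which the last row and column of ones force the weights to sum to one. This is a minimal illustration, not the Mathcad worksheet from the slides: the linear variogram γ(h) = h, the sample coordinates, and the grades are all hypothetical, and a small Gaussian elimination routine stands in for a matrix library.

```python
import math


def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ordinary_kriging(target, points, values, gamma):
    """Return (estimate, kriging variance, weights) for one target location."""
    n = len(points)
    # Variogram values between samples, bordered by a column and row of ones.
    a = [[gamma(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])  # the "last row": forces weights to sum to one
    b = [gamma(math.dist(p, target)) for p in points] + [1.0]
    solution = solve(a, b)
    weights, mu = solution[:n], solution[n]  # mu is the Lagrange multiplier
    estimate = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return estimate, variance, weights


def linear_gamma(h):
    """Hypothetical linear variogram with unit slope and no nugget."""
    return h


# Hypothetical three-point example (coordinates and % Fe grades are made up).
pts = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
fe = [38.0, 42.0, 35.0]
est, var, w = ordinary_kriging((3.0, 3.0), pts, fe, linear_gamma)
print(round(est, 2), round(math.sqrt(var), 2), [round(wi, 3) for wi in w])
```

The square root of the kriging variance plays the role of the standard error reported alongside the estimate. With a zero nugget, kriging at a sample location returns that sample's value with zero variance.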

38 Solving the Unknown Basic matrix algebra will solve for the unknown value We also compute the standard error and variance

39 Solving Our Borehole Data Start with our original example Since we have 7 points rather than 3, the screens will be busier

40 Borehole Data The ability to create semi-variograms in Mathcad is pretty bad, but this allows us to visualize the mathematics. Here we are using the spherical model.

41 Borehole Data Again, we can see with this dataset the weights also add up to one

42 Solution Here we've computed the value of the unknown point, and the standard error. This was based on the limited set of 7 points; now we'll do it with the rest.

43 Predicting the Point ArcGIS has a good interface for evaluating the weights of the points, in addition to predicting a test location

44 Kriging Results ESRI Geostatistical Analyst Interpolated value 41.6 Standard error.16 PG 2000 Interpolated value Standard error.11

45 Standard Errors Based on Kriging results, we can assume the true value of the unknown point, with 90% confidence as: 37.6 < < %Fe So, we are getting better results, better looking maps, and smaller confidence intervals

46 IDW vs. Kriging Kriging appears to give a more natural look to the data. Kriging avoids the bull's-eye effect. Kriging also gives us a standard error.

47 Results Descriptive Statistics N.W. Corner IDW Variogram Kriging Mean Limits Range

48 Review of 10 Questions to Ask
What function of distance should we use? The variogram shows us the spatial structure and association of the data, and will give us a hint as to what function to use.
How do we handle different continuity in different directions? Here again, the variogram will tell us whether there is any spatial association, and we can determine which direction by evaluating whether anisotropy exists.
How many samples should we include in the estimation? Again, we can look at the variogram.
How do we compensate for irregularly spaced or highly clustered sampling? The variogram defines the relationship between points and their distances from other points. Calculating weights in kriging takes the distances among all points into account.

49 10 Questions to Ask
How far should we go to include samples in our estimation process? By looking at the variogram we can identify the sill (that area where the spatial correlation has little value). The range tells us the distance where the points are no longer correlated.
Should we honor the sample values? Still lots of debate on this one. IDW says yes; that's why we get the bull's-eye. The nugget effect in kriging allows us to say no. But we can set the nugget to zero with kriging.
How reliable is the estimate when we have it? Kriging allows us to compute the standard error.
Why is our IDW map too smooth? In IDW, when you include points far away they become part of the weights. Since the weights have to add up to one, you are basically taking power away from the closer ones.

50 10 Questions to Ask
What happens if our sample data is not Normal? Basically, make the data normal.
What happens if there is a strong trend in the values? First, remove the trend, then re-interpolate the points (see ESRI Calif. Ozone example, or Clark and Harper Wolfcamp Data).

51 Conclusions It is possible to interpolate an unknown point based on other points in a data set. While it can be done with descriptive statistics, other methods are clearly better. The variogram helps answer many questions related to our data, and provides a wealth of information related to the spatial structure of the data. More robust (geostatistical) methods for interpolation appear to provide better results.


More information

Equations, Lenses and Fractions

Equations, Lenses and Fractions 46 Equations, Lenses and Fractions The study of lenses offers a good real world example of a relation with fractions we just can t avoid! Different uses of a simple lens that you may be familiar with are

More information

Vertical Alignment Colorado Academic Standards 6 th - 7 th - 8 th

Vertical Alignment Colorado Academic Standards 6 th - 7 th - 8 th Vertical Alignment Colorado Academic Standards 6 th - 7 th - 8 th Standard 3: Data Analysis, Statistics, and Probability 6 th Prepared Graduates: 1. Solve problems and make decisions that depend on un

More information

Regression III: Advanced Methods

Regression III: Advanced Methods Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models

More information

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

More information

Exercise 1.12 (Pg. 22-23)

Exercise 1.12 (Pg. 22-23) Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

More information

Introduction to Regression and Data Analysis

Introduction to Regression and Data Analysis Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

More information

Nonparametric statistics and model selection

Nonparametric statistics and model selection Chapter 5 Nonparametric statistics and model selection In Chapter, we learned about the t-test and its variations. These were designed to compare sample means, and relied heavily on assumptions of normality.

More information

COMMON CORE STATE STANDARDS FOR

COMMON CORE STATE STANDARDS FOR COMMON CORE STATE STANDARDS FOR Mathematics (CCSSM) High School Statistics and Probability Mathematics High School Statistics and Probability Decisions or predictions are often based on data numbers in

More information

CHAPTER 14 NONPARAMETRIC TESTS

CHAPTER 14 NONPARAMETRIC TESTS CHAPTER 14 NONPARAMETRIC TESTS Everything that we have done up until now in statistics has relied heavily on one major fact: that our data is normally distributed. We have been able to make inferences

More information

Using Spatial Statistics In GIS

Using Spatial Statistics In GIS Using Spatial Statistics In GIS K. Krivoruchko a and C.A. Gotway b a Environmental Systems Research Institute, 380 New York Street, Redlands, CA 92373-8100, USA b Centers for Disease Control and Prevention;

More information

Introduction to Geostatistics

Introduction to Geostatistics Introduction to Geostatistics GEOL 5446 Dept. of Geology & Geophysics 3 Credits University of Wyoming Fall, 2013 Instructor: Ye Zhang Grading: A-F Location: ESB1006 Time: TTh (9:35 am~10:50 am), Office

More information

Session 7 Fractions and Decimals

Session 7 Fractions and Decimals Key Terms in This Session Session 7 Fractions and Decimals Previously Introduced prime number rational numbers New in This Session period repeating decimal terminating decimal Introduction In this session,

More information

Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

More information

Lab 11. Simulations. The Concept

Lab 11. Simulations. The Concept Lab 11 Simulations In this lab you ll learn how to create simulations to provide approximate answers to probability questions. We ll make use of a particular kind of structure, called a box model, that

More information

Data Exploration Data Visualization

Data Exploration Data Visualization Data Exploration Data Visualization What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping to select

More information

This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions.

This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions. Algebra I Overview View unit yearlong overview here Many of the concepts presented in Algebra I are progressions of concepts that were introduced in grades 6 through 8. The content presented in this course

More information

Using row reduction to calculate the inverse and the determinant of a square matrix

Using row reduction to calculate the inverse and the determinant of a square matrix Using row reduction to calculate the inverse and the determinant of a square matrix Notes for MATH 0290 Honors by Prof. Anna Vainchtein 1 Inverse of a square matrix An n n square matrix A is called invertible

More information

Promotional Forecast Demonstration

Promotional Forecast Demonstration Exhibit 2: Promotional Forecast Demonstration Consider the problem of forecasting for a proposed promotion that will start in December 1997 and continues beyond the forecast horizon. Assume that the promotion

More information

Risk Decomposition of Investment Portfolios. Dan dibartolomeo Northfield Webinar January 2014

Risk Decomposition of Investment Portfolios. Dan dibartolomeo Northfield Webinar January 2014 Risk Decomposition of Investment Portfolios Dan dibartolomeo Northfield Webinar January 2014 Main Concepts for Today Investment practitioners rely on a decomposition of portfolio risk into factors to guide

More information

Indicator 2: Use a variety of algebraic concepts and methods to solve equations and inequalities.

Indicator 2: Use a variety of algebraic concepts and methods to solve equations and inequalities. 3 rd Grade Math Learning Targets Algebra: Indicator 1: Use procedures to transform algebraic expressions. 3.A.1.1. Students are able to explain the relationship between repeated addition and multiplication.

More information

Simulating Investment Portfolios

Simulating Investment Portfolios Page 5 of 9 brackets will now appear around your formula. Array formulas control multiple cells at once. When gen_resample is used as an array formula, it assures that the random sample taken from the

More information

Big Ideas in Mathematics

Big Ideas in Mathematics Big Ideas in Mathematics which are important to all mathematics learning. (Adapted from the NCTM Curriculum Focal Points, 2006) The Mathematics Big Ideas are organized using the PA Mathematics Standards

More information

Petrel TIPS&TRICKS from SCM

Petrel TIPS&TRICKS from SCM Petrel TIPS&TRICKS from SCM Knowledge Worth Sharing Histograms and SGS Modeling Histograms are used daily for interpretation, quality control, and modeling in Petrel. This TIPS&TRICKS document briefly

More information

Projects Involving Statistics (& SPSS)

Projects Involving Statistics (& SPSS) Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,

More information

Introduction to Longitudinal Data Analysis

Introduction to Longitudinal Data Analysis Introduction to Longitudinal Data Analysis Longitudinal Data Analysis Workshop Section 1 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section 1: Introduction

More information

The Effect of Environmental Factors on Real Estate Value

The Effect of Environmental Factors on Real Estate Value The Effect of Environmental Factors on Real Estate Value Radoslaw CELLMER, Adam SENETRA, Agnieszka SZCZEPANSKA, Poland Key words: environment, landscape, property value, geostatistics SUMMARY The objective

More information

Algebra 1 Course Information

Algebra 1 Course Information Course Information Course Description: Students will study patterns, relations, and functions, and focus on the use of mathematical models to understand and analyze quantitative relationships. Through

More information

OUTLIER ANALYSIS. Data Mining 1

OUTLIER ANALYSIS. Data Mining 1 OUTLIER ANALYSIS Data Mining 1 What Are Outliers? Outlier: A data object that deviates significantly from the normal objects as if it were generated by a different mechanism Ex.: Unusual credit card purchase,

More information

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4) Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume

More information

Chapter 7 Section 7.1: Inference for the Mean of a Population

Chapter 7 Section 7.1: Inference for the Mean of a Population Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used

More information

Balancing Chemical Equations

Balancing Chemical Equations Balancing Chemical Equations A mathematical equation is simply a sentence that states that two expressions are equal. One or both of the expressions will contain a variable whose value must be determined

More information

Imputing Missing Data using SAS

Imputing Missing Data using SAS ABSTRACT Paper 3295-2015 Imputing Missing Data using SAS Christopher Yim, California Polytechnic State University, San Luis Obispo Missing data is an unfortunate reality of statistics. However, there are

More information

9.07 Introduction to Statistical Methods Homework 4. Name:

9.07 Introduction to Statistical Methods Homework 4. Name: 1. Estimating the population standard deviation and variance. Homework #2 contained a problem (#4) on estimating the population standard deviation. In that problem, you showed that the method of estimating

More information

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE 1 2 CONTENTS OF DAY 2 I. More Precise Definition of Simple Random Sample 3 Connection with independent random variables 3 Problems with small populations 8 II. Why Random Sampling is Important 9 A myth,

More information

Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data

Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data CMPE 59H Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Non-linear

More information

Inclusion and Exclusion Criteria

Inclusion and Exclusion Criteria Inclusion and Exclusion Criteria Inclusion criteria = attributes of subjects that are essential for their selection to participate. Inclusion criteria function remove the influence of specific confounding

More information

8. THE NORMAL DISTRIBUTION

8. THE NORMAL DISTRIBUTION 8. THE NORMAL DISTRIBUTION The normal distribution with mean μ and variance σ 2 has the following density function: The normal distribution is sometimes called a Gaussian Distribution, after its inventor,

More information

hp calculators HP 50g Trend Lines The STAT menu Trend Lines Practice predicting the future using trend lines

hp calculators HP 50g Trend Lines The STAT menu Trend Lines Practice predicting the future using trend lines The STAT menu Trend Lines Practice predicting the future using trend lines The STAT menu The Statistics menu is accessed from the ORANGE shifted function of the 5 key by pressing Ù. When pressed, a CHOOSE

More information

Data analysis and regression in Stata

Data analysis and regression in Stata Data analysis and regression in Stata This handout shows how the weekly beer sales series might be analyzed with Stata (the software package now used for teaching stats at Kellogg), for purposes of comparing

More information

Data Preparation and Statistical Displays

Data Preparation and Statistical Displays Reservoir Modeling with GSLIB Data Preparation and Statistical Displays Data Cleaning / Quality Control Statistics as Parameters for Random Function Models Univariate Statistics Histograms and Probability

More information