MATH2740: Environmental Statistics


1 MATH2740: Environmental Statistics Lecture 6: Distance Methods I February 10, 2016

2 Table of contents
1 Introduction: Problem with quadrat data; Distance methods
2 Point-object distances: Poisson process case; Rayleigh distribution; Distribution of object-object distances
3 Clark-Evans test: Clark-Evans test of randomness; Problems with the Clark-Evans test; Examples of Clark and Evans test; Problems with Clark-Evans test II

3 Problems with quadrat data
Quadrat methods can be inefficient to use in some circumstances:
- Time and cost to lay out and search all quadrats.
- The choice of quadrat size can influence the conclusions.
- Quadrat counts do not capture how the objects are arranged within the quadrats: plots can have the same quadrat counts but different spatial patterns.

4 Distance methods Distance methods try to overcome some of the problems associated with quadrat counting methods.

5 Types of distance measurement Distance measurements involve measuring: Distances from randomly selected points to the nearest neighbouring object, giving a point-object distance. Distances from a randomly selected object to the nearest neighbouring object, giving an object-object distance. This procedure requires us to know the locations of all objects within the study area to allow selecting objects randomly.

6 Example: Types of distance measurement Have a Poisson process with 30 objects within a unit square. Left: distances from four randomly selected objects in the study area to their nearest object. Gives object-object distances. Right: distances from four randomly located points in the study area to their nearest object. Gives point-object distances.

7 Other types of distance measurement I (NOT examined) Other types of distance measurement can be considered: Random object to the nth nearest neighbour. Random point to the nth nearest neighbour. Besag and Gleaves (1973) T-square sampling.

8 Other types of distance measurement II (NOT examined) Besag and Gleaves (1973) T-square sampling. Find the distance from a random point O to the nearest object P. Then find the distance from P to its nearest object Q, where Q is restricted to the half-plane on the far side of P from O (bounded by the line through P perpendicular to OP). This gives a point-object distance and an object-object distance. (Figure: T-square sampling with labelled points O, P and Q.)

9 Point-object distances I
Suppose object locations occur as a Poisson process with intensity λ (the mean number of objects per unit area is λ). The number X(A) of objects in a region A with size |A| has a Poisson distribution with mean µ = λ|A|, so
$$\mathrm{pr}\{X(A) = x\} = \frac{\mu^x e^{-\mu}}{x!} = \frac{(\lambda |A|)^x e^{-\lambda |A|}}{x!}, \qquad x = 0, 1, 2, \ldots$$
In particular, $\mathrm{pr}\{X(A) = 0\} = e^{-\lambda |A|}$.

10 Point-object distances II
Let R denote the distance from a random point to the nearest object. Consider a circle of radius r centred on the random point. (Figure: circle of radius r centred on the random point.) The distance R from a random point to the nearest object is greater than r if the circle of radius r and area $\pi r^2$ contains no objects.

11 Point-object distances III
The distance R from a random point to the nearest object satisfies
$$\mathrm{pr}\{R > r\} = \mathrm{pr}\{\text{no objects inside circle of radius } r\} = \mathrm{pr}\{X(A) = 0\},$$
where $X(A) \sim \mathrm{Poisson}(\mu = \lambda |A|)$ with $|A| = \pi r^2$. Hence
$$\mathrm{pr}\{R > r\} = \exp(-\lambda \pi r^2).$$
The cumulative distribution function of R is
$$F_R(r) = \mathrm{pr}\{R \le r\} = 1 - \mathrm{pr}\{R > r\} = 1 - \exp(-\lambda \pi r^2), \qquad r > 0.$$
The probability density function $f_R(r)$ of R is
$$f_R(r) = \frac{dF_R(r)}{dr} = 2 \lambda \pi r \exp(-\lambda \pi r^2), \qquad r > 0.$$
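The distribution function derived above can be checked by simulation. The following is a minimal sketch, assuming a Poisson process generated on a large square around the random point (placed at the origin) so that edge effects are negligible; the function names, the square half-width and the number of replicates are illustrative choices and do not come from the lecture.

```python
import math
import random


def rayleigh_cdf(r, lam):
    """Theoretical CDF F_R(r) = 1 - exp(-lam * pi * r^2), r > 0."""
    return 1.0 - math.exp(-lam * math.pi * r * r)


def poisson_count(mean, rng):
    """Draw a Poisson(mean) count by counting unit-rate exponential gaps."""
    count = 0
    total = rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def point_object_distance(lam, half_width, rng):
    """Distance from the origin to the nearest object of a Poisson process
    generated on the square [-half_width, half_width]^2 (assumed large)."""
    n_objects = poisson_count(lam * (2.0 * half_width) ** 2, rng)
    nearest = math.inf
    for _ in range(n_objects):
        x = rng.uniform(-half_width, half_width)
        y = rng.uniform(-half_width, half_width)
        nearest = min(nearest, math.hypot(x, y))
    return nearest


def empirical_cdf_at(r, lam, n_reps=2000, half_width=3.0, seed=1):
    """Proportion of simulated point-object distances that are <= r."""
    rng = random.Random(seed)
    hits = sum(point_object_distance(lam, half_width, rng) <= r
               for _ in range(n_reps))
    return hits / n_reps


if __name__ == "__main__":
    lam, r = 2.0, 0.3
    print("simulated  :", empirical_cdf_at(r, lam))
    print("theoretical:", rayleigh_cdf(r, lam))
```

The two printed values should agree up to simulation error, which shrinks as the number of replicates increases.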

12 Rayleigh distribution I
$$f_R(r) = \frac{dF_R(r)}{dr} = 2 \lambda \pi r \exp(-\lambda \pi r^2), \qquad r > 0.$$
This is the probability density function of a Rayleigh distribution. It is a special case of the Weibull distribution with probability density function
$$f_X(x) = a b x^{b-1} \exp(-a x^b), \qquad x > 0,$$
where $a > 0$ and $b > 0$. Here $a = \lambda \pi$ and $b = 2$.
(Figure: plots of the pdf against r for λ = 0.1 (left), λ = 0.2 (centre) and λ = 0.4 (right).)

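13 Rayleigh distribution II
$$E[R] = \int_{r=0}^{\infty} r f_R(r)\,dr = \int_{r=0}^{\infty} 2 \lambda \pi r^2 \exp(-\lambda \pi r^2)\,dr.$$
Let $y = \lambda \pi r^2$, so $dy = 2 \lambda \pi r\,dr$ and $dr = dy / (2\sqrt{\lambda \pi y})$, so
$$E[R] = \int_{y=0}^{\infty} \frac{2 y e^{-y}}{2\sqrt{\lambda \pi y}}\,dy = \frac{1}{\sqrt{\lambda \pi}} \int_{y=0}^{\infty} y^{0.5} e^{-y}\,dy = \frac{1}{2\sqrt{\lambda}} \int_{y=0}^{\infty} \frac{y^{0.5} e^{-y}}{\Gamma(3/2)}\,dy = \frac{1}{2\sqrt{\lambda}},$$
since $\Gamma(3/2) = \tfrac{1}{2}\Gamma(1/2) = \tfrac{1}{2}\sqrt{\pi}$ and the area under a gamma(α = 3/2, 1) density integrates to one, so that
$$\int_{y=0}^{\infty} \frac{y^{0.5} e^{-y}}{\Gamma(3/2)}\,dy = 1.$$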

14 Rayleigh distribution III: revision of gamma distribution
A gamma(α, λ) distribution has probability density function
$$f_Y(y) = \frac{\lambda^{\alpha} y^{\alpha - 1} e^{-\lambda y}}{\Gamma(\alpha)}, \qquad y > 0,$$
where the gamma function satisfies $\Gamma(\alpha) = (\alpha - 1)\Gamma(\alpha - 1)$ with $\Gamma(1) = 1$ and $\Gamma(1/2) = \sqrt{\pi}$.

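15 Rayleigh distribution IV
$$E[R^2] = \int_{r=0}^{\infty} r^2 f_R(r)\,dr = \int_{r=0}^{\infty} 2 \lambda \pi r^3 \exp(-\lambda \pi r^2)\,dr.$$
Putting $y = \lambda \pi r^2$ and $dy = 2 \lambda \pi r\,dr$ gives
$$E[R^2] = \int_{y=0}^{\infty} \frac{2 y e^{-y}}{2 \lambda \pi}\,dy = \frac{1}{\lambda \pi} \int_{y=0}^{\infty} y e^{-y}\,dy = \frac{1}{\lambda \pi},$$
as the area under a gamma(α = 2, 1) density integrates to one (*), so, with $\Gamma(2) = 1$,
$$\int_{y=0}^{\infty} \frac{y e^{-y}}{\Gamma(2)}\,dy = 1.$$
Hence
$$\mathrm{Var}[R] = E[R^2] - \{E[R]\}^2 = \frac{1}{\lambda \pi} - \frac{1}{4 \lambda} = \frac{4 - \pi}{4 \lambda \pi}.$$
(*) Or recall that for $Y \sim \mathrm{exponential}(1)$, $E[Y] = \int_{y=0}^{\infty} y e^{-y}\,dy = 1$.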
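The results E[R] = 1/(2√λ) and Var[R] = (4 − π)/(4λπ) can be checked numerically. The sketch below draws values of R by inverting the distribution function F_R(r) = 1 − exp(−λπr²) and compares the sample moments with the formulas. The function names, sample size and seed are illustrative assumptions.

```python
import math
import random
import statistics


def sample_rayleigh(lam, n, seed=0):
    """Draw n values of R by inverting F_R(r) = 1 - exp(-lam * pi * r^2)."""
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so 1 - u lies in (0, 1] and the log is finite.
    return [math.sqrt(-math.log(1.0 - rng.random()) / (lam * math.pi))
            for _ in range(n)]


def rayleigh_mean(lam):
    """E[R] = 1 / (2 * sqrt(lam))."""
    return 1.0 / (2.0 * math.sqrt(lam))


def rayleigh_variance(lam):
    """Var[R] = (4 - pi) / (4 * lam * pi)."""
    return (4.0 - math.pi) / (4.0 * lam * math.pi)


if __name__ == "__main__":
    lam = 0.4
    draws = sample_rayleigh(lam, 100_000)
    print("mean    :", statistics.fmean(draws), "vs", rayleigh_mean(lam))
    print("variance:", statistics.pvariance(draws), "vs", rayleigh_variance(lam))
```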

16 Object-object distances
Given a large number N of objects in the study area A, the distribution of the distance between a random object and the nearest neighbouring object is approximately the same as that of the point-object distance.
Suppose A contains N objects randomly positioned within A. The probability that any object is located in a small region $a \subset A$ is $|a| / |A|$. The probability that any object is not located in a is $1 - |a| / |A|$.
If $|a| = \pi r^2$, the probability that none of the remaining N − 1 objects are within a distance r of a randomly chosen object is $(1 - \pi r^2 / |A|)^{N-1}$ by independence.
Writing $\lambda \approx N / |A|$ gives
$$\mathrm{pr}\{R \le r\} \approx 1 - \left(1 - \frac{\lambda \pi r^2}{N}\right)^{N-1}.$$
As $N \to \infty$ this gives the same distribution function as in the point-object case.
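The limiting argument can be illustrated numerically. This short sketch, with illustrative function names, evaluates the finite-N expression 1 − (1 − λπr²/N)^(N−1) for increasing N and compares it with the limit 1 − exp(−λπr²).

```python
import math


def finite_n_cdf(r, lam, n):
    """pr{R <= r} for N = n objects: 1 - (1 - lam*pi*r^2/n)^(n-1).

    Only meaningful when lam * pi * r^2 <= n.
    """
    return 1.0 - (1.0 - lam * math.pi * r * r / n) ** (n - 1)


def limiting_cdf(r, lam):
    """Point-object distribution function 1 - exp(-lam * pi * r^2)."""
    return 1.0 - math.exp(-lam * math.pi * r * r)


if __name__ == "__main__":
    r, lam = 0.5, 1.0
    for n in (10, 100, 1000, 10**6):
        gap = abs(finite_n_cdf(r, lam, n) - limiting_cdf(r, lam))
        print(f"N = {n:>7}: |difference| = {gap:.2e}")
```

The difference shrinks roughly in proportion to 1/N, which is why the point-object results are used for object-object distances when N is large.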

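17 Clark-Evans test I
Have N object-object nearest neighbour distances $r_i$, $i = 1, 2, \ldots, N$, with sample mean $\bar{r}$. If the randomness (Poisson process) assumption is true, then for large N, Clark and Evans (1954) assume
$$\bar{r} \sim N\left(\frac{1}{2\sqrt{\lambda}},\ \frac{4 - \pi}{4 \lambda \pi N}\right),$$
where $E[R] = \frac{1}{2\sqrt{\lambda}}$ and $\mathrm{Var}[R] = \frac{4 - \pi}{4 \lambda \pi}$. Hence
$$Z = \frac{\bar{r} - \frac{1}{2\sqrt{\lambda}}}{\sqrt{\frac{4 - \pi}{4 \lambda \pi N}}} \sim N(0, 1).$$
Reject the randomness hypothesis at the 5% level if $|Z| > 1.96$.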
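The test statistic is straightforward to compute. The following is a minimal sketch that assumes the intensity λ is supplied by the user and that no edge correction is applied; the function name `clark_evans_test` is an illustrative choice.

```python
import math
from statistics import NormalDist, fmean


def clark_evans_test(distances, lam):
    """Return (z, two-sided p-value) for the Clark-Evans test.

    distances: the N object-object nearest neighbour distances.
    lam: intensity (mean number of objects per unit area), assumed known.
    """
    n = len(distances)
    r_bar = fmean(distances)
    expected = 1.0 / (2.0 * math.sqrt(lam))                # E[R] under randomness
    std_error = math.sqrt((4.0 - math.pi) / (4.0 * lam * math.pi * n))
    z = (r_bar - expected) / std_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    # Artificial input: 100 distances all equal to 0.6 with lam = 1,
    # so the sample mean exceeds E[R] = 0.5 and z is positive.
    z, p = clark_evans_test([0.6] * 100, lam=1.0)
    print(f"z = {z:.3f}, p = {p:.4f}")
```

A positive z indicates distances larger than expected under randomness (regularity), a negative z indicates smaller distances (clustering).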

18 Clark-Evans test II For small N Clark and Evans would suggest using a suitable gamma distribution as an approximation to the distribution of r.

19 Clark-Evans measure of randomness
Clark and Evans use
$$\phi_R = \frac{\bar{r}}{E[R]} = 2\sqrt{\lambda}\,\bar{r}$$
as a measure of randomness (footnote 1). $\phi_R \approx 1$ for a random process, $\phi_R < 1$ for a clustered (aggregated) process, and $\phi_R > 1$ for a regularly located process (footnote 2).
Footnote 1: Clark and Evans used the symbol R for their randomness measure, but to avoid confusion with the random variable R the symbol $\phi_R$ is used here.
Footnote 2: The most extreme case has objects on a hexagonal grid, each object the same distance r from six others. This hexagon has area $3\sqrt{3}\,r^2/2$ and is associated with 3 data points (the central point, and a weight of one third for each of the six surrounding points), so $\lambda = 3 / (3\sqrt{3}\,r^2/2)$. Thus $r = \sqrt{2/\sqrt{3}}\,/\sqrt{\lambda} \approx 1.0746/\sqrt{\lambda}$, so $\phi_R = 2\sqrt{\lambda}\,r \approx 2.1491$.
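The hexagonal-grid bound in the footnote follows directly from λ = 3/(3√3 r²/2). The sketch below, using illustrative function names, solves that relation for the spacing r and evaluates φ_R = 2√λ r, showing that the result does not depend on λ.

```python
import math


def phi_r(r_bar, lam):
    """Clark-Evans measure phi_R = r_bar / E[R] = 2 * sqrt(lam) * r_bar."""
    return 2.0 * math.sqrt(lam) * r_bar


def hexagonal_spacing(lam):
    """Spacing r of the hexagonal grid with intensity lam.

    Solves lam = 3 / (3 * sqrt(3) * r^2 / 2), i.e. r^2 = 2 / (sqrt(3) * lam).
    """
    return math.sqrt(2.0 / (math.sqrt(3.0) * lam))


if __name__ == "__main__":
    for lam in (0.1, 0.61, 5.0):
        print(lam, phi_r(hexagonal_spacing(lam), lam))
```

Every nearest neighbour distance on the grid equals the spacing, so the sample mean is the spacing itself and the same upper value of φ_R is obtained for any intensity.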

20 Problems with Clark-Evans test I
The intensity λ should be known to carry out the test. It could be estimated using the mean number of objects per unit area from the study region.
The Clark-Evans test uses all N object-object distances. These distances are not independent, but Diggle (1976) and Donnelly (1978) showed that the correlations are small (footnote 3).
Correlations between the object-object distances mean that the central limit theorem does NOT apply. However, $Z \sim N(0, 1)$ approximately, as shown by Donnelly (1978).
Footnote 3: Donnelly (1978) obtained better approximations for the mean and variance of the object-object distances, but for large N these give E[R] and Var[R] as obtained by assuming the object-object distances are independent.

21 Using a border region I Clark and Evans (1954) advise having a border around the study region to avoid bias. For points near the edge of a study region the calculated object-object distance to objects within the study region will tend to be larger than it should be. This will have the effect of biasing the test statistic Z upwards, rejecting the randomness hypothesis and suggesting regularity of the data points.

22 Using a border region II Object-object distances are measured for all objects within the inner region and can be to points within the border.
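A border (guard) region can be handled by measuring distances only from objects in the inner region while allowing the nearest neighbour to lie anywhere, including the border. The following is a minimal sketch assuming a rectangular inner region given as `(x_min, x_max, y_min, y_max)` and at least two objects in total; the function name and the argument layout are illustrative.

```python
import math


def nn_distances_with_border(objects, inner):
    """Nearest neighbour distances for objects inside the inner region.

    objects: list of (x, y) locations in the inner region and the border.
    inner: (x_min, x_max, y_min, y_max) bounds of the inner study region.
    Neighbours may lie in the border; distances are measured only from
    objects in the inner region.
    """
    x_min, x_max, y_min, y_max = inner
    distances = []
    for i, (x, y) in enumerate(objects):
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # border objects act as neighbours only
        nearest = min(math.hypot(x - u, y - v)
                      for j, (u, v) in enumerate(objects) if j != i)
        distances.append(nearest)
    return distances


if __name__ == "__main__":
    # Inner region [1, 4] x [1, 4]; the object at (0.9, 2.0) is in the border.
    objects = [(1.1, 2.0), (0.9, 2.0), (3.0, 3.0)]
    print(nn_distances_with_border(objects, (1.0, 4.0, 1.0, 4.0)))
```

In this small example the object at (1.1, 2.0) has its nearest neighbour in the border, which is exactly the neighbour that would be missed without a border.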

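23 Using a border region III
Donnelly (1978) presented approximations for E[R] and Var[R] when a border is ignored. For a study region with perimeter P, these take the form
$$E[R] \approx \frac{1}{2\sqrt{\lambda}} + \left(c_1 + \frac{c_2}{\sqrt{N}}\right)\frac{P}{N}, \qquad \mathrm{Var}[R] \approx \frac{c_3}{\lambda N} + \frac{c_4 P}{N^2 \sqrt{\lambda}},$$
where $c_1, \ldots, c_4$ stand for the numerical constants given by Donnelly (1978).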

24 Using a toroidal correction For a rectangular study region, an alternative is to assume the region lies on a torus, so that opposite edges are adjacent to each other. The study region is then surrounded by a grid of identical copies of itself. Object-object distances are measured for all objects within the central region and can be to points in the surrounding copies.
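The toroidal correction amounts to measuring each coordinate difference the short way round the rectangle. Here is a minimal sketch for a rectangle [0, width] × [0, height]; the function name is an illustrative choice.

```python
import math


def toroidal_nn_distances(objects, width, height):
    """Nearest neighbour distance for every object, with opposite edges
    of the rectangle [0, width] x [0, height] treated as adjacent."""
    distances = []
    for i, (x, y) in enumerate(objects):
        nearest = math.inf
        for j, (u, v) in enumerate(objects):
            if i == j:
                continue
            dx = abs(x - u)
            dy = abs(y - v)
            dx = min(dx, width - dx)    # wrap horizontally if shorter
            dy = min(dy, height - dy)   # wrap vertically if shorter
            nearest = min(nearest, math.hypot(dx, dy))
        distances.append(nearest)
    return distances


if __name__ == "__main__":
    # The first two objects are 0.9 apart directly, 0.1 apart across the edge.
    print(toroidal_nn_distances([(0.05, 0.5), (0.95, 0.5), (0.5, 0.5)], 1.0, 1.0))
```

Taking the minimum of the direct and wrapped differences is equivalent to searching the eight surrounding copies of the region, without having to construct them.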

25 Example 1: Simulated data I The object-object nearest neighbour distances for the N = 11 objects within the inner study region below are:

26 Example 1: Simulated data II
The N = 11 distances have sample mean $\bar{r}$. The inner region has area 9 m², so λ can be estimated by $\hat{\lambda} = 11/9 \approx 1.22$. The test statistic is thus
$$z = \frac{\bar{r} - \frac{1}{2\sqrt{\hat{\lambda}}}}{\sqrt{\frac{4 - \pi}{4 \hat{\lambda} \pi N}}}.$$
Here $|z| < 1.96$, so accept the randomness hypothesis at the 5% level. Notice many of the object-object distances are the same.

27 Example 2: Ground ant nests in Panama I
Levings and Franks (1982) present data for the number of ground ant nests in various study regions on Barro Colorado Island, in Gatun Lake, Panama. For one 100 m² square study region the number of nests of Ectatomma ruidum per m² was given as 0.61, together with a value of $\phi_R$. This suggests λ = 0.61, N = 100λ = 61 and $\bar{r} = \phi_R / (2\sqrt{\lambda})$. The Clark-Evans test statistic is then
$$z = \frac{\bar{r} - \frac{1}{2\sqrt{\lambda}}}{\sqrt{\frac{4 - \pi}{4 \lambda \pi N}}} = 2.390.$$
As a two-sided test the P-value of this test is $P = \mathrm{pr}\{|Z| > 2.390\} \approx 0.017$, so reject the randomness hypothesis. $\phi_R > 1$ suggests the ant nests are distributed regularly.
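The quantities in this example are linked by the Clark-Evans formulas, so any one of them can be recovered from the others. The sketch below starts from λ = 0.61, N = 61 and the test statistic 2.390, then works backwards to the mean distance and randomness measure that this statistic implies. These back-calculated values are a consistency check under the Clark-Evans formulas, not figures quoted from Levings and Franks (1982), and the function names are illustrative.

```python
import math
from statistics import NormalDist


def null_mean(lam):
    """E[R] = 1 / (2 * sqrt(lam)) under randomness."""
    return 1.0 / (2.0 * math.sqrt(lam))


def null_std_error(lam, n):
    """Standard error of the mean distance: sqrt((4 - pi) / (4*lam*pi*n))."""
    return math.sqrt((4.0 - math.pi) / (4.0 * lam * math.pi * n))


def implied_mean_distance(z, lam, n):
    """Mean distance r_bar that would give test statistic z."""
    return null_mean(lam) + z * null_std_error(lam, n)


def two_sided_p(z):
    """P-value pr{|Z| > |z|} for Z ~ N(0, 1)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    lam, n, z = 0.61, 61, 2.390
    r_bar = implied_mean_distance(z, lam, n)
    print("implied mean distance:", r_bar)
    print("implied phi_R        :", r_bar / null_mean(lam))
    print("two-sided P-value    :", two_sided_p(z))
```

Under these assumptions the implied mean distance is roughly 0.74 m, the implied φ_R is roughly 1.16, and the P-value is roughly 0.017.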

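28 Example 2: Ground ant nests in Panama II
Unfortunately Levings and Franks did not appear to use a border, so their results are invalid. Using the corrected values for E[R] and Var[R] obtained by Donnelly (1978), the test statistic z is not significant. There is thus no evidence to reject the randomness hypothesis.
For perimeter P (here 40 m), the corrected values are of the form
$$E[R] \approx \frac{1}{2\sqrt{\lambda}} + \left(c_1 + \frac{c_2}{\sqrt{N}}\right)\frac{P}{N}, \qquad \mathrm{Var}[R] \approx \frac{c_3}{\lambda N} + \frac{c_4 P}{N^2 \sqrt{\lambda}},$$
where $c_1, \ldots, c_4$ stand for the numerical constants given by Donnelly (1978).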

29 Intensive sampling
If all the nearest neighbour distances are calculated in a region, then the values are not independent. Cressie (1993) refers to this as intensive sampling.
The consequence is that the true variance of R is greater than that assumed (due to the correlations), so the test statistic Z used in the test tends to be larger than it should be, resulting in clustering being suggested more often than it should be.
One solution is to use Monte-Carlo tests for inference. Independent realizations of the data, assuming the null hypothesis is true, are simulated and the test statistic $Z_i$ is calculated for each. The observed value of the test statistic Z can be compared with the simulated ones, and the test rejects the null hypothesis if the observed Z is too large or too small when compared with the simulated $Z_i$.
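A Monte-Carlo test along these lines can be sketched as follows. The assumptions here are not from the lecture: the study region is a square of side `side`, the null simulations place the same number N of objects uniformly at random (conditioning on N rather than simulating a Poisson count), λ is estimated by N/area, and no edge correction is applied because observed and simulated patterns are affected by edges in the same way. The rank-based two-sided P-value is one common convention.

```python
import math
import random


def mean_nn_distance(objects):
    """Mean nearest neighbour distance over all objects (needs >= 2)."""
    total = 0.0
    for i, (x, y) in enumerate(objects):
        total += min(math.hypot(x - u, y - v)
                     for j, (u, v) in enumerate(objects) if j != i)
    return total / len(objects)


def clark_evans_z(objects, area):
    """Clark-Evans statistic with lam estimated by N / area."""
    n = len(objects)
    lam_hat = n / area
    expected = 1.0 / (2.0 * math.sqrt(lam_hat))
    std_error = math.sqrt((4.0 - math.pi) / (4.0 * lam_hat * math.pi * n))
    return (mean_nn_distance(objects) - expected) / std_error


def monte_carlo_p_value(objects, side, n_sims=199, seed=0):
    """Two-sided Monte-Carlo P-value for the observed Clark-Evans Z."""
    rng = random.Random(seed)
    n = len(objects)
    area = side * side
    z_obs = clark_evans_z(objects, area)
    z_sim = []
    for _ in range(n_sims):
        pattern = [(rng.uniform(0.0, side), rng.uniform(0.0, side))
                   for _ in range(n)]
        z_sim.append(clark_evans_z(pattern, area))
    n_low = sum(z <= z_obs for z in z_sim) + 1    # observed value counts once
    n_high = sum(z >= z_obs for z in z_sim) + 1
    return min(1.0, 2.0 * min(n_low, n_high) / (n_sims + 1))


if __name__ == "__main__":
    # A perfectly regular 5 x 5 grid in the unit square.
    grid = [(0.1 + 0.2 * i, 0.1 + 0.2 * j) for i in range(5) for j in range(5)]
    print("Monte-Carlo P-value:", monte_carlo_p_value(grid, 1.0))
```

Because the reference distribution is built from simulated patterns rather than the N(0, 1) approximation, the dependence between distances is accounted for automatically.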


More information

α α λ α = = λ λ α ψ = = α α α λ λ ψ α = + β = > θ θ β > β β θ θ θ β θ β γ θ β = γ θ > β > γ θ β γ = θ β = θ β = θ β = β θ = β β θ = = = β β θ = + α α α α α = = λ λ λ λ λ λ λ = λ λ α α α α λ ψ + α =

More information

MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...

MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators... MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 2009-2016 by Jenny A. Baglivo. All Rights Reserved. Contents 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................

More information

Hypothesis Testing --- One Mean

Hypothesis Testing --- One Mean Hypothesis Testing --- One Mean A hypothesis is simply a statement that something is true. Typically, there are two hypotheses in a hypothesis test: the null, and the alternative. Null Hypothesis The hypothesis

More information

Multinomial and Ordinal Logistic Regression

Multinomial and Ordinal Logistic Regression Multinomial and Ordinal Logistic Regression ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,

More information

Math 1B, lecture 5: area and volume

Math 1B, lecture 5: area and volume Math B, lecture 5: area and volume Nathan Pflueger 6 September 2 Introduction This lecture and the next will be concerned with the computation of areas of regions in the plane, and volumes of regions in

More information

IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem

IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem Time on my hands: Coin tosses. Problem Formulation: Suppose that I have

More information

Survival Distributions, Hazard Functions, Cumulative Hazards

Survival Distributions, Hazard Functions, Cumulative Hazards Week 1 Survival Distributions, Hazard Functions, Cumulative Hazards 1.1 Definitions: The goals of this unit are to introduce notation, discuss ways of probabilistically describing the distribution of a

More information

5. Continuous Random Variables

5. Continuous Random Variables 5. Continuous Random Variables Continuous random variables can take any value in an interval. They are used to model physical characteristics such as time, length, position, etc. Examples (i) Let X be

More information

Introduction to Hypothesis Testing

Introduction to Hypothesis Testing I. Terms, Concepts. Introduction to Hypothesis Testing A. In general, we do not know the true value of population parameters - they must be estimated. However, we do have hypotheses about what the true

More information

Practice problems for Homework 11 - Point Estimation

Practice problems for Homework 11 - Point Estimation Practice problems for Homework 11 - Point Estimation 1. (10 marks) Suppose we want to select a random sample of size 5 from the current CS 3341 students. Which of the following strategies is the best:

More information

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely

More information

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Ecellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters

More information

How to assess the risk of a large portfolio? How to estimate a large covariance matrix?

How to assess the risk of a large portfolio? How to estimate a large covariance matrix? Chapter 3 Sparse Portfolio Allocation This chapter touches some practical aspects of portfolio allocation and risk assessment from a large pool of financial assets (e.g. stocks) How to assess the risk

More information

How To Calculate The Power Of A Cluster In Erlang (Orchestra)

How To Calculate The Power Of A Cluster In Erlang (Orchestra) Network Traffic Distribution Derek McAvoy Wireless Technology Strategy Architect March 5, 21 Data Growth is Exponential 2.5 x 18 98% 2 95% Traffic 1.5 1 9% 75% 5%.5 Data Traffic Feb 29 25% 1% 5% 2% 5 1

More information

MAINTAINED SYSTEMS. Harry G. Kwatny. Department of Mechanical Engineering & Mechanics Drexel University ENGINEERING RELIABILITY INTRODUCTION

MAINTAINED SYSTEMS. Harry G. Kwatny. Department of Mechanical Engineering & Mechanics Drexel University ENGINEERING RELIABILITY INTRODUCTION MAINTAINED SYSTEMS Harry G. Kwatny Department of Mechanical Engineering & Mechanics Drexel University OUTLINE MAINTE MAINTE MAINTAINED UNITS Maintenance can be employed in two different manners: Preventive

More information

Inference for two Population Means

Inference for two Population Means Inference for two Population Means Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison October 27 November 1, 2011 Two Population Means 1 / 65 Case Study Case Study Example

More information

1.3. DOT PRODUCT 19. 6. If θ is the angle (between 0 and π) between two non-zero vectors u and v,

1.3. DOT PRODUCT 19. 6. If θ is the angle (between 0 and π) between two non-zero vectors u and v, 1.3. DOT PRODUCT 19 1.3 Dot Product 1.3.1 Definitions and Properties The dot product is the first way to multiply two vectors. The definition we will give below may appear arbitrary. But it is not. It

More information

Introduction to the Monte Carlo method

Introduction to the Monte Carlo method Some history Simple applications Radiation transport modelling Flux and Dose calculations Variance reduction Easy Monte Carlo Pioneers of the Monte Carlo Simulation Method: Stanisław Ulam (1909 1984) Stanislaw

More information

Basics of Statistical Machine Learning

Basics of Statistical Machine Learning CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar

More information

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #15 Special Distributions-VI Today, I am going to introduce

More information

Dŵr y Felin Comprehensive School. Perimeter, Area and Volume Methodology Booklet

Dŵr y Felin Comprehensive School. Perimeter, Area and Volume Methodology Booklet Dŵr y Felin Comprehensive School Perimeter, Area and Volume Methodology Booklet Perimeter, Area & Volume Perimeters, Area & Volume are key concepts within the Shape & Space aspect of Mathematics. Pupils

More information

SOCIETY OF ACTUARIES/CASUALTY ACTUARIAL SOCIETY EXAM C CONSTRUCTION AND EVALUATION OF ACTUARIAL MODELS EXAM C SAMPLE QUESTIONS

SOCIETY OF ACTUARIES/CASUALTY ACTUARIAL SOCIETY EXAM C CONSTRUCTION AND EVALUATION OF ACTUARIAL MODELS EXAM C SAMPLE QUESTIONS SOCIETY OF ACTUARIES/CASUALTY ACTUARIAL SOCIETY EXAM C CONSTRUCTION AND EVALUATION OF ACTUARIAL MODELS EXAM C SAMPLE QUESTIONS Copyright 005 by the Society of Actuaries and the Casualty Actuarial Society

More information

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE 1 2 CONTENTS OF DAY 2 I. More Precise Definition of Simple Random Sample 3 Connection with independent random variables 3 Problems with small populations 8 II. Why Random Sampling is Important 9 A myth,

More information

Multivariate normal distribution and testing for means (see MKB Ch 3)

Multivariate normal distribution and testing for means (see MKB Ch 3) Multivariate normal distribution and testing for means (see MKB Ch 3) Where are we going? 2 One-sample t-test (univariate).................................................. 3 Two-sample t-test (univariate).................................................

More information

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test Bivariate Statistics Session 2: Measuring Associations Chi-Square Test Features Of The Chi-Square Statistic The chi-square test is non-parametric. That is, it makes no assumptions about the distribution

More information

INSURANCE RISK THEORY (Problems)

INSURANCE RISK THEORY (Problems) INSURANCE RISK THEORY (Problems) 1 Counting random variables 1. (Lack of memory property) Let X be a geometric distributed random variable with parameter p (, 1), (X Ge (p)). Show that for all n, m =,

More information

Measuring Line Edge Roughness: Fluctuations in Uncertainty

Measuring Line Edge Roughness: Fluctuations in Uncertainty Tutor6.doc: Version 5/6/08 T h e L i t h o g r a p h y E x p e r t (August 008) Measuring Line Edge Roughness: Fluctuations in Uncertainty Line edge roughness () is the deviation of a feature edge (as

More information

Math 431 An Introduction to Probability. Final Exam Solutions

Math 431 An Introduction to Probability. Final Exam Solutions Math 43 An Introduction to Probability Final Eam Solutions. A continuous random variable X has cdf a for 0, F () = for 0 <

More information

research/scientific includes the following: statistical hypotheses: you have a null and alternative you accept one and reject the other

research/scientific includes the following: statistical hypotheses: you have a null and alternative you accept one and reject the other 1 Hypothesis Testing Richard S. Balkin, Ph.D., LPC-S, NCC 2 Overview When we have questions about the effect of a treatment or intervention or wish to compare groups, we use hypothesis testing Parametric

More information

Error Type, Power, Assumptions. Parametric Tests. Parametric vs. Nonparametric Tests

Error Type, Power, Assumptions. Parametric Tests. Parametric vs. Nonparametric Tests Error Type, Power, Assumptions Parametric vs. Nonparametric tests Type-I & -II Error Power Revisited Meeting the Normality Assumption - Outliers, Winsorizing, Trimming - Data Transformation 1 Parametric

More information