Characteristics and statistics of digital remote sensing imagery



Similar documents
SAMPLE MIDTERM QUESTIONS

Resolutions of Remote Sensing

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

Data Exploration Data Visualization

DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

STATS8: Introduction to Biostatistics. Data Exploration. Babak Shahbaba Department of Statistics, UCI

Module 3: Correlation and Covariance

PIXEL-LEVEL IMAGE FUSION USING BROVEY TRANSFORME AND WAVELET TRANSFORM

Descriptive Statistics

How To Understand And Solve A Linear Programming Problem

Selecting the appropriate band combination for an RGB image using Landsat imagery

Engineering Problem Solving and Excel. EGN 1006 Introduction to Engineering

03 The full syllabus. 03 The full syllabus continued. For more information visit PAPER C03 FUNDAMENTALS OF BUSINESS MATHEMATICS

How To Write A Data Analysis

An introduction to Value-at-Risk Learning Curve September 2003

The Role of SPOT Satellite Images in Mapping Air Pollution Caused by Cement Factories

WATER BODY EXTRACTION FROM MULTI SPECTRAL IMAGE BY SPECTRAL PATTERN ANALYSIS

This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions.

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

Environmental Remote Sensing GEOG 2021

How Landsat Images are Made

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

EXPLORING SPATIAL PATTERNS IN YOUR DATA

Calculation of Minimum Distances. Minimum Distance to Means. Σi i = 1

4.1 Exploratory Analysis: Once the data is collected and entered, the first question is: "What do the data look like?"

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Dimensionality Reduction: Principal Components Analysis

MBA 611 STATISTICS AND QUANTITATIVE METHODS

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Spectral Response for DigitalGlobe Earth Imaging Instruments

Exercise 1.12 (Pg )

Descriptive statistics parameters: Measures of centrality

Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining

Diagrams and Graphs of Statistical Data

Introduction to Quantitative Methods

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

Big Ideas in Mathematics

DATA INTERPRETATION AND STATISTICS

Summarizing and Displaying Categorical Data

Exploratory Data Analysis

Digital image processing

Statistics. Measurement. Scales of Measurement 7/18/2012

Descriptive Statistics and Measurement Scales

Concepts in Investments Risks and Returns (Relevant to PBE Paper II Management Accounting and Finance)

CHAPTER THREE COMMON DESCRIPTIVE STATISTICS COMMON DESCRIPTIVE STATISTICS / 13

Supervised Classification workflow in ENVI 4.8 using WorldView-2 imagery

Algebra 2 Chapter 1 Vocabulary. identity - A statement that equates two equivalent expressions.

Algebra I Vocabulary Cards

Factor Analysis. Chapter 420. Introduction

Lecture 2: Descriptive Statistics and Exploratory Data Analysis

CALCULATIONS & STATISTICS

GRADES 7, 8, AND 9 BIG IDEAS

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Multiple regression - Matrices

An introduction to using Microsoft Excel for quantitative data analysis

II. DISTRIBUTIONS distribution normal distribution. standard scores

5/31/ Normal Distributions. Normal Distributions. Chapter 6. Distribution. The Normal Distribution. Outline. Objectives.

determining relationships among the explanatory variables, and

Module 4: Data Exploration

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers)

Introduction; Descriptive & Univariate Statistics

Geostatistics Exploratory Analysis

Data exploration with Microsoft Excel: univariate analysis

Introduction to Data Visualization

BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I

AP STATISTICS REVIEW (YMS Chapters 1-8)

2. Simple Linear Regression

Lesson 4 Measures of Central Tendency

Object Recognition and Template Matching

Measures of Central Tendency and Variability: Summarizing your Data for Others

ANALYSIS OF FOREST CHANGE IN FIRE DAMAGE AREA USING SATELLITE IMAGES

Analysis of Landsat ETM+ Image Enhancement for Lithological Classification Improvement in Eagle Plain Area, Northern Yukon

seven Statistical Analysis with Excel chapter OVERVIEW CHAPTER

Algebra Academic Content Standards Grade Eight and Grade Nine Ohio. Grade Eight. Number, Number Sense and Operations Standard

Mathematics Online Instructional Materials Correlation to the 2009 Algebra I Standards of Learning and Curriculum Framework

RESOLUTION MERGE OF 1: SCALE AERIAL PHOTOGRAPHS WITH LANDSAT 7 ETM IMAGERY

Simple linear regression

Normality Testing in Excel

Common Core Unit Summary Grades 6 to 8

Arithmetic and Algebra of Matrices

MEASURES OF VARIATION

2. Filling Data Gaps, Data validation & Descriptive Statistics

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Spreadsheets and Databases For Information Literacy

Vertical Alignment Colorado Academic Standards 6 th - 7 th - 8 th

Fairfield Public Schools

Data exploration with Microsoft Excel: analysing more than one variable

Lecture 1: Review and Exploratory Data Analysis (EDA)

The right edge of the box is the third quartile, Q 3, which is the median of the data values above the median. Maximum Median

Generation of Cloud-free Imagery Using Landsat-8

Review for Introduction to Remote Sensing: Science Concepts and Technology

Investment Statistics: Definitions & Formulas

3: Summary Statistics

Finding and Downloading Landsat Data from the U.S. Geological Survey s Global Visualization Viewer Website

Java Modules for Time Series Analysis

Descriptive Statistics. Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion

Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics

STAT355 - Probability & Statistics

Adaptive HSI Data Processing for Near-Real-time Analysis and Spectral Recovery *

Transcription:

Characteristics and statistics of digital remote sensing imagery There are two fundamental ways to obtain digital imagery: Acquire remotely sensed imagery in an analog format (often referred to as hard-copy) and then convert it to a digital format through the process of digitization, and Acquire remotely sensed imagery already in a digital format, such as that obtained by a Landsat sensor system. 1

Image Digitization: Analog to Digital (A/D) Conversion Using Linear Array Scanners Image digitization is the process of turning a hard-copy analog imagery into a digital image. (e.g., a des-top scanner)

RGB = True Color. Any addition array that collect spectral information beyond visible light will be able to generate pseudo color image. Additional array? 3

Panchromatic Blac & White Infrared True-color vs. Pseudo (false)-color 4

With all images are digital, nowledge and sills in digital image processing are necessary. Digital Image With raster data structure, each image is treated as an array of values of the pixels. Image data is organized as rows and columns (or lines and pixels) start from the upper left corner of the image. Each pixel (picture element) is treated as a separate unite. 5

Statistics of Digital Images: Looing at the frequency of occurrence of individual brightness values in the image displayed in a histogram Viewing individual pixel brightness values at specific locations or within a geographic area; Computing univariate descriptive statistics to determine if there are unusual anomalies in the image data; and Computing multivariate statistics to determine the amount of between-band correlation (e.g., to identify redundancy). Statistics of Digital Images It is useful to calculate fundamental univariate and multivariate statistics of the multispectral remote sensor data. This normally involved computing of maximum and minimum value for each band of imagery, the range, mean, standard deviation, between-band variance-covariance matrix, correlation matrix, and frequencies of brightness values for each band, which are used to produce histograms. Such statistics provide valuable information necessary for processing and analyzing remote sensing data. 6

Digital Images: A population is an infinite or finite set of elements. A sample is a subset of the elements taen from a population used to mae inferences about certain characteristics of the population. (e.g., training signatures) Large samples drawn randomly from natural populations usually produce a symmetrical frequency distribution. Most values are clustered around some central value, and the frequency of occurrence declines away from this central point. A graph of the distribution appears bell shaped is called a normal distribution. 7

Histogram and Its Significance to Digital Remote Sensing Image Processing Many statistical tests used in the analysis of remotely sensed data assume that the brightness values recorded in a scene are normally distributed. Unfortunately, remotely sensed data may not be normally distributed. In such instance, nonparametric statistical theory may be preferred. Forest Water 8

Histogram and Its Significance to Digital Remote Sensing Image Processing The histogram is a useful graphic representation of the information content of a remote sensing image, which provide readers with an appreciation of the quality of the original data, e.g. whether it is low in contrast, high in contrast, or multimodal in nature. Univariate Descriptive Image Statistics Measures of Central Tendency in Remote Sensor Data Mode: is the value that occurs most frequently in a distribution and is usually the highest point on the curve. Multiple modes are common in image dataset. Median: is the value midway in the frequency distribution. Mean: is the arithmetic average and if defined as the sum of all observations divided by the number of observations. 9

Measures of Central Tendency in Remote Sensor Data The mean is the arithmetic average and is defined as the sum of all brightness value observations divided by the number of observations. It is the most commonly used measure of central tendency. The mean ( ) of a single band of imagery composed of n brightness values (BV i ) is computed using the formula: Sample mean is a poor measure of central tendency when the set of observations is sewed or contains an outlier. 9 Further measurements are needed. Outlier Mean = 7/9 = 3 (Does not represent the dataset well) 0 1 0 1 0 1 Sewed Mean = 9/9 = 1 Mean = 9/9 = 1 10

Measures of Dispersion Measures of the dispersion about the mean of a distribution provide valuable information about the image. For example, the range of a band of imagery (range ) is computed as the difference between the maximum (max ) and minimum (min ) values: Measures of Dispersion Unfortunately, when the minimum or maximum values are extreme or unusual observations, the range could be a misleading measure of dispersion. Such extreme values are not uncommon because the remote sensor data are often collected by detector systems with delicate electronics that can experience spies in voltage and other unfortunate malfunctions. When unusual values are not encountered, the range is a very important statistic often used in image enhancement functions such as min max contrast stretching. 11

Measures of Dispersion Variance: is the average squared deviation of all possible observations from the sample mean. The variance of a band of imagery, var, is computed using the equation: BV 11 BV 1 BV 13 BV 1 BV BV 3 BV 31 BV 3 BV 33 var n ( i 1 BV n i 1 ) BV 11 BV 1 BV 13 BV 1 BV BV 3 var n ( i 1 BV i n 1 ) BV 31 BV 3 BV 33 var 9 i 1 (1 1 ) 9 1 0 19 1 1 18 1 13 var (19 43 8 n i 1 54 ( BV i ) n 1 (1 (1 (18 (1 8 (13 (1 1 14 1 (14 (1 1

Standard Deviation (s ): is the positive square root of the variance. s var A small s. suggests that observations clustered tightly around a central value. A large s indicates that values are scattered widely about the mean. The sample having the largest variance or standard deviation has the greater spread among the values of the observations. Mean Standard Deviations Standard Deviation (s ): is the positive square root of the variance. s var 19 1 1 18 1 13 7.35 1 14 1 var (19 43 8 n i 1 54 ( BV i n 1 (1 ) (1 (18 (1 8 (13 (1 (14 (1 s var 54 7. 35 13

Measures of Distribution (Histogram) Asymmetry and Pea Sharpness Sewness is a measure of the asymmetry of a histogram and is computed using the formula: BV 11 BV 1 BV 13 BV 1 BV BV 3 BV 31 BV 3 BV 33 A perfectly symmetric histogram has a sewness value of zero. Histogram of A Single Band of Landsat Thematic Mapper Data Max. = 10 Min. = 6 Mean = 7 Median = 5 Mode = 9 14

Histogram of Thermal Infrared Imagery of a Thermal Plume in the Savannah River Max. = 188 Min. = 38 Mean = 73 Median = 68 Mode = 75 Remote Sensing Multivariate Statistics Remote sensing is often concerned with the measurement of how much radiant flux is reflected or emitted from an object in more than one band (e.g., in red and near-infrared bands). It is useful to compute multivariate statistical measures such as covariance and correlation among the several bands to determine how the measurements covary. 15

Remote Sensing Multivariate Statistics The different remote-sensing-derived spectral measurements for each pixel often change together in some predictable fashion. Landsat-7 ETM+ Band 1 (Blue band) Landsat-7 ETM+ Band (Green band) Remote Sensing Multivariate Statistics If there is no relationship between the brightness value in one band and that of another for a given pixel, the values are mutually independent; i.e., an increase or decrease in one band s brightness value is not accompanied by a predictable change in another band s brightness value. Landsat-7 ETM+ Band 3 (Red band) Landsat-7 ETM+ Band 4 (Near IR band) 16

Correlation between Multiple Bands of Remotely Sensed Data To estimate the degree of interrelation between variables in a manner not influenced by measurement units, the correlation coefficient, r, is commonly used. The correlation between two bands of remotely sensed data, r l, is the ratio of their covariance (cov l ) to the product of their standard deviations (s s l ); thus: r l cov s s l l Correlation Coefficient: A correlation coefficient of +1 indicates a positive, perfect relationship between the brightness values of the two bands. A correlation coefficient of -1 indicates that the two bands are inversely related. A correlation coefficient of zero suggests that there is no linear relationship between the two bands of data. 17

Remote Sensing Multivariate Statistics Because spectral measurements of individual pixels may not be independent, some measure of their mutual interaction is needed. This measure, called the covariance, is the joint variation of two variables about their common mean. To calculate covariance, we first compute the corrected sum of products (SP) defined by the equation: 18

Remote Sensing Multivariate Statistics It is computationally more efficient to use the following formula to arrive at the same result: This quantity is called the uncorrected sum of products. Remote Sensing Multivariate Statistics Covariance is calculated by dividing SP by (n 1). Therefore, the covariance between brightness values in bands and l, cov l, is equal to: 19

Covariance: is the joint variation of two variables about their common mean. SP l is the corrected Sum of Products between bands and l. SP cov l l n 1 SP l n ( i 1 BV i )( BV il l ) Band = 9 19 1 1 18 1 13 1 14 1 Band l l = 1 SP l = (19-9)x(1-1)+(1-9)x(1-1)+(1-9)x(1-1)+(18-9)x(1-1)+(1-9)x(1-1)+(13-9)x(1-1)+ (1-9)x(1-1)+(14-9)x(1-1)+(1-9)x(1-1) = 0 Cov l = 0 Covariance: A more efficient formula is: n SP l ( BV i BV il ) i 1 n BV i 1 n i BV i 1 n il 19 1 1 Band 18 1 13 Band l Total = 91 1 14 1 Total = 9 SP l = [(19x1)+(1x1)+(1x1)+(18x1)+(1x1)+(13x1)+(1x1)+(14x1)+(1x1)] [91x9/9] = 0 Cov l = 0 The sums of products (SP) and sums of squares (SS) can be computed for all possible band combinations. 0

Format of a Variance-Covariance Matrix Band 1 (green) Band (red) Band 3 (nearinfrared) Band 4 (nearinfrared) Band 1 SS 1 cov 1, cov 1,3 cov 1,4 Band cov,1 SS cov,3 cov,4 Band 3 cov 3,1 cov 3, SS 3 cov 3,4 Band 4 cov 4,1 cov 4, cov 4,3 SS 4 1

Remote Sensing Multivariate Statistics Variance covariance and correlation matrices are the ey for principal components analysis (PCA), feature selection, classification and accuracy assessment.