Choosing a classification method



Similar documents
Symbolizing your data

GIS Tutorial 1. Lecture 2 Map design

Morningstar Calculated Fixed-Income Style Box Methodology

Modifying Colors and Symbols in ArcMap

CHAPTER TWELVE TABLES, CHARTS, AND GRAPHS

Introduction to Map Design

Tutorial 3 - Map Symbology in ArcGIS

Visualization Quick Guide

Crime Mapping Methods. Assigning Spatial Locations to Events (Address Matching or Geocoding)

Data Visualization. Prepared by Francisco Olivera, Ph.D., Srikanth Koka Department of Civil Engineering Texas A&M University February 2004

CSU, Fresno - Institutional Research, Assessment and Planning - Dmitri Rogulkin

Lab 7. Exploratory Data Analysis

Intro to GIS Winter Data Visualization Part I

University of Arkansas Libraries ArcGIS Desktop Tutorial. Section 2: Manipulating Display Parameters in ArcMap. Symbolizing Features and Rasters:

Advanced Microsoft Excel 2010

Web Editing Tutorial. Copyright Esri All rights reserved.

Editing Common Polygon Boundary in ArcGIS Desktop 9.x

Petrel TIPS&TRICKS from SCM

TEACHER NOTES MATH NSPIRED

A Tutorial on Color Symbolization and Data Classification for Mapping and Visualization

Means, standard deviations and. and standard errors

Scientific Graphing in Excel 2010

Getting Started With Mortgage MarketSmart

Data Visualization. Brief Overview of ArcMap

CHAPTER 7 INTRODUCTION TO SAMPLING DISTRIBUTIONS

Scatter Plots with Error Bars

Tutorial. VISUALIZATION OF TERRA-i DETECTIONS

Statistics Chapter 2

Optimizing Hadoop Block Placement Policy & Cluster Blocks Distribution

Tobii AB. Accuracy and precision Test report. Tobii Pro X3-120 fw Date: Methodology/Software version: 2.1.7*

Lab 1 Introduction to Microsoft Project

Best Practices. Create a Better VoC Report. Three Data Visualization Techniques to Get Your Reports Noticed

2. Incidence, prevalence and duration of breastfeeding

Risk Management : Using SAS to Model Portfolio Drawdown, Recovery, and Value at Risk Haftan Eckholdt, DayTrends, Brooklyn, New York

Jitter Measurements in Serial Data Signals

Microsoft Excel 2010 Part 3: Advanced Excel

Describing, Exploring, and Comparing Data

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

Employee Engagement Survey Results. SampleCo International. Executive Summary. Sample Report

UK application rates by country, region, constituency, sex, age and background. (2015 cycle, January deadline)

CentropeSTATISTICS a Tool for Cross-Border Data Presentation Manfred Schrenk, Clemens Beyer, Norbert Ströbinger

Introduction; Descriptive & Univariate Statistics

SAS Rule-Based Codebook Generation for Exploratory Data Analysis Ross Bettinger, Senior Analytical Consultant, Seattle, WA

An Introduction to Point Pattern Analysis using CrimeStat

Modeling Fire Hazard By Monica Pratt, ArcUser Editor

Data exploration with Microsoft Excel: analysing more than one variable

Learning Module 4 - Thermal Fluid Analysis Note: LM4 is still in progress. This version contains only 3 tutorials.

MBA 611 STATISTICS AND QUANTITATIVE METHODS

Descriptive Statistics and Measurement Scales

IBM SPSS Direct Marketing 23

NASA Explorer Schools Pre-Algebra Unit Lesson 2 Student Workbook. Solar System Math. Comparing Mass, Gravity, Composition, & Density

Measurement with Ratios

International Financial Reporting Standards (IFRS) Financial Instrument Accounting Survey. CFA Institute Member Survey

GEOGRAPHIC INFORMATION SYSTEMS CERTIFICATION

Scheduling Resources and Costs

Adverse Impact Ratio for Females (0/ 1) = 0 (5/ 17) = Adverse impact as defined by the 4/5ths rule was not found in the above data.

Best of the Best Benchmark. Adobe Digital Index APAC 2015

DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

Summarizing and Displaying Categorical Data

Linear Programming Problems

Dealing with Data in Excel 2010

Tutorial 3: Working with Tables Joining Multiple Databases in ArcGIS

How To Check For Differences In The One Way Anova

Introduction to Exploratory Data Analysis

The Study of Chinese P&C Insurance Risk for the Purpose of. Solvency Capital Requirement

Drawing a histogram using Excel

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

Exploratory data analysis (Chapter 2) Fall 2011

Doing Multiple Regression with SPSS. In this case, we are interested in the Analyze options so we choose that menu. If gives us a number of choices:

Intermediate PowerPoint

Simple Regression Theory II 2010 Samuel L. Baker

Skills Knowledge Energy Time People and decide how to use themto accomplish your objectives.

Savvy Dashboard. User Guide. Savvy Dashboard Version 1.0 Release Date 2014 TradeStation Version Compatibility 9.1 Update 20-25

Tutorial 3: Graphics and Exploratory Data Analysis in R Jason Pienaar and Tom Miller

Association Between Variables

Chapter 4. Probability Distributions

PURPOSE OF GRAPHS YOU ARE ABOUT TO BUILD. To explore for a relationship between the categories of two discrete variables

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

DATA VISUALIZATION GABRIEL PARODI STUDY MATERIAL: PRINCIPLES OF GEOGRAPHIC INFORMATION SYSTEMS AN INTRODUCTORY TEXTBOOK CHAPTER 7

Normality Testing in Excel

Tobii Technology AB. Accuracy and precision Test report. X2-60 fw Date: Methodology/Software version: 2.1.7

Lab 11: Budgeting with Excel

STA-201-TE. 5. Measures of relationship: correlation (5%) Correlation coefficient; Pearson r; correlation and causation; proportion of common variance

Descriptive Statistics

Quantitative vs. Categorical Data: A Difference Worth Knowing Stephen Few April 2005

Formulas, Functions and Charts

Performance Assessment Task Baseball Players Grade 6. Common Core State Standards Math - Content Standards

Directions for Frequency Tables, Histograms, and Frequency Bar Charts

Maplex Tutorial. Copyright Esri All rights reserved.

Guidance on Professional Judgment for CPAs

SAS Analyst for Windows Tutorial

Announcements. SE 1: Software Requirements Specification and Analysis. Review: Use Case Descriptions

Excel 2007 Basic knowledge

2. Creating Bar Graphs with Excel 2007

Excel Formatting: Best Practices in Financial Models

Transcription:

Chapter 6 Symbolizing your data 103 Choosing a classification method ArcView offers five classification methods for making a graduated color or graduated symbol map: Natural breaks Quantile Equal area (polygons only) Equal interval Standard deviation You can also type your own class range values directly into the Legend Editor s Value field to specify your own classes. The classification method you choose depends on the nature of your data, and what you want to show about the data. The maps in the next section of this chapter illustrate how your choice of classification method, a chart is also shown illustrating how the same set of attribute values are divided into classes by that method.

Chapter 6 Symbolizing your data 104 Natural breaks Natural Breaks is ArcView s default classification method. This method identifies breakpoints by looking for groupings and patterns inherent in the data. ArcView uses a rather complex statistical formula (Jenk s optimization) that minimizes the variation within each class. The example in the chart shows how this works: a set of features are ordered (left to right) from the smallest to the largest by population value. The features are divided into classes whose boundaries are set where there are relatively big jumps in the values. This map shows the population of provinces in part of Southeast Asia. The nine population classes were determined using natural breaks. On this map, the extreme values are obvious: the provinces in China have the highest population, while those in Laos have the lowest. Vietnam has provinces that range in population between the two extremes. This map presents population as you would expect it to look: it could be considered the most realistic view of the data.

Chapter 6 Symbolizing your data 105 Quantile In the quantile classification method, each class is assigned the same number of features. In the chart shown below, the lowest five provinces are put in the first class, the next five in the second class, and so on. It doesn t matter that the provinces on either side of a class boundary have almost the same population. Quantile classes can therefore be misleading because low values are often included in the same class as high values. You can help overcome this distortion by increasing the number of classes. Quantile classification is best suited for data that is linearly distributed; in other words, data that does not have disproportionate numbers of features with similar values. It is especially useful when you want to emphasize the relative position of a feature among other features, for example, to show that a store is in the top one-third of all stores by sales. This map shows the same data as the last map, but the population of the provinces is mapped using quantile classification. Differences among the provinces with an intermediate population, such as in Vietnam, can be more easily distinguished on this map than on the natural breaks map.

Chapter 6 Symbolizing your data 106 Equal area The equal area method classifies polygon features by finding breakpoints in the attribute values so that the total area of the polygons in each class is approximately the same. ArcView determines the total area only from polygons that have valid data values for the attribute. Classes formed with the equal area method are typically similar to quantile classes. Equal area classification is similar to quantile classification except that each feature is given a weight in the classification equal to its area, rather than equal to 1. When the population data is classified with the equal area method, the very largest provinces (in China) are in classes by themselves. The smaller provinces are put into the remaining classes. This tends to hide the variation in population between smaller provinces.

Chapter 6 Symbolizing your data 107 Equal interval The equal interval method divides the range of attribute values into equal sized sub-ranges. For example, if the features in your theme have attributes values ranging from 12 to 351, the total range of the values is 339, so if you classify these features into three classes using the equal interval method, each class will represent a range of 113, and the class value ranges will therefore be 12-125, 126-238, and 239-351, as shown in the chart. Equal interval classifications are useful when you want to emphasize the amount of an attribute value relative to other values, for example, to show that a store is part of the group of stores that made up the top one-third of all sales. Equal interval classification is ideal for data whose range is already familiar, such as percentages or temperatures. Population counts or other data for which people do not have strong conceptual association with a data range may be better communicated using another classification method. In the map below, population is mapped using an equal interval classification. This map shows that there is a very great disparity between the least populous and the most populous provinces in this area. Because the Chinese provinces are so populous, all the provinces in Vietnam, Laos and Thailand fall into the first class (161,000 7,879,000). Clearly, the equal interval classification is not a good one to use if you want to reveal subtle differences between features with fairly similar values.

Chapter 6 Symbolizing your data 108 Standard Deviation Standard deviation shows you the extent to which an attribute s values differ from the mean of all the values. When data is classified using the standard deviation method, ArcView finds the mean value and then places class breaks above and below the mean at intervals of either 1, 0.5, or 0.25 standard deviations, until all the data values are included in a class. ArcView will aggregate any values beyond three standard deviations from the mean into two classes: greater than three standard deviations above the mean ( >3 Std. Dev. ) and less than three standard deviations below the mean ( < -3 Std. Dev. ). On a graduated color map, the default color ramp for this classification is dichromatic (e.g., blue to red) and the mean data value is given a neutral color (e.g., white). The chart below shows the same set of attribute values presented on the vertical axis so you can compare this chart to those for the other classification methods. In the map below, population is mapped using a standard deviation classification. All the provinces in Vietnam, Laos, and Thailand all fall slightly below the mean, while the provinces in China fall well above the mean. It is clear from the population classes listed in the view s Table of Contents that the population data is skewed by the highly populous Chinese provinces, because there is only one class below the mean but there are seven classes above the mean.

Chapter 6 Symbolizing your data 109