Quantitative vs. Categorical Data: A Difference Worth Knowing Stephen Few April 2005

Similar documents
Common Mistakes in Data Presentation Stephen Few September 4, 2004

Visualizing Multidimensional Data Through Time Stephen Few July 2005

Dashboard Design for Rich and Rapid Monitoring

Examples of Data Representation using Tables, Graphs and Charts

Visualization Quick Guide

Valor Christian High School Mrs. Bogar Biology Graphing Fun with a Paper Towel Lab

Quantitative Displays for Combining Time-Series and Part-to-Whole Relationships

Effectively Communicating Numbers

EXPERIMENTAL DESIGN REFERENCE

Scientific Graphing in Excel 2010

Principles of Data Visualization for Exploratory Data Analysis. Renee M. P. Teate. SYS 6023 Cognitive Systems Engineering April 28, 2015

Introduction to Geographical Data Visualization

Excel Chart Best Practices

Scope and Sequence KA KB 1A 1B 2A 2B 3A 3B 4A 4B 5A 5B 6A 6B

Excel -- Creating Charts

Sometimes We Must Raise Our Voices

Unit 9 Describing Relationships in Scatter Plots and Line Graphs

Tutorial 3: Graphics and Exploratory Data Analysis in R Jason Pienaar and Tom Miller

FREE FALL. Introduction. Reference Young and Freedman, University Physics, 12 th Edition: Chapter 2, section 2.5

Unit Charts Are For Kids

Dealing with Data in Excel 2010

Best Practices. Create a Better VoC Report. Three Data Visualization Techniques to Get Your Reports Noticed

Visualization Software

Descriptive Statistics and Measurement Scales

Choosing Colors for Data Visualization Maureen Stone January 17, 2006

Introduction to Microsoft Excel 2007/2010

Scatter Plots with Error Bars

Part 1: Background - Graphing

Lab 11: Budgeting with Excel

Criteria for Evaluating Visual EDA Tools

Tableau Data Visualization Cookbook

CBA Fractions Student Sheet 1

Creating an Excel XY (Scatter) Plot

Numeracy and mathematics Experiences and outcomes

Describing, Exploring, and Comparing Data

Math Content by Strand 1

CHAPTER TWELVE TABLES, CHARTS, AND GRAPHS

Exploring All of Your Options

Excel Tutorial. Bio 150B Excel Tutorial 1

Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs

-2- Reason: This is harder. I'll give an argument in an Addendum to this handout.

Bar Charts, Histograms, Line Graphs & Pie Charts

Unit 13 Handling data. Year 4. Five daily lessons. Autumn term. Unit Objectives. Link Objectives

Information Visualization Multivariate Data Visualization Krešimir Matković

Consolidation of Grade 3 EQAO Questions Data Management & Probability

Using SPSS, Chapter 2: Descriptive Statistics

Open-Ended Problem-Solving Projections

Introduction to the TI-Nspire CX

University of Arkansas Libraries ArcGIS Desktop Tutorial. Section 2: Manipulating Display Parameters in ArcMap. Symbolizing Features and Rasters:

HOW MUCH WILL I SPEND ON GAS?

2.5 Transformations of Functions

4.4 Transforming Circles

This file contains 2 years of our interlibrary loan transactions downloaded from ILLiad. 70,000+ rows, multiple fields = an ideal file for pivot

1 One Dimensional Horizontal Motion Position vs. time Velocity vs. time

Creating Charts in Microsoft Excel A supplement to Chapter 5 of Quantitative Approaches in Business Studies

6.4 Normal Distribution

CS171 Visualization. The Visualization Alphabet: Marks and Channels. Alexander Lex [xkcd]

Updates to Graphing with Excel

Measurement with Ratios

Concepts of Variables. Levels of Measurement. The Four Levels of Measurement. Nominal Scale. Greg C Elvers, Ph.D.

Three daily lessons. Year 5

Table of Contents Find the story within your data

Effective Big Data Visualization

Create Charts in Excel

Charts, Tables, and Graphs

Copyright 2008 Stephen Few, Perceptual Edge Page of 11

Student Writing Guide. Fall Lab Reports

Bar Graphs with Intervals Grade Three

What Does the Normal Distribution Sound Like?

PIZZA! PIZZA! TEACHER S GUIDE and ANSWER KEY

Problem of the Month: Cutting a Cube

ITS Training Class Charts and PivotTables Using Excel 2007

7 Relations and Functions

Assignment 2: Animated Transitions Due: Oct 12 Mon, 11:59pm, 2015 (midnight)

Drawing a histogram using Excel

Charlesworth School Year Group Maths Targets

EVERY DAY COUNTS CALENDAR MATH 2005 correlated to

Pennsylvania System of School Assessment

Lecture 2: Descriptive Statistics and Exploratory Data Analysis

1 BPS Math Year at a Glance (Adapted from A Story of Units Curriculum Maps in Mathematics P-5)

x 2 + y 2 = 1 y 1 = x 2 + 2x y = x 2 + 2x + 1

Information visualization examples

Chapter 2: Frequency Distributions and Graphs

Stephen Few, Perceptual Edge Visual Business Intelligence Newsletter November/December 2008

Exploratory data analysis (Chapter 2) Fall 2011

Variable: characteristic that varies from one individual to another in the population

Managerial Economics Prof. Trupti Mishra S.J.M. School of Management Indian Institute of Technology, Bombay. Lecture - 13 Consumer Behaviour (Contd )

Lesson 2: Constructing Line Graphs and Bar Graphs

1 of 7 9/5/2009 6:12 PM

Part 1 Expressions, Equations, and Inequalities: Simplifying and Solving

Grade 6 Mathematics Performance Level Descriptors

One-Way ANOVA using SPSS SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate

This activity will show you how to draw graphs of algebraic functions in Excel.

Independent samples t-test. Dr. Tom Pierce Radford University

The Center for Teaching, Learning, & Technology

Numbers as pictures: Examples of data visualization from the Business Employment Dynamics program. October 2009

Transcription:

Quantitative vs. Categorical Data: A Difference Worth Knowing Stephen Few April 2005 When you create a graph, you step through a series of choices, including which type of graph you should use and several aspects of its appearance. Most people walk through these choices as if they were sleepwalking, with only a vague sense at best of what works, of why one choice is better than another. Without guiding principles rooted in a clear understanding of graph design, choices are arbitrary and the resulting communication fails in a way that can be costly to the business. To communicate effectively using graphs, you must understand the nature of the data, graphing conventions and a bit about visual perception not only what works and what doesn't, but why. This month's column focuses on the nature of quantitative information. Graphs display quantitative information: numbers that measure performance, predict the future and identify opportunities. The nature of quantitative information varies in some fundamental ways that tie directly to some of the choices you must make when graphing that information. Quantitative information consists not only of numbers, but also of data that identifies what the numbers mean. If I walked up to you, looked you in the eye, and said, "The answer is 24,901," you would probably be confused, understandably suspicious that I had a few screws loose. By itself, a number means nothing. However, if I were to tell you that the circumference of the earth at the equator is 24,901 miles, that would mean something. To be complete and meaningful, quantitative information consists of both quantitative data the numbers and categorical data the labels that tell us what the numbers measure. The graph in Figure 1 highlights this distinction by displaying the categorical data in black and the quantitative data in various other colors. (Note: The gray axes, excluding the red tick marks, are neither quantitative nor categorical data, and in fact are not data at all, but simply visual objects that support the graph by defining the plot area.) Perceptual Edge Quantitative vs. Categorical Data: A Difference Worth Knowing Page 1

Figure 1: Illustration of the difference between quantitative data (red) and categorical data (black). The graph in Figure 1 displays two scales: a quantitative scale along the vertical axis and a categorical scale along the horizontal axis. They differ in what they identify: quantitative values on the one hand and categorical items on the other. Most two-dimensional graphs consist of one quantitative scale and one categorical scale, although a familiar exception is the scatterplot, which has quantitative scales along both axes (see Figure 2). In a line graph, the categorical scale always appears on the horizontal axis. In a bar graph, the categorical scale can appear on either axis, with bars running horizontally or vertically. Data points simple symbols such as dots, squares, triangles and so forth are rarely used by themselves to encode values other than in scatterplots. Unlike bars and lines, data points can encode a quantitative value simultaneously along two scales. Figure 2: A scatterplot is the only commonly used 2-D graph that lacks a categorical scale along one of its two axes. Perceptual Edge Quantitative vs. Categorical Data: A Difference Worth Knowing Page 2

Three Types of Categorical Scales When used in graphs, categorical scales come in three fundamental types: nominal, ordinal and interval. Nominal scales consist of discrete items that belong to a common category, but really don't relate to one another in any particular way. They differ in name only (that is, nominally). The items in a nominal scale, in and of themselves, have no particular order and don't represent quantitative values. Typical examples include regions (e.g., The Americas, Asia and Europe) and departments (e.g., sales, marketing and finance). Ordinal scales consist of items that have an intrinsic order, but like a nominal scale, the items in and of themselves do not represent quantitative values. Typical examples involve rankings, such as "A, B and C," "small, medium and large," and "poor, below average, average, above average and excellent." Figure 3: Examples of nominal, ordinal and interval scales. Interval scales also consist of items that have an intrinsic order; but in this case, they represent quantitative values as well. An interval scale starts out as a quantitative scale that is then converted into a categorical scale by subdividing the range of values into a sequential series of smaller ranges of equal size and by giving each range a label. Consider the quantitative range that appears along the vertical scale in Figure 2. This range, from 55 to 80, could be converted into a categorical scale consisting of the following smaller ranges: 1. > 55 and 3/4 60, 2. > 60 and 3/4 65, 3. > 65 and 3/4 70, 4. > 70 and 3/4 75, and 5. > 75 and 3/4 80. Here's a quick (and somewhat sneaky) test to see how well you've grasped these concepts. Can you identify the type of categorical scale that appears in Figure 4? Figure 4: Example of a categorical scale that is commonly used in graphs. Can you determine which kind it is? Perceptual Edge Quantitative vs. Categorical Data: A Difference Worth Knowing Page 3

Months of the year obviously have an intrinsic order, which leaves the question: "Do the items correspond to quantitative values?" In fact, they do. Units of time such as years, quarters, months, weeks, days and hours are measures of quantity, and the individual items in any given unit of measure (e.g., years) represent equal intervals. Actually, months aren't exactly equal and even years vary in size occasionally due to leap years, but they are close enough in size to constitute an interval scale. Categorical Scales and Graph Design The primary graph design principle that applies to this distinction between nominal, ordinal and interval scales involves the use of lines to encode quantitative data. You should only use lines (as in a line graph) to encode data along an interval scale. In nominal and ordinal scales, the individual items are not related closely enough to be linked with lines, so you should use bars instead. The strength of lines in a graph is their ability to reveal the trend of and patterns in the data. Lines suggest change from one item to the next, but change isn't happening if the items aren't closely related as sequential subdivisions of a continuous range of values. For instance, it is appropriate to use lines to display change from one day to the next or from one price range to the next, but not from one sales region to the next. (See Figure 5.) Figure 5: Examples of inappropriate and appropriate uses of lines in a graph. With interval scales, you are not forced in all cases to use lines; you can use bars as well. If you want to emphasize the overall shape of the data or changes from one item to the next, lines work best. If, however, you want to emphasize individual items, such as individual Perceptual Edge Quantitative vs. Categorical Data: A Difference Worth Knowing Page 4

months, or to support discrete comparisons of multiple values at the same location along the interval scale, such as revenues and expenses for individual months, then bars work best. These concepts are relevant to many issues and principles of graph design, so keep them handy as you continue to read this column in the future to learn more about data visualization. (This article was originally published in DM Review.) About the Author Stephen Few has worked for over 20 years as an IT innovator, consultant, and teacher. Today, as Principal of the consultancy Perceptual Edge, Stephen focuses on data visualization for analyzing and communicating quantitative business information. He provides training and consulting services, writes the monthly Visual Business Intelligence Newsletter, speaks frequently at conferences, and teaches in the MBA program at the University of California, Berkeley. He is the author of two books: Show Me the Numbers: Designing Tables and Graphs to Enlighten and Information Dashboard Design: The Effective Visual Communication of Data. You can learn more about Stephen s work and access an entire library of articles at www.perceptualedge.com. Between articles, you can read Stephen s thoughts on the industry in his blog. Perceptual Edge Quantitative vs. Categorical Data: A Difference Worth Knowing Page 5