Graphics and Web Design Based on Edward Tufte's Principles




Last Modified: 2/17/05
http://www.washington.edu/computing/training/560/zz-tufte.html (6/20/2005 3:59:16 PM)

This is an outline of Edward Tufte's pioneering work on the use of graphics to display quantitative information. It consists mainly of text and ideas taken from his three books on the subject, along with some additional material of my own. This page is in text-only format: to understand the concepts you need to read the books, because the concepts cannot really be grasped without the illustrations, and current video monitor technology is too low in resolution to do them justice. His work has been described as "a visual Strunk and White" (a German translation of this article also exists).

Throughout this outline I have included references to the illustrations in his books, labeled with the abbreviations VD-pp, VE-pp, and EI-pp, where "pp" is a page number and:

- VD is "The Visual Display of Quantitative Information"
- VE is "Visual Explanations"
- EI is "Envisioning Information"

Outline

1. Introduction
2. History of Plots
3. The Explanatory Power of Graphics
4. Basic Philosophy of Approach
5. Graphical Integrity
6. Data Densities
7. Data Compression
8. Multifunctioning Graphical Elements
9. Maximize data-ink; minimize non-data ink

10. Small Multiples
11. Chartjunk
12. Colors
13. General Philosophy for Increasing Data Comprehension
14. Techniques for Increasing Data Comprehension
15. When NOT to Use Graphics
16. Aesthetics

Introduction

Tufte's works address the following issues:

- The Problem: Presenting large amounts of information in a way that is compact, accurate, adequate for the purpose, and easy to understand. Specifically, to show cause and effect, to ensure that the proper comparisons are made, and to achieve the (valid) goals that are desired.
- Its Importance: Printed and graphical information is now the driving force behind all of our lives. It is no longer confined to specialized workers in selected fields but affects nearly all people through the widespread use of computing and the Internet. Rapid and accurate transfer of information can be a life-and-death matter for many people (an example being the Challenger disaster). The extent to which symbols and graphics affect our lives can be seen in the dramatic increase in IQ scores in all cultures that have acquired information technology: in the United States there has been an average increase of 3 IQ points per decade over the last 60 years, for a total increase of 18 IQ points. There is no known biological explanation for this increase, and the most likely cause is widespread exposure to the text, symbols, and graphics that accompany modern life.
- Its Application: Some of the information relates to displays of statistical information, but much applies to any type of display, even plain text.

- The Solution: To develop a consistent approach to the display of graphics which enhances its dissemination, accuracy, and ease of comprehension.

History of Plots

The very first known plot dates back to the 10th century (VD-28: first known graph). This was about the same time that Guido of Arezzo was developing the two-dimensional musical staff notation very similar to the one we use today. In the 15th century Nicolas of Cusa developed graphs of distance versus speed. In the 17th century René Descartes established analytic geometry, which was used only for the display of mathematical functions. But the main initiator of informative graphics was William Playfair (1759-1823), who developed the line, bar, and pie charts as we know them today.

The Explanatory Power of Graphics

The importance and explanatory power of graphics can be seen in these examples:

- Illustration VD-13/14 shows 4 plots that share a large number of identical statistical measures and properties and yet are very different, as can be seen immediately from their graphs.
- The Challenger disaster: the data graphs shown to NASA did not convey the information that was needed (VE-47 versus VE-45). If NASA had seen the appropriate, but very simple, graphics showing the relationship between low temperature and damage to the solid rocket boosters, the Challenger would not have been launched that (very cold) day.
- The Broad Street Pump cholera epidemic of 1854 in London, as displayed by John Snow (VE-31: cholera deaths). This graph showed clusters of cholera deaths around the site of the pump.
- Illustration VD-166: "communes in France" shows an extremely dense plot which displays the boundaries of more than 30,000 communes in France.

Basic Philosophy of Approach

Important rules and themes to use when presenting graphics:

- Assume that the audience is intelligent (a paraphrase from E.B. White). Even publications such as the NY Times assume that people are intelligent enough to read complex prose but too stupid to read complex graphics.
- Don't limit people by "dumbing down" the data -- allow people to use their abilities to get the most out of it.
- To clarify, add detail (don't omit important detail; e.g., serif fonts are more "detailed" than sans serif fonts but are actually easier to read). And Einstein once said that "an explanation should be as simple as possible, but no simpler".
- Above all else, show the data.
- Graphics is "intelligence made visible".
- Data-rich plots can show huge amounts of information from many different perspectives: cause and effect, relationships, parallels, etc. (VD-31: train schedule, VD-17: choropleth map, VD-41: Napoleon's campaign, EI-49: space junk)
- Plots need annotation to show data, data limitations, authentication, and exceptions (VE-32: text of exceptions)
- Don't use graphics to decorate a few numbers.

Graphical Integrity

In addition to "lies, damn lies, and statistics", graphics can also be used to deceive. For example, deceptive graphics may:

- Compare full time periods with shorter time periods (VD-60: Nobel prizes, which compares 10-year periods with one 5-year period)
- Use a "lie factor" [= (size of effect shown in graphic)/(size of effect in data)] to exaggerate differences or similarities
- Use area or volume representations instead of linear scales to exaggerate differences. See VD-69: "Shrinking family doctor" for an example of how to confuse people using 1- versus 2- and 3-dimensional size comparisons. Area and volume representations fool people through the square/cube law: multiplying the linear size by some factor multiplies the area by the square of that factor and the volume by its cube.
- Fail to adjust for population growth or inflation in financial graphs
- Make use of design variation to obscure or exaggerate data variation (VD-61: exaggeration of OPEC prices)
- Exaggerate the vertical scale
- Show only part of a cycle so that data from other parts of the cycle cannot be used for proper comparison

Graphical errors may be more common today than in the past due to the easy and frequent use of computers. Guidelines to help ensure graphical integrity include the following:

- Avoid chartjunk
- Don't dequantify: provide real data as accurately as is reasonable. For example, ranking products as better or worse according to one criterion when several factors are involved is often not useful unless the magnitudes of the differences are indicated.
- Don't exaggerate for visual effect, unless it is needed to convey the information. Sometimes such exaggerations are essential: for example, it is virtually impossible to show both the sizes and the orbits of the planets at the right scale on the same chart. On the other hand, illustration VE-24: "Exaggerated vertical Venus scale" shows such dramatic misinformation that one researcher called for the formation of "a flat Venus society".
- Avoid disinformation: thick surrounding boxes and underlined sans serif text make reading more difficult

- Watch out for effects of aggregation: e.g., dot maps are often more honest in this respect than choropleth maps, which group results based on (sometimes arbitrary) boundaries.
- Ask the right questions:
  1. Does the display tell the truth?
  2. Is the representation accurate?
  3. Are the data documented?
  4. Do the display methods tell the truth?
  5. Are appropriate comparisons, contrasts, and contexts shown?

Data Densities

Graphics are at their best when they represent very dense and rich datasets. Tufte defines data density as follows:

Data density = (no. of entries in data matrix)/(area of graphic)

Note that low data densities on computer displays force us to view information sequentially rather than spatially, which is bad for comprehension.

Good quality graphics are:

- Comparative
- Multivariate
- High density
- Able to reveal interactions, comparisons, etc.
- Composed almost entirely of actual data ink

Example data densities include:

- 110,000 numbers/sq-inch for an astronomical graph. This is the maximum known density for a graph.
- 50-200 numbers/sq-inch for most scientific journals
- 150 Mbits = human eye
- 8 Mbits = typical computer screen
- 25 Mbits = color slide

- 150 Mbits = large foldout map
- 28,000 characters = reference book
- 18,000 characters = phone book
- 15,000 characters = non-fiction

An excellent example of a data-rich plot is a graphical train schedule (VD-31: train schedule), which shows start and stop times, locations, directions, routes, transfers, and speeds all on one sheet of paper.

Data Compression

- Use data compression to reveal (not hide) data. For example, EI-22: "Sun Spot cycles" displays sunspots as thin vertical lines in the y-axis direction only, in order to present many such spots over a period of time on a single graph.
- Use compression to show lots of information in a single graph, such as a plot that shows x-axis, y-axis, and x/y interactions (VD-134: Pulsar signals; VE-111).
- Exclude bilateral symmetry when it is redundant (e.g., Chernoff faces) or extend it when it aids comprehension (50% more view of the world on a world map provides a wrap-around context that aids understanding). Studies show that we often concentrate on one side of a symmetrical figure and only glance at the other side.

Multifunctioning Graphical Elements

Graphical structures can often serve several purposes at once. For example:

- Stem-and-leaf plots display sequences of numbers which directly portray structure by the physical length of each sequence (VD-140: stem/leaf; VD-141: army divisions; VD-143: Normal curve).
- The Consumer Reports listing of automobile defects (VD-174: Consumer Reports) reveals a micro/macro structure: the overall display of black ink immediately reveals which cars are most troublesome, whereas each individual element in the display identifies a particular weakness.
- The data grid itself may be the data, revealing both the values and the coordinate system at the same time (VD-152: data-based markers).

Maximize Data-ink; Minimize non-data Ink

Tufte defines the data-ink ratio as:

Data Ink Ratio = (data-ink)/(total ink in the plot)

The goal is to make this as large as is reasonable. To do this you:

- Avoid heavy grids
- Replace box plots with interrupted lines (VD-125: reduced box plot)
- Replace the enclosing box with an x/y grid
- Use white space to indicate grid lines in bar charts (VD-128: white spaces)
- Use tics (without a line) to show actual locations of x and y data
- Prune graphics by replacing bars with single lines, erasing non-data ink, eliminating lines from axes, and starting x/y axes at the data values (range frames)
- Avoid overly busy grids, excess ticks, redundant representation of simple data, boxes, shadows, pointers, and legends. Concentrate on the data and NOT the data containers.
- Always provide as much scale information (but in muted form) as is needed
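The ratios defined in this outline (the lie factor under Graphical Integrity, data density, and the data-ink ratio) are simple to compute by hand, but a short Python sketch makes the arithmetic explicit. The function names and all of the numbers below are illustrative assumptions, not figures from Tufte's books. The lie factor here compares the effect shown in the graphic with the effect in the data, with each effect measured as a relative change.

```python
def relative_change(first, second):
    """Relative change from first to second, e.g. 18 -> 27 gives 0.5."""
    return (second - first) / first


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Size of effect shown in the graphic divided by size of effect in the data.

    A value near 1 means the graphic is proportional to the data.
    """
    return (relative_change(graphic_first, graphic_second)
            / relative_change(data_first, data_second))


def apparent_growth(linear_factor, dimensions):
    """Square/cube law: growth of an area (2) or volume (3) icon
    when its linear size is multiplied by linear_factor."""
    return linear_factor ** dimensions


def data_density(entries, area):
    """Entries in the data matrix per unit area of the graphic."""
    return entries / area


def data_ink_ratio(data_ink, total_ink):
    """Share of the ink in a plot that presents data."""
    return data_ink / total_ink


# Hypothetical numbers chosen only to exercise the formulas.
print(lie_factor(1.0, 4.0, 18.0, 27.0))   # bar grows 300%, data grows 50%
print(apparent_growth(2, 2), apparent_growth(2, 3))
print(data_density(1200, 24.0))           # 1200 numbers on 24 sq-inches
print(data_ink_ratio(30.0, 40.0))
```

In this made-up case the bar grows six times faster than the data it represents, and an icon drawn twice as tall covers four times the area (eight times the volume if drawn as a solid), which is why a linear scale is the safer choice for one-dimensional data.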

Small Multiples

Small multiples are sets of thumbnail-sized graphics on a single page that represent aspects of a single phenomenon. They:

- Depict comparison and motion, enhance dimensionality, and are good for multivariate displays (VD-114: particle momentum)
- Invite comparisons and contrasts, and show the scope of alternatives or range of options (VE-111: medical charts)
- Must use the same measures and scale
- Can represent motion through ghosting of multiple images
- Are particularly useful on computers because they often permit the actual overlay of images and rapid cycling

Chartjunk

Chartjunk consists of decorative elements that provide no data and cause confusion.

- Tufte discusses the rule of 1+1=3 (or more): 2 elements in close proximity cause a visible interaction. Such interactions can be very fatiguing (e.g., moiré patterns, optical vibration) and can show information that is not really there (EI-60: data that is not there, VD-111: chart junk).
- In major science publications we see 2% to 20% moiré vibration. For example, in recent statistical and computer publications chartjunk ranges from 12% to 68%.
- Techniques to avoid chartjunk include replacing crosshatching with (pastel) solids or gray, using direct labeling as opposed to legends, and avoiding heavy data containers.
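The requirement that small multiples use the same measures and scale can be sketched in Python as follows. The idea is to compute one pair of axis ranges from all panels together and reuse it for every panel, so that visual differences between panels reflect differences in the data. The `shared_limits` helper and the sample panels are hypothetical and exist only for illustration.

```python
def shared_limits(panels):
    """Return ((x_min, x_max), (y_min, y_max)) covering every panel.

    panels maps a panel title to a list of (x, y) points.
    """
    xs = [x for points in panels.values() for x, _ in points]
    ys = [y for points in panels.values() for _, y in points]
    return (min(xs), max(xs)), (min(ys), max(ys))


# Made-up data: three panels that would each get a different
# automatic scale if they were plotted independently.
panels = {
    "north": [(1, 10), (2, 12), (3, 15)],
    "south": [(1, 40), (2, 42), (3, 48)],
    "west": [(0, 5), (2, 7), (4, 6)],
}

x_range, y_range = shared_limits(panels)
for title in panels:
    # Every panel would be drawn with the same two ranges.
    print(title, x_range, y_range)
```

Plotting libraries often offer an option to share axes across panels; computing the ranges explicitly, as here, is just one way to make the rule visible.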

Colors

Colors can often greatly enhance data comprehension.

- Layering with colors is often effective
- Color grids are a form of layer which provides context but which should be unobtrusive and muted
- Pure bright colors should be reserved for small highlight areas and almost never used as backgrounds
- Use color as the main identifier on computer screens, as different objects are often considered the same if they have the same color regardless of their shape, size, or purpose
- Contour lines that change color based on the background stand out without producing the 1+1=3 effects
- Colors can be used as labels, as measures, and to imitate reality (e.g., blue lakes on maps)
- Don't place bright colors mixed with white next to each other
- Color spots against a light gray are effective
- Colors can convey multi-dimensional values
- Scroll bars should be solid pastel colors
- Note that surrounding colors can make two different colors look alike, and two similar colors look very different (EI-92/93: effects of context on colors)
- Subtle shades of color or gray scale are best if they are delimited with fine contour lines (EI-94: shades with contours)
- Be aware that 5-10% of people are color blind to some degree (red-green is the most common type followed by blue-yellow, which usually includes blue-green)

General Philosophy for Increasing Data Comprehension

- High density is good: the human eye/brain can select, filter, edit, group, structure, highlight, focus, blend, outline, cluster, itemize, winnow, sort, abstract, smooth, isolate, idealize, summarize, etc. Give people the data so they can exercise their full powers -- don't limit them.
- Clutter and confusion are failures of design, not of complexity
- Information consists of differences that make a difference, so you can "hide" data that does not make a difference in what you are trying to depict
- In showing parallels, only the relevant differences should appear
- Value and power of parallelism: once you have seen one element all the others are accessible
- Important concepts in good design: separating figure and background (for example, a blurry background often brings the foreground into sharper focus), layering and separation, use of white space (e.g., Chinese landscapes emphasize space, as in the work of the painter known as "one corner Ma"; oriental music is often there to emphasize the silence and not the sound)
- Graphics should emphasize the horizontal direction

Techniques for Increasing Data Comprehension

To increase data comprehension you:

- Make marks or labels as small as possible while still being clear
- Avoid pie charts as they are low density and fail to order values along a visual dimension
- Usually use dot maps in place of choropleth maps because they show more exact detail
- Closely interweave text and graphics: attach names directly to parts, place small messages next to the data, avoid legends if possible, and annotate the data directly on the graph (VE-99: anatomy of a font)
- Avoid abbreviations if possible, and use horizontal text
- Use serif fonts in upper/lower case
- Use scaling transforms if they (honestly) can reveal information which might otherwise be overlooked
- Use different structures to reveal 3D and motion, such as the exploded hexagon, true stereo, and extreme foreshortening (as on the edge of a sphere: see EI-15 "exploded hexagon")

When NOT to Use Graphics

- Often text tables can replace graphs for simple data; you can also use 2D text tables, where row and column summaries are useful
- Non-comparative data sets usually belong in tables, not charts
- Poster designs are meant just to capture attention, as opposed to conveying information -- generally they are not good designs for graphs
- If a picture is not worth a 1000 words, to hell with it (quote from Ad Reinhardt -- note that this derives from the original Chinese saying that "a picture is worth 10,000 words")

Aesthetics

Graphical excellence consists of simplicity of design and complexity and truth of data. To achieve this:

- Use words, numbers, and drawings in close proximity
- Display an accessible complexity of data
- Let the graphics tell the story
- Avoid context-free decoration
- Use lines of different weights as an attractive and compact way to display data (VD-185: Mondrian)
- Make use of symmetry to add beauty (although someone once said that "all true beauty requires some degree of asymmetry")
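For the simple data sets mentioned under "When NOT to Use Graphics", a 2D text table with row and column summaries needs very little code. The Python sketch below is one possible layout; the `text_table` function, its column width, and the sample counts are assumptions made for illustration.

```python
def text_table(row_labels, col_labels, values, width=8):
    """Format a 2D table of numbers with a Total column and a Total row."""
    lines = []
    header = "".ljust(width) + "".join(c.rjust(width) for c in col_labels)
    lines.append(header + "Total".rjust(width))
    for label, row in zip(row_labels, values):
        cells = "".join(str(v).rjust(width) for v in row)
        lines.append(label.ljust(width) + cells + str(sum(row)).rjust(width))
    # Column summaries: transpose the rows and sum each column.
    col_totals = [sum(col) for col in zip(*values)]
    cells = "".join(str(t).rjust(width) for t in col_totals)
    lines.append("Total".ljust(width) + cells + str(sum(col_totals)).rjust(width))
    return "\n".join(lines)


# Made-up counts for two products over three quarters.
table = text_table(["A", "B"], ["Q1", "Q2", "Q3"], [[3, 5, 4], [2, 6, 1]])
print(table)
```

Right-aligning the numbers keeps digits of equal place value in the same column, so the values can be compared by eye without any chart.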

1999 UW Computing & Communications