A Tutorial on Color Symbolization and Data Classification for Mapping and Visualization
Cynthia Brewer, Penn State Geography
Prepared for the STIS conference sponsored by BioMedware, January 9-10, 2003, in Ann Arbor
Color Symbolization
www.colorbrewer.org, with Mark Harrower
Digital Government Quality Graphics, NSF Grant No. 9983451
www.geovista.psu.edu
Full view - solution
Color scheme types
- Sequential: light-to-dark; low-to-high data
- Diverging: dark-light-dark, two hues; emphasize critical midrange in ordered data
- Qualitative: different hues, similar lightness; categorical data
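The three scheme types can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard-library `colorsys` HLS model, which is not a perceptual system like Munsell, and the function names, hues, and lightness limits are hypothetical choices, not ColorBrewer's actual values.

```python
import colorsys

def hls_to_hex(h, l, s):
    """Convert hue, lightness, saturation (each 0..1) to a hex color string."""
    r, g, b = colorsys.hls_to_rgb(h, l, s)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))

def sequential(n, hue=0.6, light=0.9, dark=0.25, sat=0.6):
    """Light-to-dark steps of a single hue (n >= 2): low-to-high data."""
    step = (light - dark) / (n - 1)
    return [hls_to_hex(hue, light - i * step, sat) for i in range(n)]

def diverging(n, hue_low=0.6, hue_high=0.03, light=0.9):
    """Dark-light-dark with two hues and a light neutral middle (n odd, n >= 3)."""
    if n < 3 or n % 2 == 0:
        raise ValueError("this sketch expects an odd number of classes >= 3")
    half = (n - 1) // 2
    # Each arm is a sequential ramp with its lightest step dropped.
    low_arm = sequential(half + 1, hue_low, light=light)[1:][::-1]   # dark -> lighter
    high_arm = sequential(half + 1, hue_high, light=light)[1:]       # lighter -> dark
    center = hls_to_hex(0.0, light, 0.0)  # neutral light gray for the critical midrange
    return low_arm + [center] + high_arm

def qualitative(n, light=0.55, sat=0.6):
    """Different hues at the same lightness: categorical data."""
    return [hls_to_hex(i / n, light, sat) for i in range(n)]

print(sequential(5))
print(diverging(5))
print(qualitative(4))
```

Because HLS lightness is not perceptually uniform, real schemes should be checked against a perceptual system before use; the sketch only shows the structure of each type.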
Example maps from the Mapping Census 2000 atlas: Sequential, Diverging, Qualitative
Use perceptual system such as Munsell (HVC)
Figure from www.munsell.com/munsell1.htm
Break for Munsell chip organization task to be sure you understand the perceptual dimensions of color
Hue and lightness, sequential schemes from ColorBrewer
Sequential scheme paths: hue and value graph
Sequential paths collapsed
Example chroma & value paths for sequential schemes
Diverging paths collapsed
Example diverging schemes
Qualitative schemes
Summary Each scheme type has a characteristic path through perceptual color space
Classification
Classification for map comparison is a key issue in multi-map contexts
Stroke, White Male Time Series Matched legends
Classification literature
Brewer & Pickle (Dec. 2002, Annals of AAG)
Reviews: Jenks, Coulson, Evans, Paslawski, Slocum
On comparison (most in 70s): Monmonier, Lloyd & Steinke, Olson, Muller
1990s: Cromley
Experiment
Classification: 7 map series, 6 questions each, 58 subjects
Matched legends: 2 map series, one with matched legends, 48 subjects
Questions about polygons, regions, whole maps; within-map and comparison questions
Classification types
Tested:
- Quantile (percentile)
- Min. boundary error
- Jenks optimization, Natural breaks
- Equal interval with class for extremes
- Mean and st. deviation
- Shared area
- Box plot based
Others:
- Arithmetic, Geometric
- Nested means
- Significance based
- Equal area
- Min. difference from class midpoints or medians
- Critical values
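A minimal Python sketch of three of the simpler methods listed above, to show how class breaks differ on skewed data. The function names and the sample rates are hypothetical. The equal-interval version here is the plain form, without the separate class for extremes used in the tested hybrid, and Jenks optimization and minimum boundary error are not attempted.

```python
import bisect
import statistics

def quantile_breaks(values, k):
    """k-1 interior breaks that put roughly equal counts in each of k classes."""
    return statistics.quantiles(values, n=k, method="inclusive")

def equal_interval_breaks(values, k):
    """k-1 interior breaks that split the data range into k equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]

def mean_sd_breaks(values, multiples=(-1, 0, 1)):
    """Breaks at the mean plus the given multiples of the standard deviation."""
    m = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [m + c * sd for c in multiples]

def classify(value, breaks):
    """Class index (0-based) of a value given sorted interior breaks."""
    return bisect.bisect_right(breaks, value)

# Hypothetical skewed rates: one extreme value dominates the range.
rates = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
for name, breaks in [("quantile", quantile_breaks(rates, 5)),
                     ("equal interval", equal_interval_breaks(rates, 5))]:
    counts = [0] * 5
    for r in rates:
        counts[classify(r, breaks)] += 1
    print(name, breaks, counts)
```

On data like this, quantile breaks spread the observations evenly across classes, while plain equal intervals leave most classes empty.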
Map series
- Lung cancer for WM, WF, BM, BF
- HIV, unintentional injuries for WM, BM
- All causes, heart, cancer, stroke (WF)
- Motor vehicle, suicide, homicide (WM) and % urban
- Breast cancer (WF), income, education, urban
- Heart disease, 4 time periods (WM)
- Stroke, lung cancer, 2 time periods each (WM)
- Liver disease, COPD for WM, WF
- Stroke, 4 time periods (WM)
Quantile example map Stroke, White Female
Hybrid equal interval example Stroke, White Female
Quantile data classing
Hybrid equal interval
Results Graph
Conclusions
Classifications suited for choropleth maps in series intended for a wide range of map reading tasks:
- Quantile
- Minimum boundary error
- Natural breaks (Jenks)
Use same legends
Matched legends aid map comparison: 28% improvement!
Difficult tasks
- interpreting broader map patterns
- comparing patterns between maps
- questions requiring map legend reading
Classing strategies for series
- share class breaks between maps
- use meaningful breaks: national rate, median, zero, threshold
- round aggressively
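The last two strategies above can be sketched as a small helper that rounds computed breaks to few significant digits and adds a meaningful value. The helper names, the raw breaks, and the value 50 standing in for something like a national rate are all hypothetical.

```python
import math

def round_break(x, sig=1):
    """Round x to `sig` significant digits (aggressive rounding for legends)."""
    if x == 0:
        return 0
    exponent = math.floor(math.log10(abs(x)))
    factor = 10 ** (exponent - sig + 1)
    return round(x / factor) * factor

def tidy_breaks(raw_breaks, meaningful=(), sig=1):
    """Round every break, add meaningful values, and drop duplicates."""
    breaks = {round_break(b, sig) for b in raw_breaks}
    breaks.update(meaningful)
    return sorted(breaks)

# Hypothetical raw breaks from some classification, plus a meaningful value of 50.
print(tidy_breaks([23.7, 41.2, 58.9, 77.3], meaningful=[50]))
```

Applying the same tidied list to every map in a series is one simple way to share class breaks between maps.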
Two change maps: One race or One or more including
Example comparison
One race, AIAN: 26.4%
One or more, AIAN: 110.3%
Classing series
- class aggregate of all data for series
- use many classes/colors
- map with subset of classes for each map
- limit to true max and min within each map
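One way to read the steps above, as a Python sketch: build class edges from the pooled data of the whole series, then give each map only the classes it needs, with the outer edges trimmed to that map's true minimum and maximum. The function names, the integer step of 10, and the three small data sets are assumptions for illustration.

```python
import bisect
import math

def pooled_edges(series, step):
    """Class edges at multiples of an integer `step` covering all data in the series."""
    pooled = [v for values in series.values() for v in values]
    start = math.floor(min(pooled) / step) * step
    stop = math.ceil(max(pooled) / step) * step
    return list(range(start, stop + step, step))

def legend_for_map(values, edges):
    """Return (index of first shared class used, legend edges for this map).

    The legend keeps only the shared classes the map's data fall in, and its
    outer edges are replaced by the map's true minimum and maximum.
    """
    lo, hi = min(values), max(values)
    first = max(bisect.bisect_right(edges, lo) - 1, 0)
    last = min(bisect.bisect_left(edges, hi), len(edges) - 1)
    legend = edges[first:last + 1]
    legend[0], legend[-1] = lo, hi
    return first, legend

# Hypothetical series of three maps.
series = {"A": [3, 12, 27], "B": [22, 35, 58], "C": [5, 48, 76]}
edges = pooled_edges(series, step=10)
print(edges)
for name, values in series.items():
    print(name, legend_for_map(values, edges))
```

The returned class index lets each map reuse the same color for the same shared class, so a given color means the same data range on every map in the series.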
Classing series examples
[Figure: example legends for maps in a series, with class breaks spanning 0 to 80]
Same colors each map? Or?
Same map pair with and without matched legends
Color and classes reveal your data; use them smartly
www.personal.psu.edu/cab38
www.colorbrewer.org
cbrewer@psu.edu