COMP 388/441: Human-Computer Interaction
Data Visualization: April 10, 2013

Today's Topics
  - Overview of visualization techniques
      1D charts, 2D plots, 3D+ techniques, maps
  - A few guidelines for scientific visualization
      methods, guidelines
  - Survey of visualization tools and software

Note: What this lecture is NOT
  - fully comprehensive, by any means
  - strongly advocating any one tool for all uses
  - a survey of creative uses of these tools

1D techniques
  - Bar Charts
  - Pie Charts

Simple 2D plotting
  - Line plots
  - Scatter plots

Ancient plotting techniques
  - The stem-and-leaf plot?
      row: first digit, column: second digit
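The stem-and-leaf layout described above (row: first digit, column: second digit) can be sketched in a few lines of Python. This is a minimal illustration, assuming integer data between 0 and 99; the function name `stem_and_leaf` and the sample values are invented for this example.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the text lines of a stem-and-leaf plot for integers in 0..99."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # first digit picks the row, second digit is the entry
        rows[stem].append(leaf)
    if not rows:
        return []
    lines = []
    # Walk every stem in range so that empty rows stay visible as gaps.
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines


# Hypothetical sample data, invented for illustration.
scores = [12, 15, 21, 23, 23, 38, 41, 47, 47, 49]
print("\n".join(stem_and_leaf(scores)))
```

For the sample data this prints four rows: `1 | 2 5`, `2 | 1 3 3`, `3 | 8`, and `4 | 1 7 7 9`. The length of each row already hints at the frequency of values in that range.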
Better: histograms
  - Plots frequency of 1D data in bins
  - Equivalent in 2D: contour maps of scatter plot densities

Scatterplot matrix
  - Useful for viewing multiple relationships simultaneously

Q-Q plots
  - Comparing data across distributions
  - Used to quickly determine if a scaling relation exists between two distributions
  - Linear = scaling relation exists
  - Most common: normal Q-Q plots, which implicitly test normality

Global Maps
  - 3D surface on 2D leads to distortions
  - Mercator: scale increases near poles
  - Gall-Peters: distorts shape horizontally for equal areas
  - Mollweide: warps less dramatically at poles
  - Goode's: is equal area, but sacrifices distances
  - Robinson: a compromise, neither equal area nor conformal
  - However, for smaller maps this is not an issue

Choropleth maps
  - Useful in indicating one dimension of information as an overlay on a map

Flow map
  - Indicating location, strength, and possibly time
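The note in the projection list above that the Mercator scale increases near the poles can be made concrete. For the spherical Mercator projection, the linear scale factor at latitude φ is 1/cos(φ), so areas are exaggerated by the square of that factor. The sketch below is only an illustration; the function name `mercator_scale` is invented here.

```python
import math


def mercator_scale(lat_deg):
    """Linear scale factor of the spherical Mercator projection at a latitude in degrees."""
    return 1.0 / math.cos(math.radians(lat_deg))


for lat in (0, 30, 60, 80):
    k = mercator_scale(lat)
    # Lengths stretch by k, so areas stretch by k squared.
    print(f"latitude {lat:2d}: lengths x{k:.2f}, areas x{k * k:.2f}")
```

At 60 degrees the factor is 2: lengths are doubled and areas quadrupled relative to the equator. This is why high-latitude regions look so large on a Mercator world map.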
Graduated symbol maps
  - Capable of showing multiple dimensions of data graphically
  - Here: overall population AND % Hispanic for each state

Cartogram of 2012 election
  - Warping areas to represent data
  - Which map doesn't help you see who won?
  - [Figure panels: cartogram: counties; cartogram: states]

Treemap - US Budget
  - Like cartogram, but when location doesn't matter

Visualizations summary
  - 1D: pie charts, bar charts
  - 2D: line plots, scatter plots, histograms
  - 2D+: scatter-plot matrix, contour maps
  - Maps: global projections, choropleth, graduated symbols
  - Using area as a dimension: cartogram, treemap

Scientific Visualization Guidelines
  Keep It Simple... (KISS)
  Primary goals
  - quick to understand - use simple, standard forms
  - highlight the important aspect of the data
  - avoid misrepresentation/biased interpretation
  IMPORTANT: For output formats that scale/print well...
  - use vector graphics: SVG, EPS, PDF
      software: Inkscape (free), Illustrator (expensive)
  - instead of raster graphics: gif, jpg, png, tiff...
      software: Gimp (free), Photoshop

The following tips are a sample from:
  Kelleher, C., Wagener, T., Ten guidelines for effective data visualization in scientific publications, Environmental Modelling & Software (2011), doi:10.1016/j.envsoft.2010.12.006

Create the simplest graph that conveys the information you want to convey
Select meaningful axis ranges

Axis ranges across plots
  - Keep axis ranges similar to compare across plots

Using lines
  - Use lines only to connect sequential data
  - [Figure annotations: "Appears to not change in interval"; "Implies data is not known"]

A very brief survey of tools you can use for data visualization

Tools for data visualization
  - Local data vs. stored on database
  - Desktop, server vs. client rendered
  - Browser compatibility
      static images vs javascript and SVG
  - Expertise necessary
      Novice: spreadsheets
      Intermediate: manipulating scripts, graphical selections
      Advanced: Python/pylab, Weka, R, matlab

EXCEL, OpenOffice Spreadsheet,...
  - Good for...
      novice or onetime users
      creating static images
  - Unacceptable for...
      automation
      interaction graphics
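Where a spreadsheet falls short on automation, a short script can generate a chart directly, and it can emit a vector format (SVG) rather than a raster image. The following is a minimal sketch using only the Python standard library, assuming positive values. The function name `bar_chart_svg`, its layout parameters, and the sample counts are all invented for this illustration.

```python
from xml.sax.saxutils import escape


def bar_chart_svg(data, bar_width=40, gap=10, height=120):
    """Return a minimal SVG bar chart for (label, value) pairs with positive values."""
    top = max(value for _, value in data)
    width = gap + len(data) * (bar_width + gap)
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height + 20}">'
    ]
    for i, (label, value) in enumerate(data):
        bar_height = height * value / top  # bars start at zero; the tallest fills the plot
        x = gap + i * (bar_width + gap)
        y = height - bar_height  # SVG's y axis points down, so measure from the top
        parts.append(
            f'<rect x="{x}" y="{y:.1f}" width="{bar_width}" '
            f'height="{bar_height:.1f}" fill="steelblue"/>'
        )
        parts.append(
            f'<text x="{x + bar_width / 2}" y="{height + 15}" '
            f'text-anchor="middle" font-size="12">{escape(label)}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Hypothetical counts, invented for illustration.
print(bar_chart_svg([("A", 3), ("B", 7), ("C", 5)]))
```

Saving the printed text to a file with an `.svg` extension gives a chart that should open in a browser or in Inkscape and scale without pixelation. Because the chart is generated by code, rerunning the script on new data regenerates it with no manual steps.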
Google Charts
  - Available in spreadsheets
  - Online: can integrate with web forms
  - Resulting charts are interactive
  - App engine allows advanced programming interaction

Flot
  - Uses jquery - small, lightweight javascript library
  - Relies on canvas - works across many browsers
  - Can only plot line and bar charts, but can be interactive through callbacks

Raphaël
  - JavaScript library that produces SVG and VML output.
  - Graphics are crisp, but may load slowly.
  - Many options make the learning curve a bit steeper

D3
  - D3 (Data-Driven Documents) is a JavaScript library for interactive SVG rendering
  - Similar concerns to Raphaël.
  - Advanced graphics are possible, but require more effort

Mapping frameworks
  - Leaflet: a lightweight mapping framework, to work comfortably even on mobile devices
  - Polymaps, Openlayers: feature-rich, allows CSS-like customization, aimed at data visualization
  - Kartograph: a powerful javascript or python library for generating SVG-rendered maps
  - CartoDB: quick data tables --> maps

Processing
  - A popular cross-platform Java-like programming language for creating visualizations
  - Desktop application for interactive visualizations
  - There are also Processing.js ports for embedding in browsers, and Processing in Objective-C for iOS
Pro-tools
  - for automating analyses
  - using high-level statistical packages as needed

Commercial data analysis packages available
  - MATLAB (and the free alternative, Octave)
  - SPSS
  - SAS
  - Problem: expensive, and locked-in
  - but there are free alternatives...

R
  - A free software environment for statistical computing
  - The tool of choice for statisticians

Weka
  - A cross-platform collection of machine learning algorithms for data mining tasks.
  - Tools for pre-processing, classification, regression, clustering, association rules, and visualization.
  - Can be used directly, or called through Java

Python
  - Python with associated modules
      numpy/scipy, matplotlib, many others
  - Available as combined packages
      Sage, Enthought, Python(x,y)
  - RPy - to work with R and Python simultaneously

Today's Summary
  - Overview of visualization techniques
      1D charts, 2D plots, 3D+ techniques, maps
  - Some guidelines for scientific visualization
      keep it simple, use appropriate data ranges...
  - A brief survey of visualization tools
      novice: Microsoft, OpenOffice, or Google Spreadsheets
      interactive: Flot, Raphaël, D3, Processing
      pro-tools: R, Weka, Python